GGUF / Safetensors Model File Inspector
Inspect GGUF and Safetensors model files in your browser. See architecture, quantization, tensor details, and parameter counts without uploading anything.
About GGUF / Safetensors Model File Inspector
When you download a GGUF or Safetensors model, you often want to confirm what is inside before loading it into an inference engine. This tool lets you drag and drop a model file and instantly see its metadata, architecture, and tensor layout without uploading anything.
How it works
The inspector reads only the header portion of your file using the browser's File API. For GGUF files, it parses the binary header to extract metadata key-value pairs, including the model architecture, quantization type, context window, and tokenizer vocabulary size. For Safetensors, it reads the JSON header to list every tensor with its shape, dtype, and byte offset. The rest of the file is never touched, so even 70B models load instantly. If you want to estimate how much VRAM a model needs before running it, try the AI VRAM calculator.
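The header-only approach can be sketched outside the browser as well. The following Python sketch (not the tool's actual JavaScript; the function name and return shape are illustrative) shows how the first few bytes are enough to tell the two formats apart: GGUF files start with the ASCII magic "GGUF" followed by a little-endian uint32 version, while Safetensors files start with a little-endian uint64 giving the length of the JSON header that follows.

```python
import json
import struct

def detect_model_format(path):
    """Identify a model file by reading only its header bytes.

    Mirrors the header-only approach described above: the bulk of the
    file is never read. Illustrative sketch, not the tool's real code.
    """
    with open(path, "rb") as f:
        head = f.read(8)
    if head[:4] == b"GGUF":
        # Bytes 4-7 hold the GGUF version as a little-endian uint32.
        (version,) = struct.unpack("<I", head[4:8])
        return ("gguf", version)
    # Safetensors: first 8 bytes are a little-endian uint64 giving the
    # length of the JSON header that immediately follows.
    (header_len,) = struct.unpack("<Q", head)
    with open(path, "rb") as f:
        f.seek(8)
        header = json.loads(f.read(header_len))
    return ("safetensors", header)
```

In the browser, the same idea uses `Blob.slice()` plus the File API so only the sliced range is ever pulled off disk.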
GGUF metadata explained
GGUF is the standard format for llama.cpp and similar inference engines. The header encodes the model's architecture (llama, mistral, phi, etc.), quantization level (Q4_K_M, Q5_K_S, Q8_0, and others), context length, embedding dimensions, layer count, and attention head configuration. The tool also shows tokenizer details, including the vocabulary size and tokenizer model type.
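To make the header layout concrete, here is a simplified Python reader for the GGUF v2/v3 metadata section (a sketch, not the tool's implementation). After the magic, version, tensor count, and key-value count, the header is a sequence of typed key-value pairs; this version handles only uint32, float32, and string values, while a full parser covers all GGUF value types, including arrays.

```python
import struct

# GGUF value-type codes (subset) from the GGUF specification.
GGUF_UINT32, GGUF_FLOAT32, GGUF_STRING = 4, 6, 8

def read_gguf_metadata(buf):
    """Parse metadata key-value pairs from a GGUF v2/v3 header buffer.

    Simplified sketch: handles only uint32, float32, and string values.
    """
    assert buf[:4] == b"GGUF", "not a GGUF file"
    version, n_tensors, n_kv = struct.unpack_from("<IQQ", buf, 4)
    pos = 24  # magic(4) + version(4) + tensor_count(8) + kv_count(8)

    def read_string():
        nonlocal pos
        (length,) = struct.unpack_from("<Q", buf, pos)
        pos += 8
        s = buf[pos:pos + length].decode("utf-8")
        pos += length
        return s

    meta = {}
    for _ in range(n_kv):
        key = read_string()
        (vtype,) = struct.unpack_from("<I", buf, pos)
        pos += 4
        if vtype == GGUF_UINT32:
            (meta[key],) = struct.unpack_from("<I", buf, pos); pos += 4
        elif vtype == GGUF_FLOAT32:
            (meta[key],) = struct.unpack_from("<f", buf, pos); pos += 4
        elif vtype == GGUF_STRING:
            meta[key] = read_string()
        else:
            break  # unhandled value type: stop rather than misparse
    return version, meta
```

Keys like `general.architecture` and `llama.context_length` are where the architecture and context window shown by the tool come from.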
Safetensors tensor inspection
Safetensors files store a JSON manifest at the beginning that describes every tensor. The inspector presents this as a searchable table where you can filter tensors by name, see their shapes and data types, and get an exact parameter count. This is useful for verifying model structure or comparing different checkpoints. For understanding how token counts affect API costs, check the AI token counter.
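Because every tensor's shape is in the manifest, the exact parameter count falls out of a single pass over the header. A minimal Python sketch (function name is illustrative, not the tool's code):

```python
import json
import struct
from math import prod

def safetensors_param_count(raw):
    """Compute the exact parameter count from Safetensors bytes.

    Reads the 8-byte little-endian header length, decodes the JSON
    manifest, and sums the product of each tensor's shape. The optional
    "__metadata__" entry is skipped since it describes no tensor.
    """
    (header_len,) = struct.unpack("<Q", raw[:8])
    manifest = json.loads(raw[8:8 + header_len])
    total = 0
    for name, info in manifest.items():
        if name == "__metadata__":
            continue
        total += prod(info["shape"])  # e.g. [4096, 4096] -> 16,777,216
    return total
```

Note that no tensor data is needed for the count; the offsets in the manifest are only used when the weights themselves are read.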
Everything runs locally in your browser. No file data is ever sent to a server.
Frequently Asked Questions
Does this tool upload my model file?
No. The file never leaves your browser. The tool reads only the header section of your file (up to 10MB for GGUF, just the JSON header for Safetensors) using the File API. No data is sent to any server.
What information can I extract from a GGUF file?
GGUF files contain rich metadata in their header: model architecture, quantization type, context length, embedding dimensions, layer count, attention head configuration, tokenizer details, and vocabulary size. This tool parses all of it and presents it in organized sections.
What about Safetensors files?
Safetensors files store a JSON header with the full tensor manifest. This tool reads that header and shows you every tensor's name, shape, data type, and byte size, along with the total parameter count and a dtype breakdown.
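The dtype breakdown is a simple rollup over the decoded manifest. A sketch of that aggregation, assuming a fixed bytes-per-element table for common dtypes (names and return shape are illustrative, not the tool's actual code):

```python
from collections import Counter
from math import prod

# Bytes per element for common Safetensors dtypes (assumed subset).
DTYPE_BYTES = {"F64": 8, "F32": 4, "F16": 2, "BF16": 2,
               "I64": 8, "I32": 4, "I16": 2, "I8": 1, "U8": 1, "BOOL": 1}

def dtype_breakdown(manifest):
    """Summarize a decoded Safetensors manifest per dtype.

    Returns {dtype: (tensor_count, param_count, byte_size)} -- the
    kind of per-dtype rollup described above.
    """
    tensors, params = Counter(), Counter()
    for name, info in manifest.items():
        if name == "__metadata__":  # optional entry, describes no tensor
            continue
        tensors[info["dtype"]] += 1
        params[info["dtype"]] += prod(info["shape"])
    return {dt: (tensors[dt], params[dt], params[dt] * DTYPE_BYTES[dt])
            for dt in tensors}
```

This is also why mixed-precision checkpoints (say, BF16 weights with F32 norms) are easy to spot at a glance in the breakdown.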
Can I inspect very large model files (50GB+)?
Yes. The tool only reads a small slice from the beginning of the file. For GGUF files it reads up to 10MB, and for Safetensors it reads just the header. The rest of the file is never touched, so even 50GB+ files parse in under a second.
What GGUF versions are supported?
The parser supports GGUF version 2 and version 3, which covers all models from llama.cpp and related tooling. If a file uses an unsupported version, you will see a clear error message.