Definition
Quantization
Quantization is the process of compressing a neural network by reducing the numerical precision of its weights (e.g. converting 16-bit floating-point values to 4-bit integers). This lowers VRAM requirements and typically speeds up inference.
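As a rough illustration of what "reducing precision" means, the sketch below maps a short list of floating-point weights to signed 4-bit integers using a single shared scale, then maps them back. It assumes the simplest possible scheme (symmetric, per-tensor, round-to-nearest); the function names and sample weights are hypothetical, and methods such as GPTQ or AWQ are considerably more sophisticated.

```python
def quantize_int4(weights):
    """Symmetric per-tensor quantization to signed 4-bit codes in [-8, 7]."""
    max_abs = max(abs(w) for w in weights)
    # One scale for the whole tensor; fall back to 1.0 if every weight is zero.
    scale = max_abs / 7 if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the 4-bit signed range.
    codes = [max(-8, min(7, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate floating-point weights from the integer codes."""
    return [c * scale for c in codes]


# Hypothetical sample weights, chosen only for illustration.
weights = [0.70, -0.33, 0.12, 0.00]
codes, scale = quantize_int4(weights)
restored = dequantize(codes, scale)

print(codes)     # small integers that fit in 4 bits each
print(restored)  # close to, but not exactly, the original weights
```

The gap between `weights` and `restored` is the rounding error that quantization introduces; with this scheme it is at most half of `scale` per weight.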
Frequently Asked Questions
Does quantization degrade model performance?
Quantization can cause some loss of accuracy, but modern quantization algorithms (like GPTQ or AWQ) keep that loss small while reducing model size by up to 75% (for example, when going from 16-bit to 4-bit weights).
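The 75% figure follows from the bit widths: 4 bits is one quarter of 16 bits. The sketch below works through that arithmetic for a hypothetical 7-billion-parameter model. It also shows why the saving is "up to" 75%, under the assumption that the format stores one 16-bit scale per group of 128 weights; the group size, scale width, and parameter count are illustrative assumptions, not properties of any specific format.

```python
def effective_bits(weight_bits, group_size=None, scale_bits=0):
    """Bits stored per weight, including an optional per-group scale."""
    if group_size is None:
        return float(weight_bits)
    return weight_bits + scale_bits / group_size


def size_reduction(bits_before, bits_after):
    """Fraction of storage saved when moving between bit widths."""
    return 1 - bits_after / bits_before


def weight_gigabytes(num_params, bits_per_weight):
    """Storage for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


params = 7_000_000_000  # hypothetical 7B-parameter model

ideal = size_reduction(16, effective_bits(4))
with_scales = size_reduction(16, effective_bits(4, group_size=128, scale_bits=16))

print(weight_gigabytes(params, 16))  # 16-bit weights
print(weight_gigabytes(params, 4))   # 4-bit weights, ignoring scales
print(ideal, with_scales)            # ideal saving vs. saving with scale overhead
```

Under these assumptions the weights shrink from 14 GB to 3.5 GB, and the per-group scales trim the saving from 75% to roughly 74%.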
What are GGUF and EXL2?
GGUF and EXL2 are file formats for quantized models, designed for running LLMs locally. GGUF (used by llama.cpp) is commonly run on CPUs and Macs, while EXL2 (used by ExLlamaV2) is aimed at GPUs.
Quantization Media Coverage & Intelligence
arXiv AI, Jun 6, 2026
Minimizing the Hidden Cost of Scales: Graph-Guided Ultra-Low-Bit Quantization for Large Language Models
Post-training quantization (PTQ) is critical for the efficient deployment of large language models (LLMs). Recen