AI Glossary: Letter "Q"
Explore definitions and dynamic coverage analytics for the core concepts shaping artificial intelligence.
Q
QLoRA
Quantized Low-Rank Adaptation (QLoRA) is a parameter-efficient fine-tuning (PEFT) technique that trains LoRA adapters on top of a frozen base model whose weights are quantized to 4-bit precision. It stores those weights in the 4-bit NormalFloat (NF4) data type to preserve model accuracy while sharply reducing VRAM usage.
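The idea can be illustrated with a minimal pure-Python sketch of a single linear layer. It is a toy under stated assumptions, not a reference implementation: the class name `QLoRALinearSketch` and its parameters are hypothetical, and a uniform signed 4-bit absmax scheme stands in for NF4 to keep the example short. The base weights are stored only as 4-bit codes plus a per-row scale and are never updated; only the small matrices `A` and `B` would be trained.

```python
import random


def quantize_row_int4(row):
    """Toy stand-in for NF4: uniform signed 4-bit codes in [-7, 7], one scale per row."""
    max_abs = max(abs(w) for w in row)
    scale = max_abs / 7 if max_abs > 0 else 1.0
    codes = [max(-7, min(7, round(w / scale))) for w in row]
    return codes, scale


class QLoRALinearSketch:
    """Hypothetical illustration: frozen 4-bit base weight plus a trainable low-rank update."""

    def __init__(self, weight, rank, alpha, seed=0):
        rng = random.Random(seed)
        out_dim, in_dim = len(weight), len(weight[0])
        # Frozen base: keep only the 4-bit codes and scales, not the original floats.
        self.base = [quantize_row_int4(row) for row in weight]
        # Trainable adapter: A starts small and random, B starts at zero,
        # so the adapter contributes nothing before training.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(in_dim)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(out_dim)]
        self.scaling = alpha / rank

    def base_forward(self, x):
        # Dequantize on the fly: weight is approximately code * scale.
        return [sum(c * scale * xi for c, xi in zip(codes, x))
                for codes, scale in self.base]

    def forward(self, x):
        base_out = self.base_forward(x)
        ax = [sum(a * xi for a, xi in zip(row, x)) for row in self.A]      # A x
        delta = [sum(b * v for b, v in zip(row, ax)) for row in self.B]    # B (A x)
        return [b + self.scaling * d for b, d in zip(base_out, delta)]


layer = QLoRALinearSketch(weight=[[0.8, -0.3], [0.1, 0.4]], rank=1, alpha=2.0)
x = [1.0, 2.0]
y = layer.forward(x)
```

Because `B` starts at zero, the layer initially reproduces the quantized base model's output exactly; fine-tuning would then adjust only `A` and `B`, which hold far fewer values than the base weight matrix.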
Model Training
Quantization
Quantization compresses a neural network by reducing the numerical precision of its weights (e.g., converting 16-bit floating-point values to 4-bit integers), which lowers VRAM requirements and can accelerate inference.
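As a rough illustration of the float-to-4-bit conversion mentioned above, the sketch below uses symmetric absmax quantization, one of several possible schemes. The function names and sample weights are made up for the example; production systems typically quantize in small blocks and pack two 4-bit codes per byte, which is omitted here.

```python
def quantize_absmax_int4(weights):
    """Map floats to signed 4-bit integer codes in [-7, 7] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 7 if max_abs > 0 else 1.0
    codes = [max(-7, min(7, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate floats from the integer codes."""
    return [c * scale for c in codes]


weights = [0.8, -0.3, 0.05, 0.0]          # illustrative values only
codes, scale = quantize_absmax_int4(weights)
restored = dequantize(codes, scale)

# Each restored value differs from the original by at most about half a step.
errors = [abs(w - r) for w, r in zip(weights, restored)]
```

The stored representation is the list of small integers plus a single scale, which is where the memory saving comes from; the cost is the rounding error captured in `errors`.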
Model Operations