Training

LoRA

A cheap fine-tuning method that trains small add-on weight matrices.

Definition

LoRA is a parameter-efficient fine-tuning technique (it customizes a model while training only a tiny fraction of it) that freezes the original model weights and adds small trainable matrices to each layer. Only those tiny matrices are trained, slashing memory and compute cost while reaching quality close to full fine-tuning. Multiple LoRA 'adapters' (these add-on matrices) can be swapped on the same base model.

Related terms

Fine-Tuning Quantization LLM