All terms
Training
Pretraining
The initial, compute-heavy stage where a model learns broad patterns from a huge unlabeled dataset.
Definition
Pretraining is the first and most compute-intensive stage of building a model, where it learns broad language or visual patterns by predicting parts of a massive unlabeled dataset (often trillions of tokens) with a self-supervised objective like next-token prediction. The result is a base model with wide world knowledge that later stages refine — through fine-tuning or alignment — into a useful assistant.