Training Data
The examples a model learns from while its weights are adjusted.
Definition
Training data is the body of examples (text, images, audio, code, and more) that a model learns from while its weights are adjusted. It is kept separate from the validation and test data used to measure performance, so that evaluation reflects examples the model has not seen. Its scale, quality, diversity, and curation largely determine a model's capabilities and limitations. Foundation models are typically trained on web-scale collections, often supplemented with synthetically generated data.
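The separation between training, validation, and test data can be illustrated with a small sketch. This is only one common approach (a seeded random shuffle followed by slicing), not a prescribed method; the function name `split_dataset`, the default fractions, and the `seed` parameter are assumptions made for illustration.

```python
import random


def split_dataset(examples, val_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle examples and split them into disjoint train/validation/test lists.

    Hypothetical helper for illustration; the fractions and seed are assumptions.
    """
    if val_fraction < 0 or test_fraction < 0 or val_fraction + test_fraction >= 1:
        raise ValueError("fractions must be non-negative and sum to less than 1")

    # Copy first so the caller's collection is not modified.
    shuffled = list(examples)
    # A dedicated Random instance keeps the split reproducible for a given seed.
    random.Random(seed).shuffle(shuffled)

    n_val = round(len(shuffled) * val_fraction)
    n_test = round(len(shuffled) * test_fraction)

    # Slices do not overlap, so no example lands in more than one subset.
    validation = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]
    return train, validation, test


if __name__ == "__main__":
    train, validation, test = split_dataset(range(100))
    print(len(train), len(validation), len(test))
```

Only the `train` subset would be used to adjust weights; `validation` and `test` are held back for measuring performance. Real datasets often need more care than a random split, for example removing near-duplicates that would otherwise appear on both sides.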