All terms
Evaluation
Test Set
Held-out data used for final evaluation after a model is trained and tuned.
Definition
A test set is a portion of data held out from training and tuning, used only for a final, unbiased estimate of how a model performs on unseen examples. Keeping it separate from the validation set prevents the reported score from being inflated by choices made during development. Reusing or leaking it into training causes contamination and overstated results.