Inference & Serving

Sampling

Drawing the next token randomly from a model's predicted probability distribution.

Definition

Sampling is how a model picks the next token (roughly, the next word or word fragment): it treats the probabilities it assigns to the possible next tokens as a weighted lottery and draws one at random. Settings such as temperature, top-p, top-k, and min-p reshape those odds before the draw, balancing variety against reliability. Compared with always taking the single most likely token (greedy decoding), sampling produces more diverse and creative text, at the cost that the same prompt can give a different answer each time. It is the decoding approach behind most everyday use of language models.
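
The weighted lottery described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not any particular library's implementation: the function names (softmax, filter_probs, sample_token, greedy_token) and the toy scores are hypothetical, and real inference stacks may apply the filters in a different order or with different edge-case rules.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; temperature < 1 sharpens, > 1 flattens."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def filter_probs(probs, top_k=None, top_p=None, min_p=None):
    """Zero out unlikely tokens, then renormalize what is left.

    Assumes top_k >= 1 when given, so at least one token always survives.
    """
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(order)
    if top_k is not None:
        # top-k: keep only the k most likely tokens
        keep &= set(order[:top_k])
    if top_p is not None:
        # top-p: keep the smallest top-ranked set whose total mass reaches top_p
        nucleus, cumulative = set(), 0.0
        for i in order:
            nucleus.add(i)
            cumulative += probs[i]
            if cumulative >= top_p:
                break
        keep &= nucleus
    if min_p is not None:
        # min-p: drop tokens below a fraction of the top token's probability
        threshold = min_p * probs[order[0]]
        keep &= {i for i in order if probs[i] >= threshold}
    kept = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    total = sum(kept)
    return [p / total for p in kept]


def sample_token(logits, temperature=1.0, top_k=None, top_p=None,
                 min_p=None, rng=random):
    """Reshape the odds, then draw one token index at random."""
    probs = filter_probs(softmax(logits, temperature), top_k, top_p, min_p)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def greedy_token(logits):
    """The non-random alternative: always take the most likely token."""
    return max(range(len(logits)), key=lambda i: logits[i])


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.5, -1.0]  # made-up scores for four tokens
    rng = random.Random(0)  # fixing the seed makes the draws repeatable
    draws = [sample_token(toy_logits, temperature=0.8, top_p=0.9, rng=rng)
             for _ in range(10)]
    print("sampled:", draws)
    print("greedy: ", greedy_token(toy_logits))
```

Calling sample_token repeatedly on the same scores returns a mix of token indices, while greedy_token always returns the same one. Passing a seeded random.Random as rng is one way to make sampled output repeatable.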