
Top-k Sampling

Sampling the next token only from the k most likely candidates.

Definition

Top-k sampling keeps only the k most likely next tokens, renormalizes their probabilities so they sum to one again, and then samples randomly from that smaller set. It is simple to implement and keeps very unlikely tokens out of consideration. A fixed k can be too restrictive when the distribution is flat, because many plausible tokens fall outside the cutoff, and too permissive when it is sharp, because the cutoff still admits tokens with negligible probability. This is why adaptive alternatives like top-p and min-p adjust the cutoff to the distribution's shape.
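
The steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not any particular library's implementation: it assumes the model's output is available as a list of raw logits, and the names top_k_filter and top_k_sample are hypothetical helpers chosen for this example.

```python
import math
import random

def top_k_filter(logits, k):
    """Return {token_index: probability} over the k highest-logit tokens."""
    if not 1 <= k <= len(logits):
        raise ValueError("k must be between 1 and len(logits)")
    # Indices of the k largest logits (sorted() is stable, so ties keep the lower index).
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over the survivors only, which is the same as renormalizing
    # their original probabilities. Subtracting the max keeps exp() stable.
    m = max(logits[i] for i in top)
    weights = {i: math.exp(logits[i] - m) for i in top}
    total = sum(weights.values())
    return {i: w / total for i, w in weights.items()}

def top_k_sample(logits, k, rng=random):
    """Draw one token index from the renormalized top-k distribution."""
    probs = top_k_filter(logits, k)
    indices = list(probs)
    return rng.choices(indices, weights=[probs[i] for i in indices], k=1)[0]

# Hypothetical logits for a 4-token vocabulary.
logits = [2.0, 1.0, 0.5, -1.0]
print(top_k_filter(logits, 2))   # only indices 0 and 1 remain
print(top_k_sample(logits, 2))   # either 0 or 1
```

With k equal to 1 this reduces to greedy decoding, and with k equal to the vocabulary size it reduces to ordinary sampling from the full distribution.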