Skip to main content
All terms
Inference & Serving

Decoding Strategy

The rule for choosing each next token from a model's predicted distribution.

Definition

A decoding strategy is the rule that selects each next token (word-piece) from the list of likelihoods a model predicts, shaping how creative, coherent, or predictable the output is. Greedy decoding always takes the most likely option, while sampling methods like top-k and top-p draw at random from a restricted pool of high-probability choices. Temperature and beam search offer further control over diversity and structure.