Inference & Serving

Logits

The raw per-token scores a model outputs before softmax turns them into probabilities.

Definition

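Logits are the raw scores a language model produces at its final layer, one real value per token in the vocabulary. They are not probabilities: they can be negative or greater than 1, and they do not sum to 1. They become a probability distribution only after passing through a softmax. Inference-time controls such as temperature, top-k, top-p, and repetition penalty are typically implemented as adjustments to the logits before that final softmax, reshaping the distribution that sampling then draws from. (Top-p is a partial exception: it consults the softmax probabilities to decide which tokens to keep, then masks out the rest.)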
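As a rough illustration of the definition above, the following sketch shows logits being turned into probabilities, and how temperature and top-k reshape them first. It uses only the Python standard library. The function names and the four-value logit vector are made up for this example and are not taken from any particular inference library; real serving stacks apply the same ideas to tensors with tens of thousands of entries.

```python
import math

def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def apply_temperature(logits, temperature):
    """Divide logits by the temperature: below 1 sharpens, above 1 flattens."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    return [x / temperature for x in logits]

def apply_top_k(logits, k):
    """Keep the k largest logits; set the rest to -inf so softmax gives them 0.

    Assumption for this sketch: k <= 0 means "disabled", and ties at the
    cutoff value are all kept, so slightly more than k tokens may survive.
    """
    if k <= 0 or k >= len(logits):
        return list(logits)
    threshold = sorted(logits, reverse=True)[k - 1]
    return [x if x >= threshold else -math.inf for x in logits]

# Hypothetical logits for a 4-token vocabulary (illustrative values only).
logits = [2.0, 1.0, 0.5, -1.0]

probs = softmax(logits)                          # plain distribution
sharp = softmax(apply_temperature(logits, 0.5))  # more peaked on token 0
top2 = softmax(apply_top_k(logits, 2))           # tokens 2 and 3 get probability 0
```

Note that both controls act on the logits and leave the softmax itself unchanged, which is why they can be chained in any serving pipeline before the sampling step.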
