Skip to main content
All terms
Inference & Serving

Stop Sequence

A string or token pattern that tells the model to end generation when it appears.

Definition

A stop sequence is a string or token pattern that ends generation as soon as the model produces it. It gives applications precise control over where a response stops — for example, halting at a delimiter, a role marker in a chat template, or a closing tag. The matched sequence is usually excluded from the returned text. Stop sequences are a standard parameter in serving APIs.