Inference & Serving

N-Gram Matching

Drafting candidate tokens by looking up short word patterns in recent text.

Definition

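N-gram matching proposes candidate tokens by looking up short recurring token sequences (an n-gram is a run of n consecutive tokens) in the prompt or recently generated text. When the most recent tokens match an earlier occurrence, the tokens that followed that occurrence are proposed as the draft, and the model then checks them. It needs no separate draft model and almost no extra compute or memory, and it can be combined with more sophisticated speculative drafting methods. Though simple, it gives meaningful speedups on repetitive workloads, where the output repeats spans of the prompt or of earlier output, and is easy to add on top of an existing serving system.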
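The lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not any particular serving system's implementation: the function name `draft_from_ngrams` and the parameters `max_n` and `num_draft` are hypothetical, tokens are assumed to be plain integer IDs, and the step where the model verifies the draft is left out.

```python
from typing import List, Sequence


def draft_from_ngrams(
    tokens: Sequence[int], max_n: int = 3, num_draft: int = 4
) -> List[int]:
    """Propose up to `num_draft` tokens by matching the trailing n-gram.

    `tokens` is the prompt plus everything generated so far. The function
    name and parameters are hypothetical, chosen for illustration only.
    """
    # Try the longest suffix first, then fall back to shorter ones.
    for n in range(min(max_n, len(tokens) - 1), 0, -1):
        suffix = list(tokens[-n:])
        # Scan right to left so the most recent earlier occurrence wins.
        for start in range(len(tokens) - n - 1, -1, -1):
            if list(tokens[start:start + n]) == suffix:
                # Draft the tokens that followed the earlier occurrence.
                return list(tokens[start + n:start + n + num_draft])
    # No earlier occurrence of any suffix: propose nothing.
    return []


history = [1, 2, 3, 4, 5, 1, 2, 3]
print(draft_from_ngrams(history, max_n=3, num_draft=2))  # [4, 5]
```

The linear scan keeps the sketch short. A real system would more likely maintain an index from n-grams to positions so that each lookup stays cheap as the context grows. An empty result simply means that no draft is proposed for that step and decoding proceeds normally.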