
Autoregressive Model

A model that generates a sequence one element at a time, conditioning each step on all prior ones.

Definition

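An autoregressive model predicts each element of a sequence from the elements that came before it. For language models, this means generating one token at a time: the model reads the prompt plus everything generated so far, outputs a probability distribution over the next token, and a token chosen from that distribution is appended to the input for the next step. GPT-style decoder-only Transformers are the most prominent example. Because each new token depends on the ones before it, output tokens have to be produced sequentially, with one model evaluation per token, which makes this step-by-step generation a key cause of inference latency.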
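The generation loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the vocabulary, the `next_token_distribution` function, and its hand-written probabilities are hypothetical stand-ins for a trained model, not any real library's API. What carries over to real systems is the shape of the loop: read the full prefix, get a distribution, pick a token, append it, repeat.

```python
import random

# Hypothetical three-token vocabulary, chosen only for illustration.
VOCAB = ["a", "b", "<eos>"]


def next_token_distribution(prefix):
    """Hypothetical stand-in for a trained model.

    Takes the full prefix (prompt plus tokens generated so far) and
    returns one probability per entry in VOCAB.
    """
    # Ending becomes more likely as the sequence grows longer.
    p_eos = min(1.0, len(prefix) / 10)
    # Favor whichever of "a" / "b" has appeared less often so far,
    # so the output depends on the whole prefix, not just the last token.
    count_a = prefix.count("a")
    count_b = prefix.count("b")
    share_a = (count_b + 1) / (count_a + count_b + 2)
    p_a = (1 - p_eos) * share_a
    p_b = (1 - p_eos) * (1 - share_a)
    return [p_a, p_b, p_eos]


def generate(prompt, max_new_tokens=20, seed=0):
    """Generate tokens one at a time, feeding each one back as input."""
    rng = random.Random(seed)
    sequence = list(prompt)
    for _ in range(max_new_tokens):
        # One model evaluation per generated token: this step cannot
        # start until the previous token has been chosen.
        probs = next_token_distribution(sequence)
        token = rng.choices(VOCAB, weights=probs, k=1)[0]
        if token == "<eos>":
            break
        sequence.append(token)
    return sequence


if __name__ == "__main__":
    print(generate(["a"]))
```

The loop body runs once per output token and each iteration needs the result of the previous one, which is the sequential dependency behind the latency cost noted in the definition.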