All terms
Architectures
Autoregressive Model
A model that generates a sequence one element at a time, conditioning each step on all prior ones.
Definition
An autoregressive model predicts each element of a sequence from the elements that came before it. For language models, this means generating one token at a time: the model reads the prompt plus everything generated so far and outputs a probability distribution over the next token. GPT-style decoder-only Transformers are the dominant example. The step-by-step nature of this generation is a key cause of inference latency.