Architectures

Encoder-Decoder

An architecture that reads input with one stack and generates output with another.

Definition

An encoder-decoder model uses one stack of layers, the encoder, to read the input and build a contextual representation, and a second stack, the decoder, to generate the output token by token while attending to that representation through cross-attention. Often called sequence-to-sequence (seq2seq), this design was used in the original 2017 Transformer and suits tasks where input and output differ in length or form, such as translation and summarization. T5 and BART are prominent examples; many modern language models keep only the decoder.
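The cross-attention step can be illustrated with a minimal sketch in plain Python. It is a simplification under stated assumptions, not how any particular model implements it: the learned query, key, and value projections, multiple heads, and masking are all omitted, and the names `cross_attention`, `encoder_states`, and `decoder_states` are hypothetical, chosen only for this example. Each decoder state acts as a query that scores every encoder state, and the softmax of those scores weights the encoder states into one context vector per output position.

```python
import math


def softmax(xs):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross_attention(decoder_states, encoder_states):
    """Single-head scaled dot-product cross-attention.

    Assumption for illustration: queries come straight from the decoder
    states, and keys and values are the raw encoder states (a real model
    would first apply learned linear projections).
    """
    dim = len(encoder_states[0])
    contexts, all_weights = [], []
    for query in decoder_states:
        # Score this output position against every input position.
        scores = [dot(query, key) / math.sqrt(dim) for key in encoder_states]
        weights = softmax(scores)
        # Context vector: weighted average of the encoder states.
        context = [
            sum(w * value[i] for w, value in zip(weights, encoder_states))
            for i in range(dim)
        ]
        contexts.append(context)
        all_weights.append(weights)
    return contexts, all_weights


# Toy encoder output: three input positions, each a 2-dimensional vector.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Toy decoder states: two output positions.
decoder_states = [[2.0, 0.0], [0.0, 2.0]]

contexts, weights = cross_attention(decoder_states, encoder_states)
for position, row in enumerate(weights):
    print(position, [round(w, 3) for w in row])
```

Note that the number of output positions (two) is independent of the number of input positions (three), which is why this design fits tasks where input and output differ in length.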
