Positional Encoding

Signals added to token representations so a model knows their order in a sequence.

Definition

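Positional encoding adds information about each token's position to its representation, because self-attention on its own has no notion of order and treats the input as an unordered set. The original Transformer added fixed sinusoidal patterns to the token embeddings, and some later models learn one embedding per absolute position instead. Many modern models favor relative schemes such as rotary embeddings (RoPE), which rotate query and key vectors by a position-dependent angle, and ALiBi, which subtracts from each attention score a penalty that grows linearly with the distance between the two tokens. These relative methods tend to cope better with sequences longer than those seen in training, and they integrate more naturally with attention because they act on the attention computation itself rather than on the input embeddings.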
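As a rough illustration of two of the schemes named above, the following NumPy sketch builds the fixed sinusoidal table and an ALiBi-style distance penalty. The function names, the even `d_model` restriction, the single `slope` argument, and the symmetric (non-causal) distance are assumptions made for this example; real implementations typically use one slope per attention head and apply the bias under a causal mask.

```python
import numpy as np


def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Return a (seq_len, d_model) table of fixed sinusoidal encodings.

    Even columns hold sin(pos / 10000^(2i / d_model)), odd columns hold the
    matching cosine. Assumes d_model is even (a simplification for this sketch).
    """
    if d_model % 2 != 0:
        raise ValueError("this sketch assumes an even d_model")
    positions = np.arange(seq_len)[:, None]      # shape (seq_len, 1)
    even_dims = np.arange(0, d_model, 2)[None, :]  # shape (1, d_model / 2)
    angles = positions / np.power(10000.0, even_dims / d_model)
    table = np.zeros((seq_len, d_model))
    table[:, 0::2] = np.sin(angles)
    table[:, 1::2] = np.cos(angles)
    return table


def alibi_style_bias(seq_len: int, slope: float) -> np.ndarray:
    """Return a (seq_len, seq_len) bias that penalizes distant token pairs.

    Entry [i, j] is -slope * |i - j|. The single slope and the symmetric
    distance are assumptions of this sketch, not a full ALiBi implementation.
    """
    positions = np.arange(seq_len)
    distance = np.abs(positions[:, None] - positions[None, :])
    return -slope * distance
```

In use, the sinusoidal table would be added to a `(seq_len, d_model)` array of token embeddings before the first attention layer, while the bias matrix would be added to the raw attention scores before the softmax.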
