
Token

The chunk of text (word, sub-word, or character) a model reads and generates.

Definition

A token is the basic unit of text a language model processes. A tokenizer splits input into tokens, which are often sub-word pieces: 'tokenization' might become 'token' + 'ization'. Each model has a fixed vocabulary of tokens, and both context limits and API pricing are usually measured in tokens rather than words.
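
To make the sub-word split concrete, here is a minimal sketch of a greedy longest-match tokenizer in Python. The vocabulary `VOCAB` and the function `tokenize` are hypothetical, invented for this illustration. Production tokenizers typically learn their vocabulary from data (for example with byte-pair encoding) and will split text differently.

```python
# Hypothetical toy vocabulary, chosen only so the example splits cleanly.
VOCAB = {"token", "ization", "is", "a", " "}


def tokenize(text, vocab=VOCAB):
    """Split text by greedy longest match against a fixed vocabulary.

    Characters that match no vocabulary entry become single-character tokens.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate starting at position i first.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                tokens.append(piece)
                i = j
                break
        else:
            # No vocabulary entry matched: emit one character as its own token.
            tokens.append(text[i])
            i += 1
    return tokens


print(tokenize("tokenization"))  # ['token', 'ization']

sentence = "tokenization is a token"
print(len(sentence.split()), "words,", len(tokenize(sentence)), "tokens")  # 4 words, 8 tokens
```

Under this toy vocabulary the four-word sentence becomes eight tokens, because the sub-word pieces and the spaces each count separately. This is why token counts, not word counts, determine how much text fits in a context window and how usage is billed.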