Context Window
The maximum number of tokens a model can consider at once.
Definition
The context window is the maximum span of tokens (the chunks of text a model reads, each often a word or part of a word) that a model can consider at once, covering both the prompt and the generated output. Larger windows let a model reason over more documents or longer conversations, but cost grows with length: in a standard transformer, the model's running memory of the text (the key-value cache) grows in proportion to the number of tokens, while the work of weighing every token against every other token (attention) grows roughly with the square of that number.
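The arithmetic behind this definition can be sketched in a few lines of Python. This is a minimal illustration, not any particular model's accounting: the function names, the 8,192-token window, the 6,000-token prompt, and the 32-layer depth are all hypothetical, and the cost functions assume plain full self-attention with one cached key and one cached value per token per layer.

```python
def max_output_tokens(context_window: int, prompt_tokens: int) -> int:
    """Tokens left for generation, since prompt and output share one window."""
    if prompt_tokens > context_window:
        raise ValueError("prompt alone exceeds the context window")
    return context_window - prompt_tokens


def attention_pairs(n_tokens: int) -> int:
    """Token pairs weighed by full self-attention: every token against every token.

    Causal attention only looks backwards, which roughly halves this count,
    but the growth is still quadratic in n_tokens.
    """
    return n_tokens * n_tokens


def kv_cache_entries(n_tokens: int, n_layers: int) -> int:
    """Cached vectors (assumed layout): one key and one value per token per layer."""
    return 2 * n_tokens * n_layers


# Hypothetical numbers, chosen only for illustration.
print(max_output_tokens(8192, 6000))                    # 2192
print(attention_pairs(2000) // attention_pairs(1000))   # 4: double the length, 4x the pairs
print(kv_cache_entries(2000, 32) // kv_cache_entries(1000, 32))  # 2: memory doubles
```

Doubling the text doubles the running memory but quadruples the pairwise work, which is why very long windows tend to be limited by compute before they are limited by memory in this simplified picture.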