Skip to main content
All terms
Inference & Serving

Test-Time Compute

Spending extra computation at inference, not just training, to improve answers.

Definition

Test-time compute refers to using extra computation when answering a query rather than only during training. Examples include generating long chains of reasoning before responding, or sampling many candidate answers and selecting among them. Spending more compute at inference can raise accuracy on hard problems, which makes it a central lever behind reasoning models that deliberate before producing a final answer.