Evaluation

GAIA

A benchmark that evaluates general AI assistants on realistic, multi-step tasks that require tool use.

Definition

GAIA is a benchmark for general-purpose AI assistants. It is built around practical, open-ended tasks that people can solve but that require an assistant to browse the web, use tools, and chain several reasoning steps. It is designed to track progress toward genuinely useful everyday assistants.