Evaluation
GAIA
A benchmark that tests general AI assistants on realistic, multi-step tasks requiring tool use.
Definition
GAIA is a benchmark for general-purpose AI assistants. Its tasks are practical and open-ended: people can solve them, but doing so requires web browsing, tool use, and several reasoning steps. It is designed to track progress toward assistants that are genuinely useful in everyday work.