
WebArena

A benchmark of realistic websites for testing web-browsing AI agents.

Definition

WebArena is a benchmark built from realistic, self-contained websites (shopping, forums, maps, and others) in which an AI agent must navigate pages, fill in forms, and complete multi-step tasks. It is widely used to measure whether agents can carry a task on the web through to a correct end result.
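To make the idea concrete, here is a minimal sketch of how a benchmark of this kind might score an agent. Everything in it is hypothetical: `ToyShopSite`, `scripted_agent`, `lazy_agent`, and the "buy X" task format are invented for illustration and are not WebArena's API or task set. The sketch also assumes that success is judged by the final state of the site rather than by the exact path the agent took.

```python
from dataclasses import dataclass, field


@dataclass
class ToyShopSite:
    """Hypothetical stand-in for one self-contained site (not the WebArena API)."""
    page: str = "home"
    cart: list = field(default_factory=list)

    def step(self, action: str, arg: str = "") -> str:
        # Apply one browser-like action and return the page now shown.
        if action == "goto":
            self.page = arg
        elif action == "add_to_cart" and self.page.startswith("product/"):
            self.cart.append(self.page.split("/", 1)[1])
        return self.page


def scripted_agent(task: str):
    # Hypothetical agent: open the product page, then add the item to the cart.
    item = task.split(" ", 1)[1]
    return [("goto", f"product/{item}"), ("add_to_cart", "")]


def lazy_agent(task: str):
    # Hypothetical agent that skips navigation, so the add-to-cart has no effect.
    return [("add_to_cart", "")]


def run_task(task: str, agent) -> bool:
    # Each task starts from a fresh site so runs do not affect one another.
    site = ToyShopSite()
    for action, arg in agent(task):
        site.step(action, arg)
    # Assumption of this sketch: success depends only on the end state.
    return task.split(" ", 1)[1] in site.cart


def success_rate(tasks, agent) -> float:
    # Fraction of tasks the agent finished correctly.
    results = [run_task(t, agent) for t in tasks]
    return sum(results) / len(results)


TASKS = ["buy mug", "buy lamp"]
```

Calling `success_rate(TASKS, scripted_agent)` scores the agent that completes both steps, while `success_rate(TASKS, lazy_agent)` scores the one that skips navigation. The gap between the two is the kind of signal a task-completion benchmark is meant to surface.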