
LMArena

A platform that ranks models by blind human votes on side-by-side responses.

Definition

LMArena, formerly Chatbot Arena, ranks models through blind side-by-side comparisons: a person sends a prompt, sees two anonymous responses, and votes for the better one. Votes are aggregated into an Elo-style rating, so a model's score reflects how often voters preferred it over the models it was paired against. It is widely cited as a measure of how models perform in open-ended use, complementing fixed benchmarks that test narrower skills.
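
To make "Elo-style rating" concrete, here is a minimal sketch of a classic online Elo update applied to pairwise votes. It is an illustration only, not LMArena's actual pipeline: the function names, the model names, the starting rating of 1000, the K-factor of 32, and the 400-point scale are all assumptions chosen for the example.

```python
from collections import defaultdict


def expected_score(rating_a, rating_b, scale=400.0):
    """Probability that A is preferred over B under the Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


def elo_ratings(votes, k=32.0, initial=1000.0):
    """Replay votes in order and return a dict of model -> rating.

    Each vote is (model_a, model_b, outcome), where outcome is
    "a", "b", or "tie". k and initial are illustrative defaults.
    """
    ratings = defaultdict(lambda: initial)
    outcome_to_score = {"a": 1.0, "b": 0.0, "tie": 0.5}
    for model_a, model_b, outcome in votes:
        score_a = outcome_to_score[outcome]
        exp_a = expected_score(ratings[model_a], ratings[model_b])
        # A gains what B loses, so the total rating mass is conserved.
        delta = k * (score_a - exp_a)
        ratings[model_a] += delta
        ratings[model_b] -= delta
    return dict(ratings)


# Hypothetical votes between three made-up models.
votes = [
    ("model-x", "model-y", "a"),
    ("model-y", "model-z", "a"),
    ("model-x", "model-z", "a"),
    ("model-x", "model-y", "tie"),
]
ratings = elo_ratings(votes)
for name, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {rating:.1f}")
```

An upset (a vote for the lower-rated model) moves ratings more than an expected result, which is what lets a ranking emerge from many individual comparisons. Online updates like this depend on the order of votes; fitting all votes at once with a statistical model is a common alternative that avoids that dependence.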