Evaluation
ARC
A multiple-choice science question benchmark split into Easy and Challenge sets.
Definition
ARC (the AI2 Reasoning Challenge) is a multiple-choice question-answering benchmark built from grade-school science exams. It is divided into an Easy set and a harder Challenge set. The Challenge set holds the questions that a retrieval-based baseline and a word co-occurrence baseline both answered incorrectly, so simple keyword matching and word-frequency tricks fail on them. The Challenge set, often reported as ARC-Challenge or ARC-C, became a standard part of language model evaluation suites and appears in most open evaluation toolkits.
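Because every ARC question has exactly one correct labeled choice, results are usually reported as plain accuracy. The sketch below shows that scoring under stated assumptions: the `Item` schema, the `accuracy` and `first_choice_baseline` helpers, and the two sample questions are all invented for illustration and are not taken from the dataset or from any evaluation toolkit.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """One ARC-style question (hypothetical schema, for illustration only)."""
    question: str
    choices: dict      # label -> answer text, e.g. {"A": "...", "B": "..."}
    answer_key: str    # label of the correct choice


def accuracy(items, predict):
    """Return the fraction of items where predict(item) gives the correct label."""
    if not items:
        return 0.0
    correct = sum(1 for item in items if predict(item) == item.answer_key)
    return correct / len(items)


def first_choice_baseline(item):
    """Trivial stand-in for a model: always pick the first listed label."""
    return next(iter(item.choices))


# Made-up questions in the style of a grade-school science exam.
items = [
    Item(
        "Which of these is a source of light?",
        {"A": "the Sun", "B": "a rock", "C": "a shadow", "D": "a mirror"},
        "A",
    ),
    Item(
        "Which state of matter keeps a fixed shape?",
        {"A": "gas", "B": "liquid", "C": "solid", "D": "plasma"},
        "C",
    ),
]

score = accuracy(items, first_choice_baseline)
print(f"accuracy = {score:.2f}")
```

In a real evaluation, `predict` would wrap a language model, for example by picking the choice to which the model assigns the highest likelihood. The Easy and Challenge sets would be scored separately, since they are reported as separate numbers.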