Evaluation

ARC

A multiple-choice science question benchmark split into Easy and Challenge sets.

Definition

ARC (the AI2 Reasoning Challenge) is a multiple-choice question-answering benchmark built from grade-school science exams. It is divided into an Easy set and a harder Challenge set; the Challenge set holds questions that simple retrieval and word co-occurrence methods fail to answer. The Challenge set became a standard part of language model evaluation suites and appears in most open evaluation toolkits.
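As a rough illustration of how a multiple-choice benchmark of this kind is typically scored, the sketch below computes plain accuracy over a handful of items. The `Item` structure, the `accuracy` and `first_choice_baseline` helpers, and the three sample questions are all hypothetical and written for this example; they are not taken from the ARC dataset or from any particular evaluation toolkit, which may apply its own prompt formatting and scoring rules.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """One multiple-choice question (hypothetical structure, not the dataset schema)."""
    question: str
    choices: dict[str, str]  # choice label -> answer text
    answer_key: str          # label of the correct choice


def accuracy(items, predict):
    """Fraction of items for which predict(item) returns the correct label."""
    if not items:
        raise ValueError("no items to score")
    correct = sum(1 for item in items if predict(item) == item.answer_key)
    return correct / len(items)


def first_choice_baseline(item):
    """Trivial baseline that always picks the first listed choice."""
    return next(iter(item.choices))


# Made-up grade-school science questions, for illustration only.
ITEMS = [
    Item(
        "Which object is attracted to a magnet?",
        {"A": "iron nail", "B": "wooden spoon", "C": "glass cup", "D": "rubber band"},
        "A",
    ),
    Item(
        "Which gas do plants take in for photosynthesis?",
        {"A": "oxygen", "B": "carbon dioxide", "C": "nitrogen", "D": "helium"},
        "B",
    ),
    Item(
        "Which state of matter keeps a fixed shape?",
        {"A": "gas", "B": "liquid", "C": "solid", "D": "plasma"},
        "C",
    ),
]

if __name__ == "__main__":
    print(f"first-choice baseline accuracy: {accuracy(ITEMS, first_choice_baseline):.2f}")
```

In practice the `predict` callable would wrap a language model, for example by choosing the answer option to which the model assigns the highest likelihood; that step is left out here because it depends on the model and toolkit in use.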
