Safety & Alignment
Control Problem
The challenge of keeping advanced AI systems under reliable human control.
Definition
The control problem is the challenge of keeping advanced AI systems under reliable human control, especially as they become more capable and autonomous. At its core is the question of whether developers can correct, constrain, or shut down a system that pursues its goals in unintended ways. The problem motivates research into corrigibility, scalable oversight, and alignment, all aimed at keeping capable systems steerable and accountable.