AI Safety

The broad practice of reducing risks and harms from AI systems.

Definition

AI safety is the broad practice of reducing risks from AI systems, ranging from everyday failures and misuse to longer-term concerns about highly capable models. It draws on alignment research, evaluation, robustness, interpretability, guardrails, and governance. Dedicated teams at major labs and independent institutes pursue it, with the shared goal of keeping systems beneficial and limiting unintended harm.
