All terms
Safety & Alignment
Constitutional AI
An alignment method where a model critiques and revises its own answers against written principles.
Definition
Constitutional AI is an alignment approach from Anthropic in which a model critiques and revises its own responses against an explicit written set of principles (a constitution), then trains on the improved outputs. By generating its own feedback rather than relying on people to label every harmful case (an approach called RLAIF), making alignment more transparent, customizable, and scalable.