Safety & Alignment
Explainability
How well a person can understand why an AI system produced a given output.
Definition
Explainability is the degree to which a person can understand why an AI system produced a particular output. Because many models are opaque, explainability methods try to surface the reasons behind a decision in terms a human can follow. This matters for building trust, debugging models, checking fairness, and complying with rules that give people a right to an explanation. Explainability is closely related to interpretability.
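As a rough illustration of what "surfacing the reasons behind a decision" can look like in practice, the sketch below uses a simple occlusion-style attribution: each input feature is replaced in turn by a baseline value, and the resulting change in the model's output is reported as that feature's contribution. This is only one of many possible approaches and is not prescribed by this entry. The names `occlusion_attribution` and `toy_credit_model`, the feature names, and the all-zero baseline are hypothetical choices made for the example.

```python
from typing import Callable, Dict

Features = Dict[str, float]


def occlusion_attribution(
    model: Callable[[Features], float],
    inputs: Features,
    baseline: Features,
) -> Dict[str, float]:
    """Estimate each feature's contribution by swapping it for a baseline value.

    The attribution for a feature is the model's output on the real inputs
    minus its output when only that feature is replaced by the baseline.
    """
    full_output = model(inputs)
    attributions = {}
    for name in inputs:
        perturbed = dict(inputs)          # copy so the original stays intact
        perturbed[name] = baseline[name]  # "occlude" one feature
        attributions[name] = full_output - model(perturbed)
    return attributions


def toy_credit_model(x: Features) -> float:
    """Hypothetical stand-in for an opaque model; treated as a black box."""
    return 0.5 * x["income"] - 2.0 * x["missed_payments"] + 0.25 * x["years_employed"]


# Hypothetical applicant and an all-zero baseline (an assumption, not a rule).
applicant = {"income": 4.0, "missed_payments": 3.0, "years_employed": 8.0}
baseline = {"income": 0.0, "missed_payments": 0.0, "years_employed": 0.0}

scores = occlusion_attribution(toy_credit_model, applicant, baseline)

# Report features from most to least influential, in plain language.
for name, value in sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True):
    direction = "raised" if value > 0 else "lowered"
    print(f"{name} {direction} the score by {abs(value):.2f}")
```

The output ranks features by how much they moved the score, which is the kind of human-readable summary an explanation aims for. The choice of baseline strongly affects the result, and for models with interacting features the per-feature numbers need not sum to the total output, so attributions of this kind are best read as approximate.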