Safety & Alignment

Privacy Risk

The chance that a model exposes, infers, or fabricates sensitive information about people.

Definition

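Privacy risk is the possibility that a model memorizes and reproduces personal information from its training data, infers sensitive attributes about individuals, or generates false but damaging claims about real people. Known attacks include training data extraction (prompting a model until it emits memorized training examples), membership inference (determining whether a specific record was part of the training set), and attribute inference (deducing a sensitive attribute of a person from the model's outputs). Mitigations include training with differential privacy, deduplicating training data, de-identifying personal data, and filtering outputs before they reach users.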
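As a rough illustration of two of the mitigations named above, the sketch below shows exact deduplication of training records and a simple output filter. The function names (`dedupe_exact`, `redact_output`) and the two regular expressions are assumptions made for this example. They catch only plain email addresses and US-style phone numbers, so treat this as a minimal sketch and not as a complete de-identification or filtering pipeline.

```python
import re

# Illustrative patterns only (assumed for this sketch); real PII detection
# needs much broader coverage than emails and US-style phone numbers.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def dedupe_exact(records):
    """Drop records that repeat after lowercasing and collapsing whitespace.

    Keeps the first occurrence of each record and preserves input order.
    """
    seen = set()
    unique = []
    for record in records:
        key = " ".join(record.lower().split())
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


def redact_output(text):
    """Replace matched emails and phone numbers with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text


if __name__ == "__main__":
    print(dedupe_exact(["Alice lives here.", "alice  lives here.", "Bob"]))
    print(redact_output("Contact jane.doe@example.com or 555-123-4567."))
```

Deduplication is applied to the training set before training, since repeated records are more likely to be memorized, while the output filter runs on generated text before it is shown to a user. Neither step provides the formal guarantee that differential privacy does.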