Safety & Alignment
Privacy Risk
The chance that a model exposes, infers, or fabricates sensitive information about people.
Definition
Privacy risk is the possibility that a model memorizes and reproduces personal information from its training data, infers sensitive attributes about individuals, or generates false but damaging claims about real people. Known attack classes include training data extraction (recovering memorized text verbatim), membership inference (determining whether a specific record was in the training set), and attribute inference (deducing undisclosed attributes of a person). Mitigations include training with differential privacy, deduplicating training data, de-identifying records, and filtering outputs before they reach users.
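As a rough illustration of the last mitigation, filtering outputs before they reach users, the sketch below redacts two kinds of identifiers from generated text. The function name `redact_pii` and both regular expressions are hypothetical choices made for this example. They cover only email addresses and simple ten-digit phone numbers, so a real filter would likely need much broader detection.

```python
import re

# Hypothetical patterns for illustration only; they are deliberately narrow
# and will miss many real-world formats of personal information.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def redact_pii(text: str) -> str:
    """Replace email addresses and simple phone numbers with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text


if __name__ == "__main__":
    model_output = "Contact jane.doe@example.com or 555-123-4567."
    print(redact_pii(model_output))
```

A pattern-based filter like this addresses only verbatim exposure of well-formatted identifiers. It does nothing about inferred attributes or fabricated claims, which is one reason it is usually listed alongside the training-time mitigations rather than in place of them.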