Data

De-Identification

Removing or obscuring identifying information so individuals cannot be recognized in data.

Definition

De-identification is the process of removing or obscuring information that could identify individuals in a dataset, such as names, addresses, and other personal details. Techniques include redaction, masking, generalization, and pseudonymization, which replaces identifiers with stand-in values. It is used to reduce privacy risk before data is shared or used for training. Some re-identification risk can remain, for example when the fields left in the data can be combined or linked with other sources to single out a person.
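
To make the four techniques concrete, here is a minimal Python sketch that applies each one to a single record. Everything in it is an assumption made for illustration: the field names, the email pattern, the ten-year age bands, the key, and the helper names (`redact`, `mask`, `generalize_age`, `pseudonymize`, `deidentify`) are hypothetical and do not come from any particular library or standard.

```python
import hashlib
import hmac
import re

# Hypothetical key for illustration only; a real key would be stored securely.
SECRET_KEY = b"example-key-not-for-production"

# Simplified email pattern (an assumption); real detectors cover many more cases.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def redact(text):
    """Redaction: remove the identifying value and leave a placeholder."""
    return EMAIL_RE.sub("[REDACTED]", text)


def mask(value, visible=4):
    """Masking: hide all but the last `visible` characters."""
    hidden = max(len(value) - visible, 0)
    return "*" * hidden + value[hidden:]


def generalize_age(age, width=10):
    """Generalization: replace an exact age with a coarser range."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def pseudonymize(identifier, key=SECRET_KEY):
    """Pseudonymization: replace an identifier with a consistent stand-in.

    A keyed hash (HMAC-SHA256) maps the same input to the same token,
    so records can still be linked without exposing the original value.
    """
    digest = hmac.new(key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:12]


def deidentify(record):
    """Apply one technique per field of a hypothetical record."""
    return {
        "user": pseudonymize(record["name"]),
        "phone": mask(record["phone"]),
        "age_range": generalize_age(record["age"]),
        "note": redact(record["note"]),
    }


record = {
    "name": "Jane Doe",
    "phone": "555-0142",
    "age": 37,
    "note": "Contact at jane@example.com",
}
clean = deidentify(record)
```

In this sketch the pseudonym stays the same for the same name and key, which keeps records linkable. That is also why pseudonymized data can still carry re-identification risk: anyone who holds the key, or who can link the remaining fields to another source, may be able to recover who a record describes.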