De-Identification
Removing or obscuring identifying information so individuals cannot be recognized in data.
Definition
De-identification is the process of removing or obscuring information that could identify individuals in a dataset, such as names, addresses, and other personal details. Techniques include redaction, masking, generalization, and pseudonymization, which replaces identifiers with stand-in values. It is used to reduce privacy risk before data is shared or used for training, though some risk of re-identification can remain if the process is not done carefully.
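As a rough illustration of the four techniques named above, here is a minimal Python sketch. The function names, the sample record, the email pattern, and the `SECRET_KEY` value are all hypothetical and chosen only for this example; a real pipeline would need to cover many more identifier types and would keep the key in secure storage.

```python
import hashlib
import hmac
import re

# Hypothetical key for this example only; a real key would be kept secret.
SECRET_KEY = b"example-key-not-for-production"

# Simplified pattern for email-like strings (an assumption, not a full validator).
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def redact(text):
    """Redaction: replace email-like substrings with a fixed placeholder."""
    return EMAIL_PATTERN.sub("[REDACTED]", text)


def mask(value, visible=4):
    """Masking: hide all but the last `visible` characters."""
    hidden = max(len(value) - visible, 0)
    return "*" * hidden + value[hidden:]


def generalize_age(age, width=10):
    """Generalization: replace an exact age with a coarser range."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def pseudonymize(identifier):
    """Pseudonymization: map an identifier to a consistent stand-in value."""
    digest = hmac.new(SECRET_KEY, identifier.encode("utf-8"), hashlib.sha256)
    return "user_" + digest.hexdigest()[:12]


# Hypothetical record used only to demonstrate the functions.
record = {
    "name": "Jane Doe",
    "age": 37,
    "phone": "555-0142",
    "note": "Contact at jane@example.com",
}

deidentified = {
    "name": pseudonymize(record["name"]),   # stable stand-in, same input gives same output
    "age": generalize_age(record["age"]),   # "30-39"
    "phone": mask(record["phone"]),         # "****0142"
    "note": redact(record["note"]),         # "Contact at [REDACTED]"
}
```

Even after these steps, combinations of the remaining fields (for example an age range plus a partial phone number) may still narrow down who a record belongs to, which is the residual re-identification risk the definition mentions.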