Skip to main content
All terms
Safety & Alignment

Data Poisoning

Corrupting training or retrieval data to influence a model's later behavior.

Definition

Data poisoning is an attack that corrupts the data a model learns from or retrieves, inserting crafted examples to influence its behavior. It can plant hidden triggers, degrade accuracy, or steer outputs toward an attacker's goal. Because models absorb patterns from large, loosely curated corpora, careful data curation and filtering are common defenses.