All terms
Safety & Alignment
Data Poisoning
Corrupting training or retrieval data to influence a model's later behavior.
Definition
Data poisoning is an attack that corrupts the data a model learns from or retrieves, inserting crafted examples to influence its behavior. It can plant hidden triggers, degrade accuracy, or steer outputs toward an attacker's goal. Because models absorb patterns from large, loosely curated corpora, careful data curation and filtering are common defenses.