All terms
Safety & Alignment
Indirect Prompt Injection
Prompt injection that arrives through retrieved or external content rather than the user.
Definition
Indirect prompt injection is an attack where malicious instructions reach a model through external content it processes — such as a web page, document, or email — rather than from the user directly. When an agent retrieves and trusts that content, the hidden instructions can hijack its behavior. It is a particular risk for retrieval-augmented and tool-using systems that act on data they fetch.