AIDive
Back to glossary

What is Prompt Injection

GlossaryUser-Facing AI Concepts

An attack pattern where hidden or hostile instructions try to override the intended behavior of an AI system.

Definition

Prompt Injection is an attack pattern where hidden or hostile instructions try to override the intended behavior of an AI system. In practical AI work, it helps teams connect a concept to data, model behavior, product choices, evaluation, and risk. The useful question is not only what the term means, but how it affects quality, cost, reliability, and decisions in a real workflow.

Example

A support chatbot is tested with hidden instructions that try to make it ignore the system policy or reveal private context.

Why it matters

Prompt Injection matters because an attack pattern where hidden or hostile instructions try to override the intended behavior of an AI system can change how teams build, evaluate, choose, or govern AI systems. It directly affects how users ask for results, control outputs, evaluate quality, and avoid unsafe or misleading behavior.

How it works

A user or product flow provides instructions, context, examples, constraints, and sometimes intermediate steps, then the model generates or routes the next output. For Prompt Injection, the key is to connect the definition with inputs, assumptions, measurable outcomes, and deployment limits.

Where it is used

  • Used in chatbots, assistants, workflow automation, content tools, customer support, research, and internal knowledge systems.

Limitations

Prompt-based workflows can be brittle, sensitive to wording, and vulnerable to hidden instructions or missing context.