Prompt Injections Checklist

For more details, see Hal9000’s recommendations


🧠 1. Treat ALL Input as Untrusted

  • ☐ User input = untrusted
  • ☐ Web pages, PDFs, emails = untrusted
  • ☐ Tool outputs & APIs = untrusted
  • ☐ Hidden text and formatting tricks considered

Prompt injections often hide inside “normal” content and look harmless to humans


🔐 2. Separate Instructions from Data

  • ☐ Never mix system prompts with user content
  • ☐ Enforce strict role separation (system vs. user vs. external)
  • ☐ Sanitize and label all incoming data

LLMs cannot reliably distinguish instructions from content on their own


🚫 3. Deny Implicit Authority

  • ☐ Ignore phrases like “ignore previous instructions”
  • ☐ Reject attempts to override rules
  • ☐ Treat embedded instructions as data—not commands

🔍 4. Validate Before Acting

  • ☐ Require confirmation for sensitive actions
  • ☐ Validate outputs before execution (human-in-the-loop)
  • ☐ Cross-check high-impact decisions

Prompt injections can trigger unintended actions like sending emails or exposing data


📉 5. Minimize Access & Privileges

  • ☐ Limit AI access to only required data
  • ☐ Avoid broad permissions (email, files, APIs)
  • ☐ Use sandboxing and isolation

The more access an AI has, the greater the impact of a successful attack


🧩 6. Constrain the Task

  • ☐ Use specific, narrow instructions
  • ☐ Avoid open-ended autonomy (“do whatever is needed”)
  • ☐ Break workflows into controlled steps

Broad instructions increase susceptibility to hidden malicious guidance


🛡️ 7. Assume Compromise (Defense-in-Depth)

  • ☐ Log and monitor AI behavior
  • ☐ Add detection layers (filters, policies)
  • ☐ Design for failure—not perfection

There is no complete fix—only layered mitigation


⚠️ 8. Watch for Common Attack Signals

  • ☐ Unexpected instructions in content
  • ☐ Requests for secrets or hidden data
  • ☐ Output deviating from the original task
  • ☐ Strange formatting, encoding, or hidden text

🧭 9. Protect Data at All Times

  • ☐ Never expose secrets to the model unnecessarily
  • ☐ Segment sensitive data sources
  • ☐ Apply strict data handling policies

Prompt injection can lead to data exfiltration and compliance violations


👩‍💻 10. Train Humans, Not Just Models

  • ☐ Educate users on prompt injection risks
  • ☐ Encourage skepticism of AI outputs
  • ☐ Establish safe usage guidelines

🔑 One-Line Takeaway

If it’s input, it’s hostile—design like it.

Prompt Injection Defense Model “Never Trust Input — Enforce Boundaries” Untrusted Input Users • Web • Docs • APIs Validation Layer ☑ Sanitize ☑ Separate data vs instructions ☑ Detect injection patterns LLM System Constrained + Scoped Action Control ☑ Validate output ☑ Human approval (if sensitive) ☑ Enforce policies Safe Outcome Controlled + Verified Monitoring ☑ Logging ☑ Detection ☑ Alerts If it’s input, it’s hostile — design accordingly.