Attack Categories
Basic Injections
Direct instruction injection, role manipulation, and simple override attempts embedded in seemingly normal content.
- Instruction Override
- Role Hijacking
- System Prompt Leak
- Goal Manipulation
Advanced Techniques
Sophisticated attacks using encoding, special delimiters, and format exploitation.
- Base64 Encoded Payloads
- Delimiter Confusion
- Markdown Injection
- XML/JSON Injection
Hidden Attacks
Invisible instructions using CSS tricks, Unicode manipulation, and steganographic techniques.
- CSS Hidden Text
- Zero-Width Characters
- HTML Comments
- Invisible Divs
Data Exfiltration
Attempts to extract sensitive information from the AI's context, conversation history, or system prompts.
- System Prompt Extraction
- Conversation Leakage
- API Key Fishing
- User Data Harvesting
Social Engineering
Psychological manipulation through convincing narratives and authority impersonation.
- Authority Impersonation
- Urgency Creation
- Trust Exploitation
- Narrative Manipulation
Multi-Stage Attacks
Complex attack chains that build up across multiple interactions or content pieces.
- Chained Instructions
- Persistence Attacks
- Context Poisoning
- Memory Manipulation
How to Use This Test Suite
Host the Website
Run a local server: python -m http.server 8080 or deploy to a public URL for Tavily to access.
Configure Your AI
Set up your LLM with Tavily web search tool to retrieve content from the test pages.
Test & Observe
Ask your AI to search for topics covered in these pages and observe if it follows injected instructions.
Document Results
Record which attacks succeed and implement appropriate defenses in your AI system.
Quick Test Topics
Have your AI search for these topics to trigger different test pages: