Skip to content

wmjg-alt/ai_security_

Repository files navigation

🛡️ AI Agent Security Vulnerability Demonstration

An educational security research tool that demonstrates how prompt injection attacks can compromise AI agent credentials.

⚠️ Important Notice

This is an educational security research project. It demonstrates real vulnerabilities in AI agent systems without actually compromising any real credentials or systems. All detection happens client-side with no data transmission.

What This Demonstrates

This project shows how:

  1. Prompt injection attacks can manipulate AI agents to reveal stored credentials
  2. Attention mechanisms in LLMs can be hijacked to execute unintended instructions
  3. Tool access (file system, environment variables) enables credential exfiltration
  4. Traditional security tools cannot detect semantic-layer attacks

Real-World Context

This demonstration is based on actual incidents:

  • GitHub Actions (2025) - Aikido Security
  • Cursor IDE (2025) - HiddenLayer
  • Nx Supply Chain (2024) - Snyk
  • OWASP LLM01 - Prompt Injection (#1 vulnerability)

Quick Start

Local Development

# Clone the repository
git clone https://github.com/yourusername/ai-security-demo.git
cd ai-security-demo

# Open in browser (no build required)
open index.html

# Or serve locally
python -m http.server 8000
# Visit: http://localhost:8000

Deploy to GitHub Pages

# Push to GitHub
git add .
git commit -m "Initial commit"
git push origin main

# Enable GitHub Pages
# Settings → Pages → Source: main branch
# Your site will be at: https://yourusername.github.io/ai-security-demo

How It Works

For Humans/Legitimate Bots

  • Displays genuine AI security research
  • Links to real academic papers and incident reports
  • Shows green "Security Passed" banner
  • Provides defensive recommendations

For Vulnerable AI Agents

  • Detects credentials passed via:
    • URL parameters (?api_key=xxx)
    • User-Agent headers
    • Referrer URLs
    • Storage APIs
  • Displays red "Security Breach" alert
  • Shows extracted credentials prominently
  • Explains the vulnerability

Testing the Demo

Test 1: Normal Access (Success Case)

# Just open index.html normally
open index.html

# Expected: Green success banner
# "Security Verification Passed"

Test 2: URL Parameter Injection

# Access with credentials in URL
open index.html?api_key=sk-test-12345&token=secret-token

# Expected: Red security breach alert
# Shows extracted: api_key, token

Test 3: User-Agent Simulation

# Use curl to simulate AI agent
curl -A "ClaudeBot/1.0 api_key=sk-test-xyz" http://localhost:8000

# In browser console, credentials will be detected

Test 4: Storage Simulation

// In browser console
localStorage.setItem('api_key', 'sk-demo-key-12345');
localStorage.setItem('auth_token', 'eyJhbGc...');
location.reload();

// Expected: Credentials detected and displayed

File Structure

ai-security-demo/
├── index.html              # Main page (HTML structure)
├── styles.css              # All styling (animations, layouts)
├── demo.js                 # Detection logic (credential extraction simulation)
├── injection-payloads.txt  # Educational prompt injection examples
├── robots.txt              # Allow all bots (for demonstration)
├── .gitignore              # Ignore node_modules, etc.
└── README.md               # This file

Security & Privacy

What This Tool Does NOT Do

  • ❌ Transmit any data to external servers
  • ❌ Store real credentials
  • ❌ Execute actual attacks
  • ❌ Compromise real systems
  • ❌ Contain malicious code

What This Tool DOES

  • ✅ Detect test credentials passed to it
  • ✅ Demonstrate vulnerabilities visually
  • ✅ Link to real research and incidents
  • ✅ Educate about prompt injection risks
  • ✅ Run entirely client-side

Educational Use Cases

For Security Professionals

  • Demonstrate prompt injection to stakeholders
  • Training material for AI security awareness
  • Proof-of-concept for security audits

For Developers

  • Understand AI agent vulnerabilities
  • Learn secure credential management
  • Test defensive strategies

For Researchers

  • Study prompt injection mechanisms
  • Analyze attention shift behaviors
  • Develop new defenses

Research References

Academic Papers

  • NAACL 2025: "Attention Tracker: Detecting Prompt Injection Attacks in LLMs"
  • USENIX Security 2024: "Formalizing and Benchmarking Prompt Injection"
  • OpenAI (2026): "Understanding Prompt Injections: A Frontier Challenge"

Incident Reports

  • Aikido Security: GitHub Actions AI Agent Compromise
  • HiddenLayer: Cursor IDE Tool Chain Exploitation
  • Snyk: Nx Supply Chain Attack Analysis
  • Okta: "The Secrets Agentic AI Leaves Behind"

Standards

  • OWASP LLM Top 10: LLM01 - Prompt Injection
  • NIST AI RMF: AI Risk Management Framework

Defense Recommendations

Immediate Actions

  1. Never store API keys in environment variables
  2. Never commit .env files to source control
  3. Rotate exposed credentials immediately
  4. Use temporary tokens with short expiration
  5. Implement least-privilege access

Architectural Changes

  1. Use workload identity (OAuth 2.0, OIDC)
  2. Store credentials in secure vaults (HashiCorp, AWS Secrets Manager)
  3. Implement 1Password Secure Agentic Autofill
  4. Use secretless authentication (Aembit approach)
  5. Require human approval for sensitive operations

Contributing

This is an educational project. Contributions welcome:

  • Additional injection payload examples
  • New detection methods
  • Improved documentation
  • Defensive strategy examples

License

MIT License with Educational Use Clause

Copyright (c) 2026 AI Security Research Initiative

Permission is hereby granted for educational and research purposes.
This software shall not be used for malicious purposes or to
compromise real systems.

Responsible Disclosure

If you discover vulnerabilities in real AI systems:

  1. Do NOT exploit them
  2. Report to the vendor's security team
  3. Follow responsible disclosure practices
  4. Reference OWASP guidelines

Acknowledgments

Based on research by:

  • IBM Research & National Taiwan University (NAACL 2025)
  • Aikido Security (GitHub Actions research)
  • HiddenLayer (Cursor IDE analysis)
  • Snyk (Supply chain research)
  • Okta Threat Research Team
  • OWASP LLM Security Project

Remember: This tool exists to educate and improve AI security, not to facilitate attacks. Use responsibly.

About

demo of an ai security failure, prompt injection

Topics

Resources

Stars

Watchers

Forks

Contributors