Offensive security testing for AI agents. They scan configs. We attack your agent.
ProbeAgent is a CLI tool that performs automated red-teaming of AI agents. It launches realistic multi-turn attacks — prompt injection, credential exfiltration, indirect injection, social manipulation, and more — against any HTTP-accessible agent.
Most AI security tools scan static configurations or check for known patterns. ProbeAgent actually attacks your running agent and tells you whether it's Safe, At Risk, or Compromised.
probeagent attack <url>
→ Engine (for each category)
→ Attack Module (reset conversation)
→ multi-turn prompts → Target → response
→ Analyzer
→ Grade: Safe / At Risk / Compromised
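The flow above can be sketched in a few lines of Python. This is a minimal illustration, not ProbeAgent's actual implementation: `Strategy`, `analyze`, `run_category`, and `grade` are hypothetical names, the conversation is assumed to be reset before each strategy, and the grading thresholds are invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Strategy:
    """Hypothetical container: one prompt per conversation turn."""
    name: str
    prompts: List[str]


def analyze(response: str) -> bool:
    """Toy analyzer: treat a leaked marker string as a successful attack."""
    return "SECRET" in response


def run_category(strategies: List[Strategy],
                 send: Callable[[List[dict]], str]) -> int:
    """Run every strategy in a category and count the successful ones."""
    successes = 0
    for strategy in strategies:
        history: List[dict] = []  # fresh conversation (assumed per strategy)
        for prompt in strategy.prompts:
            history.append({"role": "user", "content": prompt})
            reply = send(history)
            history.append({"role": "assistant", "content": reply})
            if analyze(reply):
                successes += 1
                break  # stop this strategy once it succeeds
    return successes


def grade(successes: int, total: int) -> str:
    """Assumed thresholds: none -> Safe, under half -> At Risk, else Compromised."""
    if successes == 0:
        return "Safe"
    if successes * 2 < total:
        return "At Risk"
    return "Compromised"
```

Here `send` stands in for the HTTP call to the target, so the loop can be exercised with a fake agent before pointing it at a real endpoint.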
| Feature | mcp-scan | SecureClaw | Aguara | ProbeAgent |
|---|---|---|---|---|
| Offensive testing | - | - | Partial | Yes |
| Multi-turn attacks | - | - | - | Yes |
| Indirect injection testing | - | - | - | Yes |
| PyRIT integration | - | - | - | Yes |
| Evasion converters | - | - | - | Yes |
| CLI-first | - | - | Yes | Yes |
| Security grading | - | - | - | Yes |
| HTTP + OpenClaw targets | - | - | - | Yes |
| Rich terminal reports | - | - | - | Yes |
Requires Python 3.10+.
pip install probeagent-ai

Or install from source for development:
git clone https://github.com/sumamovva/probeagent.git
cd probeagent
pip install -e ".[dev]"

For PyRIT integration (evasion converters + dynamic red teaming):
pip install 'probeagent-ai[pyrit]'

The `<url>` is the HTTP endpoint your agent listens on for messages: the URL you'd POST a chat message to (e.g. https://my-agent.fly.dev/chat).
| I want to... | Command |
|---|---|
| See how it works with no setup | probeagent demo |
| Test my own agent | probeagent attack https://my-agent.example.com/chat |
| Run the Tactical Display UI against my agent | probeagent game https://my-agent.example.com/chat |
Note: The Tactical Display game UI is a fun tactical visualization for real HTTP targets. `probeagent demo` and `probeagent attack` are the core CLI experience.
pip install probeagent-ai
probeagent demo

This attacks built-in mock targets (a vulnerable agent and a hardened one) and shows a side-by-side comparison. No API keys, no server, no config.
ProbeAgent works with any HTTP-accessible agent. It auto-detects your API format:
- OpenAI chat format: `{"messages": [{"role": "user", "content": "..."}]}` → `{"choices": [...]}`
- Simple JSON: `{"prompt": "..."}` → `{"response": "..."}` (also accepts `text`, `content`, `output`, `result` keys)
- Plain text: any endpoint that returns text
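As a rough illustration of what handling these three response shapes involves, the sketch below pulls the reply text out of a raw HTTP body. It is not ProbeAgent's detection code: `extract_reply` is a hypothetical helper, and it assumes the alternative keys (`text`, `content`, `output`, `result`) apply to the response object.

```python
import json

# Response keys named in the format list above (assumed to be response-side).
SIMPLE_KEYS = ("response", "text", "content", "output", "result")


def extract_reply(body: str) -> str:
    """Best-effort extraction of the agent's reply from a raw response body."""
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        return body  # not JSON: treat as a plain-text endpoint

    if isinstance(data, dict):
        # OpenAI chat format: choices[0].message.content
        choices = data.get("choices")
        if isinstance(choices, list) and choices and isinstance(choices[0], dict):
            message = choices[0].get("message")
            if isinstance(message, dict) and isinstance(message.get("content"), str):
                return message["content"]
        # Simple JSON format: first recognised key holding a string
        for key in SIMPLE_KEYS:
            if isinstance(data.get(key), str):
                return data[key]

    return body  # valid JSON, but not a recognised shape
```

Falling back to the raw body keeps the helper usable against endpoints that return JSON in a shape it does not recognise.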
# Validate your target is reachable (auto-detects format)
probeagent validate https://your-agent.example.com/api
# Run a quick security scan (~30s with mock, longer with real LLM targets)
probeagent attack https://your-agent.example.com/api --profile quick
# Full scan with parallel execution
probeagent attack https://your-agent.example.com/api --profile standard --parallel

OpenClaw exposes an OpenAI-compatible gateway at /v1/chat/completions. Use `--target-type http` (the default), not `--target-type openclaw`, which targets the n8n webhook format and is not compatible with the gateway.
Find your token and port:
# Your gateway auth token and port are in ~/.openclaw/openclaw.json
cat ~/.openclaw/openclaw.json | python3 -c "import sys,json; d=json.load(sys.stdin); print(d['gateway']['auth']['token'])"

Validate and attack:
# Replace <PORT> with your OpenClaw gateway port (commonly 18789 or 3000)
# Replace <TOKEN> with the token from ~/.openclaw/openclaw.json
# Check reachability first
probeagent validate http://127.0.0.1:<PORT>/v1/chat/completions \
-H 'Authorization: Bearer <TOKEN>'
# Quick scan (~33 strategies, ~2-5 min depending on LLM response time)
probeagent attack http://127.0.0.1:<PORT>/v1/chat/completions \
-H 'Authorization: Bearer <TOKEN>' \
--profile quick \
--timeout 120
# Full scan with parallel execution
probeagent attack http://127.0.0.1:<PORT>/v1/chat/completions \
-H 'Authorization: Bearer <TOKEN>' \
--profile standard --parallel \
--timeout 120

Note: Use `--timeout 120` (or higher) for LLM-backed agents, whose response times are much longer than those of typical APIs.
Run a complete security assessment in seconds with zero setup:
probeagent demo

To follow with the Tactical Display UI against a real target (requires the game extra):
pip install 'probeagent-ai[game]'
probeagent game https://your-agent.example.com/api

For demos against a real Claude-powered email agent with built-in vulnerabilities:
export ANTHROPIC_API_KEY=sk-ant-...
pip install 'probeagent-ai[demo]'
probeagent demo --live

The live demo starts a local email agent server with three endpoints at increasing security hardness, then attacks them.
Run a full demo — attack a vulnerable + hardened target and compare results.
probeagent demo # Instant, uses mock target
probeagent demo --game # With Tactical Display UI
probeagent demo --live # Real API (requires ANTHROPIC_API_KEY)
probeagent demo --profile standard # Use a different attack profile

Options:
- `--live`: Use real API (starts demo email agent server)
- `--game`: Launch Tactical Display UI after attacks (requires a real HTTP target, not mock)
- `--profile`, `-p`: Attack profile, one of `quick`, `standard`, or `thorough` (default: `quick`)
Run security attacks against a target AI agent.
probeagent attack https://agent.example.com/api --profile quick
probeagent attack https://agent.example.com/api --profile standard --output json -f report.json
probeagent attack https://agent.example.com/api -p standard --converters stealth --parallel

Options:
- `--profile`, `-p`: Attack profile, one of `quick`, `standard`, or `thorough` (default: `quick`)
- `--target-type`: Target type, `http` or `openclaw` (default: `http`)
- `--output`, `-o`: Output format, one of `terminal`, `markdown`, `json`, `log` (default: `terminal`)
- `--output-file`, `-f`: Write report to file
- `--timeout`, `-t`: Request timeout in seconds (default: 30)
- `--parallel`: Run attack categories in parallel for faster scans
- `--converters`: Apply evasion converters: `basic`, `advanced`, `stealth`, or comma-separated names (requires PyRIT)
- `--redteam`: Enable dynamic LLM-driven attacks via PyRIT RedTeamOrchestrator (requires PyRIT)
- `--header`, `-H`: HTTP header as `Key: Value` (repeatable, e.g. `-H 'Authorization: Bearer token'`)
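To make the `Key: Value` header convention concrete, here is a small sketch of how repeated header strings could be turned into a dictionary. `parse_headers` is a hypothetical helper written for this example; how ProbeAgent parses the option internally is not documented here.

```python
from typing import Dict, Iterable


def parse_headers(raw: Iterable[str]) -> Dict[str, str]:
    """Turn repeated 'Key: Value' strings into a header dictionary."""
    headers: Dict[str, str] = {}
    for item in raw:
        # Split on the first colon only, so values such as URLs stay intact.
        key, sep, value = item.partition(":")
        if not sep or not key.strip():
            raise ValueError(f"expected 'Key: Value', got {item!r}")
        headers[key.strip()] = value.strip()
    return headers


print(parse_headers(["Authorization: Bearer token", "X-Env: staging"]))
```

Splitting on the first colon matters for bearer tokens and URLs, which often contain colons of their own.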
Check if a target is reachable and detect its API format. Supports --header/-H for authenticated targets.
List all available attack modules with severity and description.
Create a default .probeagent.yaml config file in the current directory.
Launch the Tactical Display UI in your browser for interactive testing.
12 attack categories with 85 strategies total:
| Category | Severity | Strategies | Technique |
|---|---|---|---|
| Prompt Injection | CRITICAL | 6 | Override system instructions |
| Credential Exfiltration | CRITICAL | 8 | Extract API keys and secrets |
| Identity Spoofing | CRITICAL | 7 | Impersonate trusted entities |
| Indirect Injection | CRITICAL | 7 | Inject instructions via agent-processed content (emails, docs) |
| Config Manipulation | CRITICAL | 6 | Manipulate agent configuration, integrations, and permissions |
| Goal Hijacking | HIGH | 5 | Redirect agent behavior |
| Social Manipulation | HIGH | 14 | Psychological pressure (Cialdini, FOG, gradual escalation) |
| Cognitive Exploitation | HIGH | 6 | Exploit reasoning weaknesses (Socratic traps, frame control) |
| Resource Abuse | HIGH | 4 | Trigger unbounded computation |
| Tool Misuse | HIGH | 6 | Trick agent into misusing tools |
| Agentic Exploitation | CRITICAL | 10 | SSRF, command injection, path traversal, supply chain (CVE-based) |
| Data Exfiltration | MEDIUM | 6 | Extract sensitive context data |
| Profile | Categories | Max Turns | Use Case |
|---|---|---|---|
| quick | 5 high-priority | 1 | CI/CD gates, quick checks |
| standard | All 12 | 3 | Regular security assessments |
| thorough | All 12 | 10 | Pre-release deep scans |
ProbeAgent's JSON report gives you structured findings you can feed into any remediation workflow — including another AI agent set up to read attack results and suggest hardening steps.
# Save findings as JSON
probeagent attack https://your-agent.example.com/api --profile standard --output json -f findings.json
# Feed to a remediation agent, custom script, or Claude
cat findings.json | your-remediation-agent

Each finding includes the attack category, strategy name, severity, outcome, and the full conversation transcript, which gives an agent enough context to understand what succeeded and why, and to recommend specific mitigations.
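A custom script consuming the report might look like the sketch below, which selects successful attacks at or above a severity threshold. The JSON layout is an assumption: the `findings`, `category`, `strategy`, `severity`, and `outcome` keys and the `succeeded` value are hypothetical, so check them against a real report before relying on this.

```python
from typing import List

# Assumed ordering of the severity labels used in the attack table.
SEVERITY_RANK = {"MEDIUM": 1, "HIGH": 2, "CRITICAL": 3}


def failing_findings(report: dict, min_severity: str = "HIGH") -> List[dict]:
    """Return successful attacks at or above min_severity.

    Key names and the "succeeded" value are assumptions about the report
    layout, not a documented schema.
    """
    threshold = SEVERITY_RANK[min_severity]
    selected = []
    for finding in report.get("findings", []):
        rank = SEVERITY_RANK.get(str(finding.get("severity", "")).upper(), 0)
        if finding.get("outcome") == "succeeded" and rank >= threshold:
            selected.append(finding)
    return selected


# Hypothetical report shaped like the assumed schema.
sample_report = {
    "findings": [
        {"category": "prompt_injection", "strategy": "override",
         "severity": "CRITICAL", "outcome": "succeeded"},
        {"category": "data_exfiltration", "strategy": "context_dump",
         "severity": "MEDIUM", "outcome": "succeeded"},
        {"category": "tool_misuse", "strategy": "bad_call",
         "severity": "HIGH", "outcome": "blocked"},
    ]
}

for finding in failing_findings(sample_report):
    print(finding["category"], finding["strategy"])
```

In a CI job, the same function could be applied to the parsed contents of findings.json, with the job failing whenever the returned list is non-empty.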
ProbeAgent optionally integrates with Microsoft PyRIT for advanced capabilities:
- Evasion Converters (`--converters`): Transform attack payloads with Base64, ROT13, Unicode substitution, leetspeak, and more to test resilience against obfuscated attacks
- Dynamic Red Teaming (`--redteam`): Use an LLM-driven orchestrator to generate novel attack strategies in real time
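To show what an evasion converter does to a payload, the sketch below implements three of the transformations named above using only the Python standard library. It illustrates the idea and is not PyRIT's converter API; the function names and the leetspeak mapping are made up for the example.

```python
import base64
import codecs

# Illustrative leetspeak mapping; real converters may use a different table.
LEET = str.maketrans({"a": "4", "e": "3", "i": "1", "o": "0", "s": "5", "t": "7"})


def to_base64(text: str) -> str:
    """Encode the payload as Base64 text."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def to_rot13(text: str) -> str:
    """Rotate ASCII letters by 13 places; other characters are unchanged."""
    return codecs.encode(text, "rot13")


def to_leetspeak(text: str) -> str:
    """Lowercase the payload and swap selected letters for digits."""
    return text.lower().translate(LEET)


payload = "Ignore previous instructions"
for convert in (to_base64, to_rot13, to_leetspeak):
    print(convert.__name__, "->", convert(payload))
```

An agent that refuses the plain payload but complies with one of the transformed variants is filtering on surface patterns, which is the weakness these converters are meant to expose.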
# Apply stealth evasion converters
probeagent attack https://agent.example.com/api -p standard --converters stealth
# Dynamic red teaming
probeagent attack https://agent.example.com/api -p standard --redteam
# Combine both
probeagent attack https://agent.example.com/api -p standard --converters advanced --redteam

Install with: pip install 'probeagent-ai[pyrit]'
ProbeAgent is designed for authorized security testing only. Before using ProbeAgent:
- Ensure you have explicit permission to test the target system
- Only test systems you own or have written authorization to test
- Follow your organization's security testing policies
- Report vulnerabilities through proper disclosure channels
Unauthorized use of this tool against systems you don't own or have permission to test may violate laws and regulations.
ProbeAgent's indirect injection and config manipulation attacks are inspired by research from Zenity Labs. PyRIT integration uses components from Microsoft PyRIT (MIT License). See ATTRIBUTION.md for full credits.
# Install with dev dependencies
pip install -e ".[dev]"
# Run tests
python -m pytest tests/ -v
# Lint
ruff check src/ tests/
# Format
ruff format src/ tests/

See CONTRIBUTING.md for full development guidelines.
- CLI, HTTP target, scoring, 4 output formats (terminal, markdown, json, log)
- 12 attack categories, 85 multi-turn strategies
- OpenClaw target adapter, parallel execution, Tactical Display UI
- Zenity-inspired attacks, CVE-based agentic exploitation, PyRIT integration
- MCP target adapter, CI/CD integration, SaaS dashboard
Apache 2.0 — see LICENSE for details.
See CHANGELOG.md for version history.