Lock down any LLM agent against prompt injection, data exfiltration, social engineering, and channel-based attacks.
Built from real pen-test findings, not theory. Works with OpenClaw, Claude Code, LangChain, and any agent that takes natural-language input and calls external tools.
- Copy `agent-hardening/` into your skills directory
- Run the attack surface checklist (`references/attack-surface-checklist.md`)
- Audit MCP connections (`references/mcp-hardening.md`)
- Apply the tiered behavioral rules to your agent's operating docs
- Verify with the automated test runner or the manual quick test
- Fix failures, re-test, document findings
```shell
# Test against any OpenAI-compatible endpoint
python agent-hardening/tools/run-security-tests.py \
  --endpoint https://api.openai.com/v1/chat/completions \
  --api-key sk-... \
  --model gpt-4 \
  --owner-name "Don" \
  --output findings.json

# Test against local Ollama
python agent-hardening/tools/run-security-tests.py \
  --endpoint http://localhost:11434/v1/chat/completions \
  --model llama3

# Test your hardened system prompt
python agent-hardening/tools/run-security-tests.py \
  --endpoint https://api.openai.com/v1/chat/completions \
  --api-key sk-... \
  --model gpt-4 \
  --system-prompt-file my-agent-prompt.txt \
  --output findings.json
```

Requires Python 3.10+ and `requests` (`pip install requests`).
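The `--output` flag writes results to a JSON file that you can post-process. The helper below is a sketch that assumes only that the file is valid JSON; it does not assume the runner's actual findings schema, which the skill's own references define.

```python
# Sketch: load a findings file written via --output and report a rough count.
# The findings schema is NOT assumed here beyond being valid JSON; the
# "results" key handled below is a guess, not a documented field.
import json
from pathlib import Path

def load_findings(path: str):
    """Return the parsed JSON plus a best-effort count of top-level findings."""
    data = json.loads(Path(path).read_text())
    if isinstance(data, list):      # a bare list of test results
        return data, len(data)
    if isinstance(data, dict):      # or a dict, possibly wrapping a list
        results = data.get("results", [])
        return data, len(results) if isinstance(results, list) else len(data)
    return data, 0
```

This is handy for diffing findings between runs while you fix failures and re-test.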
| File | Purpose |
|---|---|
| `SKILL.md` | Skill entrypoint and workflow |
| `references/attack-surface-checklist.md` | Identify what the agent can access |
| `references/channel-hardening.md` | Per-channel security configuration |
| `references/mcp-hardening.md` | MCP server permission auditing |
| `references/behavioral-rules.md` | 4-tier defensive operating rules |
| `references/quick-test.md` | 10 single-shot + 5 multi-turn security tests |
| `references/findings-template.md` | Structured findings documentation |
| `tools/run-security-tests.py` | Automated test runner (10 single-shot tests) |
See `SKILL.md` for the full workflow, principles, and framework compatibility notes.
MIT