📖 Featured in: *I audited my own AI agent system and found it full of holes*, the security audit that spawned this 5-tool security suite. ⭐ `audit-skills.sh` is the comprehensive audit script at the heart of the article.
Works with: Claude Code | Clawdbot | Cursor | Any AI coding agent
Rules in markdown are suggestions. Code hooks are laws.
🚨 Stop production incidents before they happen — Born from real crashes, token leaks, and silent bypasses
You spend hours building validation pipelines, scoring systems, and verification logic. Then your AI agent writes a "quick version" that bypasses all of it. Sound familiar?
🔥 Server Crash: Bad config edit → service crash loop → server down all night
🔑 Token Leak: Notion token hardcoded in code, nearly pushed to public GitHub
🔄 Code Rewrite: Agent rewrote validated scoring logic instead of importing it, sent unverified predictions
🚀 Deployment Gap: Built new features but forgot to wire them into production, users got incomplete output
This isn't a prompting problem — it's an enforcement problem. More markdown rules won't fix it. You need mechanical enforcement that actually works.
| Level | Method | Reliability |
|---|---|---|
| 1 | Code hooks (pre-commit, creation guards) | 100% |
| 2 | Architectural constraints (import registries) | 95% |
| 3 | Self-verification loops | 80% |
| 4 | Prompt rules (AGENTS.md) | 60-70% |
| 5 | Markdown documentation | 40-50% |
This toolkit focuses on levels 1-2: the ones that actually work.
| Tool | Purpose |
|---|---|
| `scripts/install.sh` | One-command project setup |
| `scripts/pre-create-check.sh` | Lists existing modules before you create new files |
| `scripts/post-create-validate.sh` | Detects duplicate functions and missing imports |
| `scripts/check-secrets.sh` | Scans for hardcoded tokens/keys/passwords |
| `assets/pre-commit-hook` | Git hook that blocks bypass patterns + secrets |
| `assets/registry-template.py` | Template `__init__.py` for import enforcement |
| `references/agents-md-template.md` | Battle-tested AGENTS.md template |
| `scripts/audit-skills.sh` | ⭐ Comprehensive security audit that scans all skills for gaps |
| `references/enforcement-research.md` | Full research on why code > prompts |
For Claude Code:

```bash
git clone https://github.com/jzOcb/agent-guardrails ~/.claude/skills/agent-guardrails
cd your-project && bash ~/.claude/skills/agent-guardrails/scripts/install.sh .
```

For Clawdbot:

```bash
clawdhub install agent-guardrails
```

Manual:

```bash
bash /path/to/agent-guardrails/scripts/install.sh /path/to/your/project
```

This will:
- ✅ Install git pre-commit hook (blocks bypass patterns + hardcoded secrets)
- ✅ Create `__init__.py` registry template
- ✅ Copy check scripts to your project
- ✅ Add enforcement rules to your AGENTS.md
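If you want to see what that means concretely, a rough manual equivalent is sketched below. This is an illustration only: the destination paths (for example `your_package/__init__.py`) and the append step are assumptions, and `install.sh` remains the supported way to set things up.

```bash
# Illustrative manual equivalent of install.sh (paths and destinations are assumed)
cd /path/to/your/project

# 1. Git pre-commit hook that blocks bypass patterns and hardcoded secrets
cp /path/to/agent-guardrails/assets/pre-commit-hook .git/hooks/pre-commit
chmod +x .git/hooks/pre-commit

# 2. Registry template for import enforcement
cp /path/to/agent-guardrails/assets/registry-template.py your_package/__init__.py

# 3. Check scripts, so they can be run locally and from the hook
mkdir -p scripts
cp /path/to/agent-guardrails/scripts/{pre-create-check.sh,post-create-validate.sh,check-secrets.sh} scripts/

# 4. Enforcement rules for the agent's instruction file
cat /path/to/agent-guardrails/references/agents-md-template.md >> AGENTS.md
```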
```bash
bash scripts/pre-create-check.sh /path/to/project
```

Shows existing modules and functions. If what you need already exists, import it instead of rewriting it.
```bash
bash scripts/post-create-validate.sh /path/to/new_file.py
```

Catches duplicate functions, missing imports, and bypass patterns like "simplified version" or "temporary".
```bash
bash scripts/check-secrets.sh /path/to/project
```

The installed git pre-commit hook automatically blocks commits containing the following (a minimal hook sketch follows the list):

- Bypass patterns: "simplified version", "quick version", "temporary", "TODO: integrate"
- Hardcoded secrets: API keys, tokens, passwords in source code
Before writing new code, the script shows you:
- All existing Python modules in the project
- All public functions (`def` declarations)
- The project's `__init__.py` registry (if it exists)
- SKILL.md contents (if it exists)
This makes it structurally difficult to "not notice" existing code.
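The core of such a check can be a few lines of shell. The sketch below is an approximation, not the shipped `pre-create-check.sh`:

```bash
#!/usr/bin/env bash
# Illustrative sketch of a pre-create check: surface existing code before a new file is written.
project="${1:-.}"

echo "== Existing Python modules =="
find "$project" -name '*.py' -not -path '*/.git/*' -not -path '*/.venv/*'

echo
echo "== Public functions (top-level def declarations) =="
grep -rn --include='*.py' '^def ' "$project"

echo
echo "== __init__.py registry and SKILL.md, if present =="
find "$project" -maxdepth 2 \( -name '__init__.py' -o -name 'SKILL.md' \) \
  -exec sh -c 'echo "--- $1"; cat "$1"' _ {} \;
```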
After writing code, the script checks:
- Are there duplicate function names across files?
- Does the new file import from established modules?
- Does it contain bypass patterns?
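For example, the duplicate-name check can be approximated with `grep` and `awk`. The sketch below is illustrative only and not the shipped `post-create-validate.sh`:

```bash
#!/usr/bin/env bash
# Illustrative sketch of post-create validation for a newly written Python file.
new_file="$1"
project="${2:-.}"

echo "== Function names defined in more than one file =="
grep -rH --include='*.py' '^def ' "$project" \
  | sed -E 's/^([^:]+):def +([A-Za-z0-9_]+).*/\2 \1/' \
  | awk '{ files[$1] = files[$1] " " $2; count[$1]++ }
         END { for (name in count) if (count[name] > 1) print name ":" files[name] }'

echo "== Bypass patterns in $new_file =="
grep -nEi 'simplified version|quick version|temporary|TODO: integrate' "$new_file" || echo "none found"
```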
Each project gets an `__init__.py` that explicitly lists validated functions:

```python
# This is the ONLY approved interface for this project
from .core import validate_data, score_item, generate_report

# New scripts MUST import from here, not reimplement
```

Born from a real incident (2026-02-02): We built a complete decision engine for prediction market analysis, with a scoring system, rules parser, news verification, and data source validation. Then the AI agent created a "quick scan" script that bypassed ALL of it, sending unverified recommendations. Hours of careful work, completely ignored.
The fix wasn't writing more rules. It was writing code that mechanically prevents the bypass.
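As a hedged illustration of what "mechanically prevents" can look like, a small gate can refuse to run any analysis script that does not import the approved interface. The function names below come from the registry example above; the wrapper itself is hypothetical, not part of this toolkit.

```bash
#!/usr/bin/env bash
# Hypothetical gate: only run scripts that use the approved registry interface.
script="$1"

if ! grep -qE '^from [A-Za-z0-9_.]+ import .*(validate_data|score_item|generate_report)' "$script"; then
  echo "✗ $script does not import from the approved registry. Refusing to run it." >&2
  exit 1
fi

python "$script"
```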
Based on research from:
- Anthropic's Claude Code Best Practices — "Unlike CLAUDE.md instructions which are advisory, hooks are deterministic"
- Cursor's Scaling Agents — "Opus 4.5 tends to stop earlier and take shortcuts"
- Guardrails AI Framework
- NVIDIA NeMo Guardrails
Full research notes in references/enforcement-research.md.
This is a Clawdbot skill. Install via ClawdHub (coming soon):
```bash
clawdhub install agent-guardrails
```

Or clone directly:

```bash
git clone https://github.com/jzOcb/agent-guardrails.git
```

Full Chinese documentation: see `references/SKILL_CN.md`.
| Tool | What It Prevents |
|---|---|
| agent-guardrails | AI rewrites validated code, leaks secrets, bypasses standards |
| config-guard | AI writes malformed config, crashes gateway |
| upgrade-guard | Version upgrades break dependencies, no rollback |
| token-guard | Runaway token costs, budget overruns |
| process-guardian | Background processes die silently, no auto-recovery |
📖 Read the full story: *I audited my own AI agent system and found it full of holes*
MIT — Use it, share it, make your agents behave.
| Guard | Purpose | Protects Against |
|---|---|---|
| agent-guardrails | Pre-commit hooks + secret detection | Code leaks, unsafe commits |
| config-guard | Config validation + auto-rollback | Gateway crashes from bad config |
| upgrade-guard | Safe upgrades + watchdog | Update failures, cascading breaks |
| token-guard | Usage monitoring + cost alerts | Budget overruns, runaway costs |
📚 Full writeup: 4-Layer Defense System for AI Agents