Agent Guardrails 🛡️

🇨🇳 中文文档 (Chinese documentation)

📖 Featured in: I audited my own AI agent system and found it full of holes — the security audit that spawned this 5-tool security suite. ⭐ audit-skills.sh is the comprehensive audit script at the heart of the article.

Your AI agent secretly bypasses your rules. This skill enforces them with code.

Works with: Claude Code | Clawdbot | Cursor | Any AI coding agent

Rules in markdown are suggestions. Code hooks are laws.

🚨 Stop production incidents before they happen — Born from real crashes, token leaks, and silent bypasses

The Problem

You spend hours building validation pipelines, scoring systems, and verification logic. Then your AI agent writes a "quick version" that bypasses all of it. Sound familiar?

Real Production Incidents (February 2026)

🔥 Server Crash: Bad config edit → service crash loop → server down all night
🔑 Token Leak: Notion token hardcoded in code, nearly pushed to public GitHub
🔄 Code Rewrite: Agent rewrote validated scoring logic instead of importing it, sent unverified predictions
🚀 Deployment Gap: Built new features but forgot to wire them into production, users got incomplete output

This isn't a prompting problem — it's an enforcement problem. More markdown rules won't fix it. You need mechanical enforcement that actually works.

Enforcement Hierarchy

Level | Method | Reliability
1 | Code hooks (pre-commit, creation guards) | 100%
2 | Architectural constraints (import registries) | 95%
3 | Self-verification loops | 80%
4 | Prompt rules (AGENTS.md) | 60-70%
5 | Markdown documentation | 40-50% ⚠️

This toolkit focuses on levels 1-2: the ones that actually work.

What's Included

Tool | Purpose
scripts/install.sh | One-command project setup
scripts/pre-create-check.sh | Lists existing modules before you create new files
scripts/post-create-validate.sh | Detects duplicate functions and missing imports
scripts/check-secrets.sh | Scans for hardcoded tokens/keys/passwords
assets/pre-commit-hook | Git hook that blocks bypass patterns + secrets
assets/registry-template.py | Template __init__.py for import enforcement
references/agents-md-template.md | Battle-tested AGENTS.md template
scripts/audit-skills.sh ⭐ | Comprehensive security audit; scans all skills for gaps
references/enforcement-research.md | Full research on why code > prompts

Quick Start

For Claude Code:

git clone https://github.com/jzOcb/agent-guardrails ~/.claude/skills/agent-guardrails
cd your-project && bash ~/.claude/skills/agent-guardrails/scripts/install.sh .

For Clawdbot:

clawdhub install agent-guardrails

Manual:

bash /path/to/agent-guardrails/scripts/install.sh /path/to/your/project

📖 Claude Code detailed guide

This will:

  • ✅ Install git pre-commit hook (blocks bypass patterns + hardcoded secrets)
  • ✅ Create __init__.py registry template
  • ✅ Copy check scripts to your project
  • ✅ Add enforcement rules to your AGENTS.md

Usage

Before creating any new file:

bash scripts/pre-create-check.sh /path/to/project

Shows existing modules and public functions. If the functionality already exists, import it instead of reimplementing it.

After creating/editing a file:

bash scripts/post-create-validate.sh /path/to/new_file.py

Catches duplicate functions, missing imports, and bypass patterns like "simplified version" or "temporary".

Secret scanning:

bash scripts/check-secrets.sh /path/to/project
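
At its core the scan is pattern matching over tracked files. A minimal stand-in, for illustration only (the name patterns and length threshold here are assumptions, not the shipped script's logic):

# Illustrative secret grep; scripts/check-secrets.sh is the maintained tool
git -C /path/to/project grep -niE \
  "(api[_-]?key|token|password|secret)[[:space:]]*[:=][[:space:]]*['\"][A-Za-z0-9_-]{12,}" \
  || echo "No obvious secrets found."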

How It Works

Pre-commit Hook

Automatically blocks commits containing:

  • Bypass patterns: "simplified version", "quick version", "temporary", "TODO: integrate"
  • Hardcoded secrets: API keys, tokens, passwords in source code
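
A minimal sketch of such a gate, using the patterns listed above (illustrative only; assets/pre-commit-hook is the authoritative implementation):

#!/usr/bin/env bash
# .git/hooks/pre-commit (sketch): reject staged additions that look unsafe
set -euo pipefail

added=$(git diff --cached | grep '^+' || true)

# 1. Bypass phrases in added lines
if echo "$added" | grep -qiE 'simplified version|quick version|temporary|TODO: integrate'; then
  echo "BLOCKED: bypass pattern in staged changes" >&2
  exit 1
fi

# 2. Crude secret heuristic: long literals assigned to KEY/TOKEN-like names
if echo "$added" | grep -qE "(KEY|TOKEN|PASSWORD|SECRET)[A-Za-z_]*[[:space:]]*=[[:space:]]*['\"][A-Za-z0-9_-]{16,}"; then
  echo "BLOCKED: possible hardcoded secret in staged changes" >&2
  exit 1
fi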

Pre-create Check

Before writing new code, the script shows you:

  • All existing Python modules in the project
  • All public functions (def declarations)
  • The project's __init__.py registry (if it exists)
  • SKILL.md contents (if it exists)

This makes it structurally difficult to "not notice" existing code.
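
Roughly the kind of inventory pass involved, as a sketch (the exact output and paths are assumptions; scripts/pre-create-check.sh is the real tool):

#!/usr/bin/env bash
# Sketch: surface what already exists before new code is written
project="${1:-.}"

echo "== Existing Python modules =="
find "$project" -name '*.py' -not -path '*/.git/*' | sort

echo "== Public functions =="
grep -rnE '^[[:space:]]*def [a-z]' --include='*.py' "$project" || true

echo "== Import registry =="
cat "$project/__init__.py" 2>/dev/null || echo "(none)"

echo "== SKILL.md =="
cat "$project/SKILL.md" 2>/dev/null || echo "(none)"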

Post-create Validation

After writing code, the script checks:

  • Are there duplicate function names across files?
  • Does the new file import from established modules?
  • Does it contain bypass patterns?
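
Those three checks can be approximated in a few lines of shell; a sketch, not the shipped script (the warning wording and heuristics are assumptions):

#!/usr/bin/env bash
# Sketch: validate a freshly written file against its project
new_file="$1"
project="$(dirname "$new_file")"

# 1. Function names defined in more than one file
grep -rhoE '^def [a-z_]+' --include='*.py' "$project" | sort | uniq -d \
  | sed 's/^/WARNING: duplicate definition: /'

# 2. Any imports at all (a crude proxy for importing established modules)
grep -qE '^(from|import) ' "$new_file" || echo "WARNING: $new_file imports nothing"

# 3. Bypass patterns in the new file
grep -niE 'simplified version|quick version|temporary|TODO: integrate' "$new_file" \
  | sed 's/^/WARNING: bypass pattern at line /'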

Import Registry

Each project gets an __init__.py that explicitly lists validated functions:

# This is the ONLY approved interface for this project
from .core import validate_data, score_item, generate_report

# New scripts MUST import from here, not reimplement
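
The registry becomes mechanically enforceable once re-definitions of registered names are rejected outside the canonical module. A sketch assuming the registry above (the core.py location comes from the snippet; the loop itself is illustrative):

#!/usr/bin/env bash
# Sketch: registry functions may only be defined in core.py
for fn in validate_data score_item generate_report; do
  hits=$(grep -rl "def $fn" --include='*.py' . | grep -v '/core\.py$' || true)
  if [ -n "$hits" ]; then
    echo "BLOCKED: $fn reimplemented outside core.py: $hits" >&2
    exit 1
  fi
done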

Origin Story

Born from a real incident (2026-02-02): We built a complete decision engine for prediction market analysis — scoring system, rules parser, news verification, data source validation. Then the AI agent created a "quick scan" script that bypassed ALL of it, sending unverified recommendations. Hours of careful work, completely ignored.

The fix wasn't writing more rules. It was writing code that mechanically prevents the bypass.

Research

Based on external research; full notes and sources are in references/enforcement-research.md.

For Clawdbot Users

This is a Clawdbot skill. Install via ClawdHub (coming soon):

clawdhub install agent-guardrails

Or clone directly:

git clone https://github.com/jzOcb/agent-guardrails.git

Chinese Documentation (中文文档)

See references/SKILL_CN.md for the full Chinese documentation.

🛡️ Part of the AI Agent Security Suite

Tool | What It Prevents
agent-guardrails | AI rewrites validated code, leaks secrets, bypasses standards
config-guard | AI writes malformed config, crashes gateway
upgrade-guard | Version upgrades break dependencies, no rollback
token-guard | Runaway token costs, budget overruns
process-guardian | Background processes die silently, no auto-recovery

📖 Read the full story: I audited my own AI agent system and found it full of holes

License

MIT — Use it, share it, make your agents behave.

🛡️ Part of the OpenClaw Security Suite

Guard | Purpose | Protects Against
agent-guardrails | Pre-commit hooks + secret detection | Code leaks, unsafe commits
config-guard | Config validation + auto-rollback | Gateway crashes from bad config
upgrade-guard | Safe upgrades + watchdog | Update failures, cascading breaks
token-guard | Usage monitoring + cost alerts | Budget overruns, runaway costs

📚 Full writeup: 4-Layer Defense System for AI Agents
