Commit a4fbda2

Harden guardian, fix critical bugs, clean repo for open source
Guardian hardening:
- Fail closed if jq is missing (was silently passing all commands)
- Fail closed on malformed JSON (was falling through)
- Block obfuscation: encoded commands piped to interpreters
- Block subshell execution and eval
- Block interpreter evasion: python/node/ruby executing system commands
- Fix DELETE-without-WHERE rule (was piping grep wrong, never fired)
- Allow git push --force-with-lease (was incorrectly blocked)
- Fix custom rules delimiter conflict (pipe char conflicts with regex)
- Test harness now uses jq for proper JSON escaping
- 55 tests total (up from 45), all passing

Critical fixes:
- Uninstaller now removes auto-approve from permissions (was leaving system with all commands auto-approved and no guardian protection)
- Installer uses trap for cleanup on failure (was leaking temp dirs)

Repo cleanup:
- Removed personal MCP entries from trusted-mcps.yaml
- Removed personal MCPs from agent allowedTools
- Removed project-specific section from agent definition
- Evasion/obfuscation category added to README guardian table
1 parent 5a7e415 commit a4fbda2
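
The DELETE-without-WHERE fix listed in the commit message comes down to how `grep -q` behaves inside a pipeline. The following is a minimal sketch of the two shapes; the sample command string, the variable names, and the `[[:space:]]` spelling of the pattern are assumptions made for illustration and are not taken from the repository's test suite.

```shell
#!/bin/sh
# Sketch only: reproduces the piping mistake described in the commit message.
# The sample command and variable names are illustrative, not from the repo.

cmd='delete from users;'
pattern='delete[[:space:]]+from[[:space:]]+[a-z_]+[[:space:]]*;'

# Old shape: grep -q prints nothing, so the second grep reads empty input,
# selects no lines, and exits non-zero. The condition is therefore never true.
if echo "$cmd" | grep -qE "$pattern" | grep -qvE 'where'; then
  old_result="blocked"
else
  old_result="allowed"
fi

# Fixed shape: two independent tests joined with && and !.
if echo "$cmd" | grep -qE "$pattern" && ! echo "$cmd" | grep -qE 'where'; then
  new_result="blocked"
else
  new_result="allowed"
fi

echo "old=$old_result new=$new_result"
```

Under these assumptions the script prints `old=allowed new=blocked`, which matches the "never fired" description of the old rule.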

7 files changed: +165 -119 lines changed

README.md

Lines changed: 3 additions & 2 deletions
@@ -236,7 +236,7 @@ If something breaks midway, open the log to see exactly what happened and where.
 keychain.sh # macOS Keychain wrapper (get/set/delete/list/has)
 guardian.sh # PreToolUse safety hook (hard-blocks dangerous commands)
 setup-clis.sh # CLI installer (gh, vercel, supabase, wrangler, etc.)
-test-guardian.sh # 45-test suite for the guardian
+test-guardian.sh # 55-test suite for the guardian
 config/
 decision-framework.md # When to act vs. ask (5 levels)
 guardian-custom-rules.txt # Append-only blocklist (expands with new services)
@@ -264,7 +264,7 @@ your-project/.autopilot/

 Autopilot's safety is layered. The Guardian provides **hard enforcement** that the AI cannot bypass. The Decision Framework provides **intelligent classification** that the AI follows.

-### What the Guardian Blocks (45 tested patterns)
+### What the Guardian Blocks (55 tested patterns)

 | Category | Examples |
 |----------|---------|
@@ -276,6 +276,7 @@ Autopilot's safety is layered. The Guardian provides **hard enforcement** that t
 | **Account changes** | `gh repo edit --visibility public`, `gh repo delete` |
 | **Financial** | Creating Stripe charges, sending emails |
 | **MCP process killing** | `kill`/`pkill`/`killall` targeting Playwright or MCP servers |
+| **Obfuscation/evasion** | `base64 \| bash`, `bash -c`, `eval`, `python -c os.system()`, `node -e exec()` |

 ### What's Auto-Approved (zero prompt)

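
The Obfuscation/evasion row added to the README table corresponds to regex checks in bin/guardian.sh. A rough sketch of how three of those checks behave is given here. `is_evasion` is a hypothetical helper name, the `tr` lowercasing is an assumption about how `CMD_LOWER` is produced, and `\s` is written as `[[:space:]]` for portability; the sample commands are illustrative.

```shell
#!/bin/sh
# Sketch only: is_evasion is a hypothetical helper, not part of guardian.sh.
# The three patterns follow the checks added in this commit.
is_evasion() {
  # Assumption: lowercase the command before matching, as the script's
  # "Normalize" step suggests.
  cmd_lower=$(printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]')
  # Encoded payload piped into a shell
  printf '%s\n' "$cmd_lower" | grep -qE 'base64.*\|[[:space:]]*(bash|sh|zsh|dash)' && return 0
  # Subshell execution via an interpreter's -c flag
  printf '%s\n' "$cmd_lower" | grep -qE '(^|[[:space:]]|;|&&|\|)(bash|sh|zsh|dash)[[:space:]]+-c[[:space:]]' && return 0
  # eval
  printf '%s\n' "$cmd_lower" | grep -qE '(^|[[:space:]]|;|&&|\|)eval[[:space:]]' && return 0
  return 1
}

for sample in \
  'echo cm0gLXJmIC8= | base64 -d | bash' \
  'bash -c "ls"' \
  'eval "$PAYLOAD"' \
  'git status'
do
  if is_evasion "$sample"; then
    echo "blocked: $sample"
  else
    echo "allowed: $sample"
  fi
done
```

In this sketch the first three samples are reported as blocked and `git status` as allowed. The base64 pattern has no trailing word boundary, so a harmless pipeline such as `base64 notes.txt | shasum` would also match; that errs on the side of blocking.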
agent/autopilot.md

Lines changed: 0 additions & 16 deletions
@@ -18,12 +18,8 @@ allowedTools:
 - "mcp__playwright__*"
 - "mcp__github__*"
 - "mcp__filesystem__*"
-- "mcp__context7__*"
-- "mcp__jcodemunch__*"
 - "mcp__memory__*"
 - "mcp__sequential-thinking__*"
-- "mcp__shadcn-ui__*"
-- "mcp__magicui__*"
 ---

 # AUTOPILOT — Fully Autonomous Development Agent
@@ -496,15 +492,3 @@ This entire sequence should happen inline. The only pause points are:
 - Non-whitelisted MCP approval (asked once, then whitelisted forever)
 - 2FA codes (unavoidable)

----
-
-## Project-Specific Patterns
-
-### RenderKit (Dynamic Image Generation API)
-- **Deployment**: Vercel (Next.js/Node.js)
-- **Database**: Supabase (PostgreSQL)
-- **Payments**: Razorpay (Indian market)
-- **Image Storage**: Cloudflare R2 (S3-compatible)
-- **Source Control**: GitHub (MCP already configured)
-
-When working on RenderKit, prioritize these services and their integrations.
bin/guardian.sh

Lines changed: 78 additions & 34 deletions
@@ -15,21 +15,36 @@

 set -uo pipefail

+# =============================================================================
+# FAIL CLOSED: If jq is not available, block everything
+# =============================================================================
+
+if ! command -v jq &>/dev/null; then
+  echo "GUARDIAN BLOCKED [SAFETY]: jq is not installed. Guardian cannot parse commands safely — blocking all Bash execution." >&2
+  echo "Install jq: brew install jq (macOS) / sudo apt install jq (Linux)" >&2
+  exit 2
+fi
+
 # Read tool call from stdin
 INPUT=$(cat)

 # Only inspect Bash commands — allow everything else through
-TOOL_NAME=$(echo "$INPUT" | jq -r '.tool_name // empty')
+TOOL_NAME=$(echo "$INPUT" | jq -r '.tool_name // empty' 2>/dev/null)
+if [ -z "$TOOL_NAME" ]; then
+  # Failed to parse — fail closed
+  echo "GUARDIAN BLOCKED [SAFETY]: Could not parse tool call JSON" >&2
+  exit 2
+fi
+
 if [ "$TOOL_NAME" != "Bash" ]; then
   exit 0
 fi

-# Extract the command (try jq first, fall back to raw input for robustness)
+# Extract the command
 COMMAND=$(echo "$INPUT" | jq -r '.tool_input.command // empty' 2>/dev/null)
 if [ -z "$COMMAND" ]; then
-  # If jq fails (malformed JSON), use the raw input as the command
-  # This ensures we still catch dangerous patterns even with bad JSON
-  COMMAND="$INPUT"
+  echo "GUARDIAN BLOCKED [SAFETY]: Could not extract command from tool call" >&2
+  exit 2
 fi

 # Normalize: lowercase for case-insensitive matching
@@ -45,6 +60,39 @@ block() {
   exit 2
 }

+# =============================================================================
+# CATEGORY 0: OBFUSCATION / INTERPRETER EVASION
+# =============================================================================
+# These block attempts to bypass the guardian by encoding commands or using
+# alternative interpreters. Must come first — before pattern-specific checks.
+
+# Base64 piped to bash/sh (encoding bypass)
+if echo "$CMD_LOWER" | grep -qE 'base64.*\|\s*(bash|sh|zsh|dash)'; then
+  block "EVASION" "Base64-encoded command piped to shell interpreter"
+fi
+if echo "$CMD_LOWER" | grep -qE 'base64\s+-d.*\|\s*(bash|sh)'; then
+  block "EVASION" "Base64-decoded content piped to shell"
+fi
+
+# Subshell execution: bash -c, sh -c, eval
+if echo "$CMD_LOWER" | grep -qE '(^|\s|;|&&|\|)(bash|sh|zsh|dash)\s+-c\s'; then
+  block "EVASION" "Subshell execution via interpreter -c flag. Run the command directly instead."
+fi
+if echo "$CMD_LOWER" | grep -qE '(^|\s|;|&&|\|)eval\s'; then
+  block "EVASION" "eval can execute arbitrary code. Run the command directly instead."
+fi
+
+# Python/Node/Ruby/Perl os.system or exec (interpreter bypass)
+if echo "$CMD_LOWER" | grep -qE 'python[23]?\s+-c\s.*\b(os\.|subprocess|system|exec|popen)'; then
+  block "EVASION" "Python executing system commands — bypasses guardian"
+fi
+if echo "$CMD_LOWER" | grep -qE 'node\s+-e\s.*\b(exec|spawn|child_process)'; then
+  block "EVASION" "Node.js executing system commands — bypasses guardian"
+fi
+if echo "$CMD_LOWER" | grep -qE '(ruby|perl)\s+-e\s.*\b(system|exec|`)'; then
+  block "EVASION" "Interpreter executing system commands — bypasses guardian"
+fi
+
 # =============================================================================
 # CATEGORY 1: SYSTEM DESTRUCTION
 # =============================================================================
@@ -56,35 +104,27 @@ fi
 if echo "$COMMAND" | grep -qE 'rm\s+(-[a-zA-Z]*r[a-zA-Z]*\s+|--recursive\s+)*(-[a-zA-Z]*f[a-zA-Z]*|--force)\s+(/|~|\$HOME|/Users)'; then
   block "SYSTEM" "Forced recursive deletion of system/home directory"
 fi
-# Shorthand: rm -rf / or rm -rf ~
 if echo "$COMMAND" | grep -qE 'rm\s+-rf\s+(/|~|\$HOME|/Users)\b'; then
   block "SYSTEM" "Catastrophic deletion: rm -rf on root or home"
 fi
-# rm -rf . (delete entire current directory)
 if echo "$COMMAND" | grep -qE 'rm\s+-rf\s+\.$'; then
   block "SYSTEM" "Deleting entire current directory"
 fi
-# sudo rm -rf anything
 if echo "$COMMAND" | grep -qE 'sudo\s+rm\s+-rf'; then
   block "SYSTEM" "Privileged recursive forced deletion"
 fi
-# Disk/filesystem destruction
 if echo "$CMD_LOWER" | grep -qE '(mkfs|fdisk|diskutil\s+erase)'; then
   block "SYSTEM" "Disk/filesystem destructive operation"
 fi
-# Raw disk write
 if echo "$COMMAND" | grep -qE 'dd\s+if=.*of=/dev/'; then
   block "SYSTEM" "Raw disk write operation"
 fi
-# Fork bomb
 if echo "$COMMAND" | grep -qF ':(){ :|:&};'; then
   block "SYSTEM" "Fork bomb detected"
 fi
-# System shutdown/reboot
 if echo "$CMD_LOWER" | grep -qE '^\s*(sudo\s+)?(shutdown|reboot|halt|poweroff)\b'; then
   block "SYSTEM" "System shutdown/reboot command"
 fi
-# World-writable root
 if echo "$COMMAND" | grep -qE 'chmod\s+(-R\s+)?777\s+/'; then
   block "SYSTEM" "Setting world-writable permissions on root"
 fi
@@ -93,15 +133,12 @@ fi
 # CATEGORY 2: CREDENTIAL EXFILTRATION
 # =============================================================================

-# Printing credentials to stdout
 if echo "$COMMAND" | grep -qE '(echo|printf|cat)\s.*keychain\.sh\s+get'; then
   block "CREDENTIALS" "Credential value would be printed to stdout. Use subshell expansion instead: --token \"\$(keychain.sh get ...)\""
 fi
-# Piping/sending credentials to external URLs
 if echo "$COMMAND" | grep -qE '(curl|wget|http).*\$\(.*keychain\.sh\s+get'; then
   block "CREDENTIALS" "Credential value being sent to external URL. Use env var + CLI flag instead."
 fi
-# Writing credentials to files (detecting keychain get output redirected to file)
 if echo "$COMMAND" | grep -qE 'keychain\.sh\s+get.*[>|]\s*(.*\.env|.*\.json|.*\.yaml|.*\.yml|.*\.toml|.*\.cfg|.*\.conf|.*\.ini)'; then
   block "CREDENTIALS" "Credential value being written to config file. Use keychain at runtime instead."
 fi
@@ -117,31 +154,29 @@ if echo "$CMD_LOWER" | grep -qE 'truncate\s+(table\s+)?[a-z]'; then
   block "DATABASE" "Truncating table (mass data deletion)"
 fi
 # DELETE without WHERE clause (mass deletion)
-if echo "$CMD_LOWER" | grep -qE 'delete\s+from\s+\w+\s*;' | grep -qvE 'where'; then
+if echo "$CMD_LOWER" | grep -qE 'delete\s+from\s+\w+\s*;' && ! echo "$CMD_LOWER" | grep -qE 'where'; then
   block "DATABASE" "DELETE without WHERE clause (mass data deletion)"
 fi

 # =============================================================================
 # CATEGORY 4: GIT / PUBLISHING DESTRUCTION
 # =============================================================================

-# Force push (any branch)
-if echo "$COMMAND" | grep -qE 'git\s+push\s+.*(-f|--force)\b'; then
+# Force push — but allow --force-with-lease (safer alternative)
+if echo "$COMMAND" | grep -qE 'git\s+push\s+.*--force-with-lease'; then
+  : # Allow --force-with-lease through
+elif echo "$COMMAND" | grep -qE 'git\s+push\s+.*(-f|--force)\b'; then
   block "GIT" "Force push can destroy remote history. Use --force-with-lease if needed, or push normally."
 fi
-# Force push shorthand
 if echo "$COMMAND" | grep -qE 'git\s+push\s+-f\b'; then
   block "GIT" "Force push can destroy remote history"
 fi
-# Hard reset (can lose uncommitted work)
 if echo "$COMMAND" | grep -qE 'git\s+reset\s+--hard'; then
   block "GIT" "Hard reset discards all uncommitted changes. Commit or stash first."
 fi
-# Clean -f (delete untracked files)
 if echo "$COMMAND" | grep -qE 'git\s+clean\s+.*-f'; then
   block "GIT" "git clean -f permanently deletes untracked files"
 fi
-# Package publishing
 if echo "$CMD_LOWER" | grep -qE '(npm\s+publish|cargo\s+publish|twine\s+upload|gem\s+push|pip\s+.*upload)'; then
   block "PUBLISHING" "Publishing a package to a public registry"
 fi
@@ -150,15 +185,12 @@ fi
 # CATEGORY 5: PRODUCTION DEPLOYMENTS
 # =============================================================================

-# Vercel production deploy
 if echo "$COMMAND" | grep -qE 'vercel\s+(deploy\s+)?.*--prod'; then
   block "PRODUCTION" "Production deployment to Vercel. Review and run manually: ! vercel deploy --prod"
 fi
-# Generic --production flag
 if echo "$COMMAND" | grep -qE -- '--production( |$|")' && echo "$CMD_LOWER" | grep -qE '(deploy|push|migrate|release)'; then
   block "PRODUCTION" "Production operation detected. Review and confirm."
 fi
-# Terraform destroy
 if echo "$CMD_LOWER" | grep -qE 'terraform\s+destroy'; then
   block "PRODUCTION" "Terraform destroy will delete infrastructure"
 fi
@@ -167,19 +199,15 @@ fi
 # CATEGORY 6: ACCOUNT / VISIBILITY CHANGES
 # =============================================================================

-# Making repo public
 if echo "$COMMAND" | grep -qE 'gh\s+repo\s+edit\s+.*--visibility\s+public'; then
   block "VISIBILITY" "Making repository public — this exposes all code"
 fi
-# Deleting a repository
 if echo "$COMMAND" | grep -qE 'gh\s+repo\s+delete'; then
   block "DESTRUCTIVE" "Deleting a GitHub repository"
 fi
-# Deleting a Vercel project
 if echo "$COMMAND" | grep -qE 'vercel\s+(project\s+)?rm\b'; then
   block "DESTRUCTIVE" "Deleting a Vercel project"
 fi
-# Deleting a Supabase project
 if echo "$CMD_LOWER" | grep -qE 'supabase\s+projects?\s+delete'; then
   block "DESTRUCTIVE" "Deleting a Supabase project"
 fi
@@ -188,24 +216,40 @@ fi
 # CATEGORY 7: FINANCIAL / MESSAGING
 # =============================================================================

-# Stripe charges (via curl)
 if echo "$CMD_LOWER" | grep -qE 'curl.*api\.stripe\.com.*(charges|payment_intents).*-d'; then
   block "FINANCIAL" "Creating a real Stripe charge/payment"
 fi
-# Sending emails via CLI
 if echo "$CMD_LOWER" | grep -qE '(^|[|;&\s])(sendmail|mailx?|mutt)\s'; then
   block "MESSAGING" "Sending email to real recipients"
 fi

 # =============================================================================
 # CUSTOM RULES (autopilot can append, never remove)
+# Delimiter: ::: (three colons) to avoid conflicts with regex | characters
+# Legacy format with | is also supported for backwards compatibility
 # =============================================================================

 CUSTOM_RULES="$HOME/MCPs/autopilot/config/guardian-custom-rules.txt"
 if [ -f "$CUSTOM_RULES" ]; then
-  while IFS='|' read -r category pattern reason; do
+  while IFS= read -r line; do
     # Skip comments and empty lines
-    [[ "$category" =~ ^#.*$ ]] && continue
+    [[ "$line" =~ ^#.*$ ]] && continue
+    [ -z "$line" ] && continue
+
+    # Parse: try ::: delimiter first, fall back to | (legacy)
+    if [[ "$line" == *":::"* ]]; then
+      category="${line%%:::*}"
+      rest="${line#*:::}"
+      pattern="${rest%%:::*}"
+      reason="${rest#*:::}"
+    else
+      # Legacy | delimiter — only split on first and last |
+      category="${line%%|*}"
+      rest="${line#*|}"
+      reason="${rest##*|}"
+      pattern="${rest%|*}"
+    fi
+
     [ -z "$category" ] && continue
     [ -z "$pattern" ] && continue

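
The custom-rules hunk splits each line with shell parameter expansion instead of `IFS='|'`, so a regex containing `|` stays intact. A small sketch of that splitting is given here under stated assumptions: `parse_rule` is a hypothetical helper name, the sample rule lines are invented for illustration, and `case` stands in for `[[ ... ]]` so the snippet also runs under a plain POSIX sh.

```shell
#!/bin/sh
# Sketch only: parse_rule is a hypothetical helper, not a function in guardian.sh.
# It applies the same parameter expansions as the diff above.
parse_rule() {
  line=$1
  case "$line" in
    *:::*)
      category=${line%%:::*}     # text before the first :::
      rest=${line#*:::}          # everything after the first :::
      pattern=${rest%%:::*}      # text before the next :::
      reason=${rest#*:::}        # everything after that :::
      ;;
    *)
      category=${line%%"|"*}     # text before the first |
      rest=${line#*"|"}          # everything after the first |
      reason=${rest##*"|"}       # text after the last |
      pattern=${rest%"|"*}       # everything before the last |
      ;;
  esac
}

# Illustrative rule lines (not from guardian-custom-rules.txt)
parse_rule 'DATABASE:::drop (table|database):::Dropping database objects'
new_fields="$category / $pattern / $reason"

parse_rule 'DATABASE|drop (table|database)|Dropping database objects'
legacy_fields="$category / $pattern / $reason"

echo "$new_fields"
echo "$legacy_fields"
```

Both sample lines yield the same three fields, with the `|` inside the pattern preserved. In the legacy branch this works only while the reason text itself contains no `|`, since the reason is taken as everything after the last one.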