Per-Project Bash Command Allowlist System#80
Conversation
📝 WalkthroughWalkthroughAdds a hierarchical, YAML-driven bash command security system with org- and project-level allowlists/blocklists, pattern matching, context propagation from client, extensive docs/examples, tests, and a Phase 3 design plus scaffolding for mid-session user approval (ApprovalManager, CLI/UI/WebSocket flows, and persistence). Changes
Sequence Diagram(s)sequenceDiagram
participant Agent
participant PreToolHook as PreToolUse Hook
participant BashHook as bash_security_hook
participant OrgLoader as Org Config Loader
participant ProjectLoader as Project Config Loader
participant ApprovalMgr as ApprovalManager
participant UI as UI/CLI
participant FS as File System (YAML)
Agent->>PreToolHook: invoke pre-tool hook (Bash)
PreToolHook->>BashHook: call with context (includes project_dir)
BashHook->>OrgLoader: load org config (~/.autocoder/config.yaml)
OrgLoader-->>BashHook: org allow/block sets
BashHook->>ProjectLoader: load project config (.autocoder/allowed_commands.yaml)
ProjectLoader-->>BashHook: project allow set
BashHook->>BashHook: compute effective allow/block (precedence)
alt command in blocked set
BashHook-->>Agent: return block decision
else command not allowed
BashHook->>ApprovalMgr: request_approval(command)
ApprovalMgr->>UI: emit approval_request (CLI overlay or WebSocket)
UI->>ApprovalMgr: user responds (session / permanent / deny)
ApprovalMgr-->>BashHook: response (allow/block)
alt permanent
BashHook->>FS: persist to project YAML (file lock)
FS-->>BashHook: write success
end
BashHook-->>Agent: allow or block decision
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 7
🤖 Fix all issues with AI agents
In `@security.py`:
- Around line 9-13: The import block in security.py is not sorted into
standard-library and third-party groups; reorder and format the imports so
standard-library imports (os, shlex, pathlib.Path, typing.Optional) come first,
alphabetized, then a blank line, then third-party imports (yaml); ensure the
import lines referencing os, shlex, Path, Optional and yaml are adjusted
accordingly to satisfy the CI import-sorting rule.
- Around line 357-388: The matches_pattern function currently treats a pattern
of "*" as a prefix wildcard that matches everything; update validation to reject
bare wildcards and/or harden matching: in validate_project_command (the config
validation path) explicitly disallow a pattern equal to "*" (raise/return
validation error), and in matches_pattern add an early check that if pattern ==
"*" return False (or treat as invalid) so a lone asterisk cannot match all
commands; reference the functions matches_pattern and validate_project_command
when making these changes.
In `@test_security_integration.py`:
- Around line 346-348: The f-string prefix on the string passed to
(autocoder_dir / "allowed_commands.yaml").write_text is unnecessary because
there are no placeholders; change the call to use a regular string by removing
the leading f so it becomes "version: 1\ncommands:\n" + "\n".join(commands),
keeping the same variables (autocoder_dir, commands) and preserving the
write_text invocation.
- Around line 24-25: Remove the unused import by deleting the create_client
import from test_security_integration.py; locate the top-level import line that
reads "from client import create_client" and remove that symbol so only the used
import (bash_security_hook) remains to resolve the F401 unused-import lint
error.
In `@test_security.py`:
- Around line 266-268: The test constructs a YAML string using
config_path.write_text(f"version: 1\ncommands:\n" + "\n".join(commands)) but the
f-string has no placeholders (causing F541); change the f-string to a regular
string literal by removing the leading f so config_path.write_text("version:
1\ncommands:\n" + "\n".join(commands)) to eliminate the unnecessary f-prefix
while keeping the concatenation with "\n".join(commands) and leaving variables
commands and config_path intact.
- Around line 12-30: Remove the unused imports get_project_allowed_commands,
is_command_allowed, and get_org_config_path from the security import list and
reorder/sort the import blocks per I001 (standard library imports first, then
local/package imports) so the import section is alphabetized and grouped
correctly; update the import statement that currently lists bash_security_hook,
extract_commands, validate_chmod_command, validate_init_script,
load_project_commands, load_org_config, validate_project_command,
matches_pattern, BLOCKED_COMMANDS, DANGEROUS_COMMANDS to reflect the removal and
proper sorted order.
- Line 331: Remove the unused local variable blocked_cmds in test_security.py:
the line assigning blocked_cmds = list(BLOCKED_COMMANDS) +
list(DANGEROUS_COMMANDS) should be deleted (or replaced by using the combined
list where needed). Locate the assignment of blocked_cmds and either remove that
line or refactor the code to use the combined list directly in the assertion or
helper that needs BLOCKED_COMMANDS and DANGEROUS_COMMANDS.
🧹 Nitpick comments (7)
examples/README.md (1)
23-27: Add language specifier to fenced code blocks.Static analysis flagged code blocks without language specifiers. Use
textorplaintextfor ASCII diagrams and console output.📝 Proposed fix
-``` +```text my-project/ .autocoder/ allowed_commands.yaml ← Automatically created from template -``` +```Apply similar changes to the hierarchy diagram (line 122) and console output blocks (lines 504, 509).
security.py (3)
401-450: Consider logging validation failures for debugging.When config validation fails, the function silently returns
None. Users may not understand why their configuration is being ignored. Consider logging a warning when a config file exists but fails validation.🔧 Proposed improvement
+import logging + +logger = logging.getLogger(__name__) + def load_org_config() -> Optional[dict]: # ... existing code ... if not isinstance(config, dict): + logger.warning("Org config ignored: root must be a dict") return None if "version" not in config: + logger.warning("Org config ignored: missing 'version' field") return None
486-488: Silent rejection of config exceeding 50 commands.When the 50-command limit is exceeded, the entire config is silently ignored. Users would benefit from knowing why their config isn't taking effect.
684-703: Distinguish between hardcoded and org blocklist in error messages.The error message on line 689 says "blocked at organization level" but the command could also be in the hardcoded
BLOCKED_COMMANDS. Consider making the message more accurate.📝 Proposed improvement
# Check blocklist first (highest priority) if cmd in blocked_commands: + # Determine source of block for clearer messaging + if cmd in BLOCKED_COMMANDS or cmd in DANGEROUS_COMMANDS: + reason = f"Command '{cmd}' is permanently blocked and cannot be approved." + else: + reason = f"Command '{cmd}' is blocked at organization level and cannot be approved." return { "decision": "block", - "reason": f"Command '{cmd}' is blocked at organization level and cannot be approved.", + "reason": reason, }PHASE3_SPEC.md (2)
64-64: Add language identifiers to fenced code blocks.Several fenced code blocks lack language specifiers (e.g.,
textorplaintextfor ASCII diagrams/UX mockups). This helps with syntax highlighting and satisfies markdown lint rules.Example fix for lines 64, 88, 111, 123, 153
-``` +```text Agent is working...Also applies to: 88-88, 111-111, 123-123, 153-153
1418-1418: Add language identifier for the audit log example.-``` +```text [2026-01-22 10:30:45] User approved 'swift' (session-only)test_security.py (1)
405-458: Consider using a context manager or fixture for HOME override.The
test_org_config_loadingand similar tests directly modifyos.environ["HOME"]. While the cleanup is handled, consider extracting a helper context manager to reduce code duplication and ensure cleanup even on exceptions.Example helper context manager
from contextlib import contextmanager `@contextmanager` def override_home(new_home: str): """Temporarily override HOME environment variable.""" original_home = os.environ.get("HOME") os.environ["HOME"] = new_home try: yield finally: if original_home: os.environ["HOME"] = original_home else: del os.environ["HOME"]Usage:
with override_home(tmphome): # test code here
Implement hierarchical command security with project and org-level configs:
WHAT'S NEW:
- Project-level YAML config (.autocoder/allowed_commands.yaml)
- Organization-level config (~/.autocoder/config.yaml)
- Pattern matching (exact, wildcards, local scripts)
- Hardcoded blocklist (sudo, dd, shutdown - never allowed)
- Org blocklist (terraform, kubectl - configurable)
- Helpful error messages with config hints
- Comprehensive documentation and examples
ARCHITECTURE:
- Hierarchical resolution: Hardcoded → Org Block → Org Allow → Global → Project
- YAML validation with 50 command limit per project
- Pattern matching: exact ("swift"), wildcards ("swift*"), scripts ("./build.sh")
- Secure by default: all examples commented out
TESTING:
- 136 unit tests (pattern matching, YAML, hierarchy, validation)
- 9 integration tests (real security hook flows)
- All tests passing, 100% backward compatible
DOCUMENTATION:
- examples/README.md - comprehensive guide with use cases
- examples/project_allowed_commands.yaml - template (all commented)
- examples/org_config.yaml - org config template (all commented)
- PHASE3_SPEC.md - mid-session approval spec (future enhancement)
- Updated CLAUDE.md with security model documentation
USE CASES:
- iOS projects: Add Swift toolchain (xcodebuild, swift*, etc.)
- Rust projects: Add cargo, rustc, clippy
- Enterprise: Block aws, kubectl, terraform org-wide
- Custom scripts: Allow ./scripts/build.sh
PHASES:
✅ Phase 1: Project YAML + blocklist (implemented)
✅ Phase 2: Org config + hierarchy (implemented)
📋 Phase 3: Mid-session approval (spec ready, not implemented)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add validation to reject bare wildcards for security: - matches_pattern(): return False if pattern == '*' - validate_project_command(): reject name == '*' with clear error - Added 4 new tests for bare wildcard rejection This prevents a config with from matching every command, which would be a major security risk. Tests: 140 unit tests passing (added 4 bare wildcard tests)
1786c9d to
d1dac13
Compare
There was a problem hiding this comment.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
CLAUDE.md (1)
64-64: Add languages to fenced code blocks (MD040).
A few fences are missing language tags; addingbash,yaml, ortextwill satisfy markdownlint.🛠️ Example fix pattern
-``` +```text ... -``` +```Also applies to: 88-88, 111-111, 123-123, 153-153, 1418-1418
security.py (1)
655-715: Clarify block reason and approval guidance.
blocked_commandsincludes hardcoded + dangerous entries, so “blocked at organization level” is misleading. Also, the error text suggests mid‑session approval even though it’s not implemented. Consider neutral wording or flagging it as “planned.”🔧 Suggested fix
- "reason": f"Command '{cmd}' is blocked at organization level and cannot be approved.", + "reason": f"Command '{cmd}' is blocked by policy and cannot be approved.", } @@ - error_msg += " 2. Request mid-session approval (the agent can ask)\n" + error_msg += " 2. Request mid-session approval (planned; not yet available)\n"
🤖 Fix all issues with AI agents
In `@examples/README.md`:
- Around line 23-27: The fenced code blocks in examples/README.md are missing
language identifiers (MD040); update each triple-backtick block (including the
shown block and those at the ranges referenced: 122-152, 504-515, 509-515) to
include an appropriate language token (e.g., ```text, ```bash, or ```yaml) and
ensure each opening fence has a matching closing fence so markdownlint MD040 is
satisfied; search for the plain ``` blocks in README.md and replace them with
the correct fenced code labels for their content.
In `@PHASE3_SPEC.md`:
- Line 64: Several fenced code blocks in PHASE3_SPEC.md are missing language
identifiers; update each missing triple-backtick fence to include the
appropriate language token (e.g., text, bash, yaml, python, typescript) so
Markdown lint rule MD040 is satisfied. Locate the unlabeled fences indicated in
the review and for each decide the correct language context (examples: plain
output -> text, shell commands -> bash, config -> yaml, snippets ->
python/typescript) and add the language immediately after the opening ``` marker
for each block.
- Around line 1552-1567: Replace the emphasized "Total: 24-36 hours (3-4.5
days)" line under the "Estimated Timeline" section with a proper Markdown
heading (e.g., prepend one to three # characters) so it is a heading instead of
bold text; locate the line directly after the timeline table in PHASE3_SPEC.md
and update it to a heading-level that matches the document structure (for
example "### Total: 24-36 hours (3-4.5 days)").
In `@security.py`:
- Around line 358-396: The path-matching branch in matches_pattern only handles
patterns starting with "./" or "../", so patterns like "scripts/test.sh" never
match; change the condition to treat any pattern containing a "/" as a path
pattern (e.g., if "/" in pattern) and keep the existing behavior: extract
pattern_name via os.path.basename(pattern) and return True when command equals
pattern, equals pattern_name, or endswith("/" + pattern_name); leave the
existing wildcard and exact-match checks (and the "*" guard) intact.
- Around line 399-458: load_org_config currently validates allowed_commands
entries only for dict presence and "name" key, allowing non-string or empty
names to propagate into matches_pattern; update load_org_config so that inside
the allowed_commands loop (the block referencing "allowed" and iterating "for
cmd in allowed") you reject any entry where cmd["name"] is not a non-empty
string (e.g., not isinstance(cmd["name"], str) or cmd["name"].strip() == ""),
returning None for an invalid config; keep other existing structure checks
(dict/list) and ensure the change prevents non-string names from reaching
matches_pattern.
In `@test_security_integration.py`:
- Around line 201-241: The test mutates HOME directly and may skip restoration
if bash_security_hook raises; wrap the environment changes in a safe context
manager or use pytest's monkeypatch to set and restore HOME (and on Windows also
USERPROFILE, HOMEDRIVE, HOMEPATH) for the scope of the test; update the block
around tempfile.TemporaryDirectory() and the call to bash_security_hook (and the
similar block in test_org_allowlist_inheritance) to use a try/finally or a
contextmanager that saves original env values and restores them regardless of
exceptions, referencing the test function and the bash_security_hook call to
locate where to apply the change.
Changes: - Increase command limit from 50 to 100 per project - Add examples/OPTIMIZE_CONFIG.md with optimization strategies - Update all documentation references (50 → 100) - Update tests for new limit Rationale: - 50 was too restrictive for projects with many tools (Flutter, etc.) - Users were unknowingly exceeding limit by listing subcommands - 100 provides headroom while maintaining security - New guide teaches wildcard optimization (flutter* vs listing each subcommand) UI feedback idea: Show command count and optimization suggestions (tracked for Phase 3 or future enhancement)
Changes: - Support path patterns without ./ prefix (e.g., 'scripts/test.sh') - Reject non-string or empty command names in org config - Add 8 new test cases (5 for path patterns, 3 for validation) Details: - matches_pattern() now treats any pattern with '/' as a path pattern - load_org_config() validates that cmd['name'] is a non-empty string - All 148 unit tests + 9 integration tests passing Security hardening: Prevents invalid command names from reaching pattern matching logic, reducing attack surface.
Changes: - Add temporary_home() context manager for safe HOME manipulation - Handle both Unix (HOME) and Windows (USERPROFILE, HOMEDRIVE, HOMEPATH) - Update test_org_blocklist_enforcement to use context manager - Update test_org_allowlist_inheritance to use context manager Benefits: - Environment variables always restored, even on exceptions - Prevents test pollution across test runs - Cross-platform compatibility (Windows + Unix) All 9 integration tests passing.
There was a problem hiding this comment.
Actionable comments posted: 6
🤖 Fix all issues with AI agents
In `@CLAUDE.md`:
- Around line 241-247: Remove the hard-coded test counts next to the test
commands—specifically the "(136 tests - fast)" and "(9 tests - uses real hooks)"
annotations that follow the lines containing "python test_security.py" and
"python test_security_integration.py"; either delete the counts entirely or
replace them with a stable descriptor like "(unit tests)" and "(integration
tests - uses real hooks)" or a note that counts may change and to run the
commands to see current results.
In `@examples/OPTIMIZE_CONFIG.md`:
- Around line 111-117: The example uses a glob './scripts/*.sh' which the
current matcher/path-pattern logic does not support; fix by either updating the
example to list explicit script filenames instead of a glob (replace
'./scripts/*.sh' with the actual script names) or implement glob support in the
matcher by extending the path-pattern logic to recognize and expand patterns
like '*.sh' (update the matcher/path-pattern matching function to handle
wildcards and directory globs so './scripts/*.sh' will match files under
scripts/).
- Line 119: The bold-only emphasis line "**Reduced from 65 → 15 commands (77%
reduction!)**" should be converted to a real Markdown heading to satisfy MD036;
replace the bold markers with an appropriate heading level (e.g., "### Reduced
from 65 → 15 commands (77% reduction!)" or "## ...") and remove the surrounding
** so the text becomes a proper heading rather than bold-only emphasis.
- Around line 194-218: The fenced UI mockup blocks (the ones starting with "⚠️
Config Optimization Available" and "📊 Command Usage Stats") are missing
language identifiers which triggers MD040; add explicit language tags (e.g.,
```text) to each fenced code block so the fences are ```text ... ``` rather than
plain ``` to satisfy the linter and preserve formatting for the UI mockups.
In `@examples/README.md`:
- Around line 437-466: Update the README testing section under the "Unit Tests"
and "Integration Tests" headings: remove or update the stale numeric test count
("136 tests - fast") and change the "50 command limit" text to reflect the
current 100-command cap (e.g., "100 command limit" or simply "command limit
(100)"); alternatively drop exact test counts altogether to avoid future drift.
Locate the strings "Unit Tests" / "Integration Tests" in the diff and update the
lines describing the test counts and the command limit accordingly.
In `@test_security.py`:
- Around line 413-652: The tests fail on Windows because they only set HOME and
manually restore it; create a reusable temporary_home context manager (e.g.,
temporary_home) that sets HOME plus Windows equivalents USERPROFILE, HOMEDRIVE
and HOMEPATH to the temp dir and reliably restores original values in a
try/finally (or contextmanager) to avoid leaks on exceptions, then replace the
inline os.environ manipulation inside test_org_config_loading,
test_hierarchy_resolution, and test_org_blocklist_enforcement to use this
context (wrapping the whole test body) so Path.home() and Windows lookups pick
up the temp directory consistently.
| ```bash | ||
| # Unit tests (136 tests - fast) | ||
| python test_security.py | ||
|
|
||
| # Integration tests (9 tests - uses real hooks) | ||
| python test_security_integration.py | ||
| ``` |
There was a problem hiding this comment.
Avoid hard‑coding test counts in docs.
The counts can drift quickly; consider removing the numbers or ensuring they’re updated alongside test changes.
📝 Suggested doc adjustment
-# Unit tests (136 tests - fast)
+# Unit tests (fast)
...
-# Integration tests (9 tests - uses real hooks)
+# Integration tests (uses real hooks)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ```bash | |
| # Unit tests (136 tests - fast) | |
| python test_security.py | |
| # Integration tests (9 tests - uses real hooks) | |
| python test_security_integration.py | |
| ``` | |
| # Unit tests (fast) | |
| python test_security.py | |
| # Integration tests (uses real hooks) | |
| python test_security_integration.py |
🤖 Prompt for AI Agents
In `@CLAUDE.md` around lines 241 - 247, Remove the hard-coded test counts next to
the test commands—specifically the "(136 tests - fast)" and "(9 tests - uses
real hooks)" annotations that follow the lines containing "python
test_security.py" and "python test_security_integration.py"; either delete the
counts entirely or replace them with a stable descriptor like "(unit tests)" and
"(integration tests - uses real hooks)" or a note that counts may change and to
run the commands to see current results.
| - name: ios-deploy | ||
| description: Deploy to iOS devices | ||
|
|
||
| # Project scripts | ||
| - name: ./scripts/*.sh | ||
| description: All project build/test scripts | ||
| ``` |
There was a problem hiding this comment.
Documented script globbing isn’t supported by the matcher.
./scripts/*.sh won’t match with the current path‑pattern logic (exact name only). Either update the example to explicit script names or implement glob support.
📝 Suggested doc adjustment
- - name: ./scripts/*.sh
- description: All project build/test scripts
+ - name: ./scripts/build.sh
+ description: Build script (repeat per script; globbing not supported yet)🤖 Prompt for AI Agents
In `@examples/OPTIMIZE_CONFIG.md` around lines 111 - 117, The example uses a glob
'./scripts/*.sh' which the current matcher/path-pattern logic does not support;
fix by either updating the example to list explicit script filenames instead of
a glob (replace './scripts/*.sh' with the actual script names) or implement glob
support in the matcher by extending the path-pattern logic to recognize and
expand patterns like '*.sh' (update the matcher/path-pattern matching function
to handle wildcards and directory globs so './scripts/*.sh' will match files
under scripts/).
| description: All project build/test scripts | ||
| ``` | ||
|
|
||
| **Reduced from 65 → 15 commands (77% reduction!)** |
There was a problem hiding this comment.
Convert bold-only emphasis to a real heading (MD036).
📝 Suggested doc adjustment
-**Reduced from 65 → 15 commands (77% reduction!)**
+#### Reduced from 65 → 15 commands (77% reduction!)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| **Reduced from 65 → 15 commands (77% reduction!)** | |
| #### Reduced from 65 → 15 commands (77% reduction!) |
🧰 Tools
🪛 markdownlint-cli2 (0.18.1)
119-119: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
🤖 Prompt for AI Agents
In `@examples/OPTIMIZE_CONFIG.md` at line 119, The bold-only emphasis line
"**Reduced from 65 → 15 commands (77% reduction!)**" should be converted to a
real Markdown heading to satisfy MD036; replace the bold markers with an
appropriate heading level (e.g., "### Reduced from 65 → 15 commands (77%
reduction!)" or "## ...") and remove the surrounding ** so the text becomes a
proper heading rather than bold-only emphasis.
| ``` | ||
| ⚠️ Config Optimization Available | ||
|
|
||
| Your config has 65 commands. We detected opportunities to reduce it: | ||
|
|
||
| • 25 flutter commands → Use flutter* (saves 24 entries) | ||
| • 8 dart commands → Use dart* (saves 7 entries) | ||
| • 12 adb commands → Use adb* (saves 11 entries) | ||
|
|
||
| [Optimize Automatically] [Keep As-Is] | ||
|
|
||
| Potential reduction: 65 → 23 commands | ||
| ``` | ||
|
|
||
| Or during config editing: | ||
|
|
||
| ``` | ||
| 📊 Command Usage Stats | ||
|
|
||
| flutter* : 25 subcommands detected | ||
| dart* : 8 subcommands detected | ||
| ./scripts/*.sh : 7 scripts detected | ||
|
|
||
| 💡 Tip: Using wildcards covers all subcommands automatically | ||
| ``` |
There was a problem hiding this comment.
Add language identifiers to fenced UI mockups (MD040).
📝 Suggested doc adjustment
-```
+```text
⚠️ Config Optimization Available
...
Potential reduction: 65 → 23 commands
-```
+```
...
-```
+```text
📊 Command Usage Stats
...
💡 Tip: Using wildcards covers all subcommands automatically
-```
+```📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ``` | |
| ⚠️ Config Optimization Available | |
| Your config has 65 commands. We detected opportunities to reduce it: | |
| • 25 flutter commands → Use flutter* (saves 24 entries) | |
| • 8 dart commands → Use dart* (saves 7 entries) | |
| • 12 adb commands → Use adb* (saves 11 entries) | |
| [Optimize Automatically] [Keep As-Is] | |
| Potential reduction: 65 → 23 commands | |
| ``` | |
| Or during config editing: | |
| ``` | |
| 📊 Command Usage Stats | |
| flutter* : 25 subcommands detected | |
| dart* : 8 subcommands detected | |
| ./scripts/*.sh : 7 scripts detected | |
| 💡 Tip: Using wildcards covers all subcommands automatically | |
| ``` |
🧰 Tools
🪛 markdownlint-cli2 (0.18.1)
194-194: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
210-210: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🤖 Prompt for AI Agents
In `@examples/OPTIMIZE_CONFIG.md` around lines 194 - 218, The fenced UI mockup
blocks (the ones starting with "⚠️ Config Optimization Available" and "📊
Command Usage Stats") are missing language identifiers which triggers MD040; add
explicit language tags (e.g., ```text) to each fenced code block so the fences
are ```text ... ``` rather than plain ``` to satisfy the linter and preserve
formatting for the UI mockups.
| **Unit Tests** (136 tests - fast): | ||
| ```bash | ||
| source venv/bin/activate | ||
| python test_security.py | ||
| ``` | ||
|
|
||
| Tests: | ||
| - Pattern matching (exact, wildcards, scripts) | ||
| - YAML loading and validation | ||
| - Blocklist enforcement | ||
| - Project and org config hierarchy | ||
| - All existing security validations | ||
|
|
||
| **Integration Tests** (9 tests - uses real security hooks): | ||
| ```bash | ||
| source venv/bin/activate | ||
| python test_security_integration.py | ||
| ``` | ||
|
|
||
| Tests: | ||
| - Blocked commands are rejected (sudo, shutdown, etc.) | ||
| - Default commands work (ls, git, npm, etc.) | ||
| - Non-allowed commands are blocked (wget, python, etc.) | ||
| - Project config allows commands (swift, xcodebuild, etc.) | ||
| - Pattern matching works (swift* matches swiftlint) | ||
| - Org blocklist cannot be overridden | ||
| - Org allowlist is inherited by projects | ||
| - Invalid YAML is safely ignored | ||
| - 50 command limit is enforced | ||
|
|
There was a problem hiding this comment.
Update testing section to reflect the 100‑command limit and avoid stale counts.
The testing section still calls out “136 tests” and a “50 command limit,” which no longer matches the 100‑command cap described elsewhere. Consider removing exact counts or updating them, and fix the limit text.
📝 Suggested doc update
-**Unit Tests** (136 tests - fast):
+**Unit Tests** (fast):
...
-- 50 command limit is enforced
+- 100 command limit is enforced📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| **Unit Tests** (136 tests - fast): | |
| ```bash | |
| source venv/bin/activate | |
| python test_security.py | |
| ``` | |
| Tests: | |
| - Pattern matching (exact, wildcards, scripts) | |
| - YAML loading and validation | |
| - Blocklist enforcement | |
| - Project and org config hierarchy | |
| - All existing security validations | |
| **Integration Tests** (9 tests - uses real security hooks): | |
| ```bash | |
| source venv/bin/activate | |
| python test_security_integration.py | |
| ``` | |
| Tests: | |
| - Blocked commands are rejected (sudo, shutdown, etc.) | |
| - Default commands work (ls, git, npm, etc.) | |
| - Non-allowed commands are blocked (wget, python, etc.) | |
| - Project config allows commands (swift, xcodebuild, etc.) | |
| - Pattern matching works (swift* matches swiftlint) | |
| - Org blocklist cannot be overridden | |
| - Org allowlist is inherited by projects | |
| - Invalid YAML is safely ignored | |
| - 50 command limit is enforced | |
| **Unit Tests** (fast): |
🤖 Prompt for AI Agents
In `@examples/README.md` around lines 437 - 466, Update the README testing section
under the "Unit Tests" and "Integration Tests" headings: remove or update the
stale numeric test count ("136 tests - fast") and change the "50 command limit"
text to reflect the current 100-command cap (e.g., "100 command limit" or simply
"command limit (100)"); alternatively drop exact test counts altogether to avoid
future drift. Locate the strings "Unit Tests" / "Integration Tests" in the diff
and update the lines describing the test counts and the command limit
accordingly.
| def test_org_config_loading(): | ||
| """Test organization-level config loading.""" | ||
| print("\nTesting org config loading:\n") | ||
| passed = 0 | ||
| failed = 0 | ||
|
|
||
| # Save original org config path | ||
| original_home = Path.home() | ||
|
|
||
| with tempfile.TemporaryDirectory() as tmpdir: | ||
| # Temporarily override home directory for testing | ||
| import os | ||
| os.environ["HOME"] = tmpdir | ||
|
|
||
| org_dir = Path(tmpdir) / ".autocoder" | ||
| org_dir.mkdir() | ||
| org_config_path = org_dir / "config.yaml" | ||
|
|
||
| # Test 1: Valid org config | ||
| org_config_path.write_text("""version: 1 | ||
| allowed_commands: | ||
| - name: jq | ||
| description: JSON processor | ||
| blocked_commands: | ||
| - aws | ||
| - kubectl | ||
| """) | ||
| config = load_org_config() | ||
| if config and config["version"] == 1: | ||
| if len(config["allowed_commands"]) == 1 and len(config["blocked_commands"]) == 2: | ||
| print(" PASS: Load valid org config") | ||
| passed += 1 | ||
| else: | ||
| print(" FAIL: Load valid org config (wrong counts)") | ||
| failed += 1 | ||
| else: | ||
| print(" FAIL: Load valid org config") | ||
| print(f" Got: {config}") | ||
| failed += 1 | ||
|
|
||
| # Test 2: Missing file returns None | ||
| org_config_path.unlink() | ||
| config = load_org_config() | ||
| if config is None: | ||
| print(" PASS: Missing org config returns None") | ||
| passed += 1 | ||
| else: | ||
| print(" FAIL: Missing org config returns None") | ||
| failed += 1 | ||
|
|
||
| # Test 3: Non-string command name is rejected | ||
| org_config_path.write_text("""version: 1 | ||
| allowed_commands: | ||
| - name: 123 | ||
| description: Invalid numeric name | ||
| """) | ||
| config = load_org_config() | ||
| if config is None: | ||
| print(" PASS: Non-string command name rejected") | ||
| passed += 1 | ||
| else: | ||
| print(" FAIL: Non-string command name rejected") | ||
| print(f" Got: {config}") | ||
| failed += 1 | ||
|
|
||
| # Test 4: Empty command name is rejected | ||
| org_config_path.write_text("""version: 1 | ||
| allowed_commands: | ||
| - name: "" | ||
| description: Empty name | ||
| """) | ||
| config = load_org_config() | ||
| if config is None: | ||
| print(" PASS: Empty command name rejected") | ||
| passed += 1 | ||
| else: | ||
| print(" FAIL: Empty command name rejected") | ||
| print(f" Got: {config}") | ||
| failed += 1 | ||
|
|
||
| # Test 5: Whitespace-only command name is rejected | ||
| org_config_path.write_text("""version: 1 | ||
| allowed_commands: | ||
| - name: " " | ||
| description: Whitespace name | ||
| """) | ||
| config = load_org_config() | ||
| if config is None: | ||
| print(" PASS: Whitespace-only command name rejected") | ||
| passed += 1 | ||
| else: | ||
| print(" FAIL: Whitespace-only command name rejected") | ||
| print(f" Got: {config}") | ||
| failed += 1 | ||
|
|
||
| # Restore HOME | ||
| os.environ["HOME"] = str(original_home) | ||
|
|
||
| return passed, failed | ||
|
|
||
|
|
||
| def test_hierarchy_resolution(): | ||
| """Test command hierarchy resolution.""" | ||
| print("\nTesting hierarchy resolution:\n") | ||
| passed = 0 | ||
| failed = 0 | ||
|
|
||
| with tempfile.TemporaryDirectory() as tmphome: | ||
| with tempfile.TemporaryDirectory() as tmpproject: | ||
| # Setup fake home directory | ||
| import os | ||
| original_home = os.environ.get("HOME") | ||
| os.environ["HOME"] = tmphome | ||
|
|
||
| org_dir = Path(tmphome) / ".autocoder" | ||
| org_dir.mkdir() | ||
| org_config_path = org_dir / "config.yaml" | ||
|
|
||
| # Create org config with allowed and blocked commands | ||
| org_config_path.write_text("""version: 1 | ||
| allowed_commands: | ||
| - name: jq | ||
| description: JSON processor | ||
| - name: python3 | ||
| description: Python interpreter | ||
| blocked_commands: | ||
| - terraform | ||
| - kubectl | ||
| """) | ||
|
|
||
| project_dir = Path(tmpproject) | ||
| project_autocoder = project_dir / ".autocoder" | ||
| project_autocoder.mkdir() | ||
| project_config = project_autocoder / "allowed_commands.yaml" | ||
|
|
||
| # Create project config | ||
| project_config.write_text("""version: 1 | ||
| commands: | ||
| - name: swift | ||
| description: Swift compiler | ||
| """) | ||
|
|
||
| # Test 1: Org allowed commands are included | ||
| allowed, blocked = get_effective_commands(project_dir) | ||
| if "jq" in allowed and "python3" in allowed: | ||
| print(" PASS: Org allowed commands included") | ||
| passed += 1 | ||
| else: | ||
| print(" FAIL: Org allowed commands included") | ||
| print(f" jq in allowed: {'jq' in allowed}") | ||
| print(f" python3 in allowed: {'python3' in allowed}") | ||
| failed += 1 | ||
|
|
||
| # Test 2: Org blocked commands are in blocklist | ||
| if "terraform" in blocked and "kubectl" in blocked: | ||
| print(" PASS: Org blocked commands in blocklist") | ||
| passed += 1 | ||
| else: | ||
| print(" FAIL: Org blocked commands in blocklist") | ||
| failed += 1 | ||
|
|
||
| # Test 3: Project commands are included | ||
| if "swift" in allowed: | ||
| print(" PASS: Project commands included") | ||
| passed += 1 | ||
| else: | ||
| print(" FAIL: Project commands included") | ||
| failed += 1 | ||
|
|
||
| # Test 4: Global commands are included | ||
| if "npm" in allowed and "git" in allowed: | ||
| print(" PASS: Global commands included") | ||
| passed += 1 | ||
| else: | ||
| print(" FAIL: Global commands included") | ||
| failed += 1 | ||
|
|
||
| # Test 5: Hardcoded blocklist cannot be overridden | ||
| if "sudo" in blocked and "shutdown" in blocked: | ||
| print(" PASS: Hardcoded blocklist enforced") | ||
| passed += 1 | ||
| else: | ||
| print(" FAIL: Hardcoded blocklist enforced") | ||
| failed += 1 | ||
|
|
||
| # Restore HOME | ||
| if original_home: | ||
| os.environ["HOME"] = original_home | ||
| else: | ||
| del os.environ["HOME"] | ||
|
|
||
| return passed, failed | ||
|
|
||
|
|
||
| def test_org_blocklist_enforcement(): | ||
| """Test that org-level blocked commands cannot be used.""" | ||
| print("\nTesting org blocklist enforcement:\n") | ||
| passed = 0 | ||
| failed = 0 | ||
|
|
||
| with tempfile.TemporaryDirectory() as tmphome: | ||
| with tempfile.TemporaryDirectory() as tmpproject: | ||
| # Setup fake home directory | ||
| import os | ||
| original_home = os.environ.get("HOME") | ||
| os.environ["HOME"] = tmphome | ||
|
|
||
| org_dir = Path(tmphome) / ".autocoder" | ||
| org_dir.mkdir() | ||
| org_config_path = org_dir / "config.yaml" | ||
|
|
||
| # Create org config that blocks terraform | ||
| org_config_path.write_text("""version: 1 | ||
| blocked_commands: | ||
| - terraform | ||
| """) | ||
|
|
||
| project_dir = Path(tmpproject) | ||
| project_autocoder = project_dir / ".autocoder" | ||
| project_autocoder.mkdir() | ||
|
|
||
| # Try to use terraform (should be blocked) | ||
| input_data = {"tool_name": "Bash", "tool_input": {"command": "terraform apply"}} | ||
| context = {"project_dir": str(project_dir)} | ||
| result = asyncio.run(bash_security_hook(input_data, context=context)) | ||
|
|
||
| if result.get("decision") == "block": | ||
| print(" PASS: Org blocked command 'terraform' rejected") | ||
| passed += 1 | ||
| else: | ||
| print(" FAIL: Org blocked command 'terraform' should be rejected") | ||
| failed += 1 | ||
|
|
||
| # Restore HOME | ||
| if original_home: | ||
| os.environ["HOME"] = original_home | ||
| else: | ||
| del os.environ["HOME"] | ||
|
|
||
| return passed, failed |
There was a problem hiding this comment.
Make org‑config tests portable by isolating HOME (and Windows vars).
These tests only set HOME, but on Windows Path.home() uses USERPROFILE/HOMEDRIVE/HOMEPATH, so the temp config won’t be picked up. Also, manual restore can leak on exceptions. Consider a shared temporary_home() context (like the integration tests) and wrap these blocks.
🔧 Example refactor for one test (apply similarly)
- # Save original org config path
- original_home = Path.home()
-
- with tempfile.TemporaryDirectory() as tmpdir:
- # Temporarily override home directory for testing
- import os
- os.environ["HOME"] = tmpdir
+ with tempfile.TemporaryDirectory() as tmpdir:
+ # Temporarily override home directory for testing (cross‑platform)
+ with temporary_home(tmpdir):
org_dir = Path(tmpdir) / ".autocoder"
org_dir.mkdir()
org_config_path = org_dir / "config.yaml"
...
- # Restore HOME
- os.environ["HOME"] = str(original_home)🤖 Prompt for AI Agents
In `@test_security.py` around lines 413 - 652, The tests fail on Windows because
they only set HOME and manually restore it; create a reusable temporary_home
context manager (e.g., temporary_home) that sets HOME plus Windows equivalents
USERPROFILE, HOMEDRIVE and HOMEPATH to the temp dir and reliably restores
original values in a try/finally (or contextmanager) to avoid leaks on
exceptions, then replace the inline os.environ manipulation inside
test_org_config_loading, test_hierarchy_resolution, and
test_org_blocklist_enforcement to use this context (wrapping the whole test
body) so Path.home() and Windows lookups pick up the temp directory
consistently.
|
Thank you! |
Adds comprehensive test coverage for Keyboard Navigation feature and enhances CSS accessibility support. Test Files: - test_feature_80_keyboard_navigation.py: 43 unit tests for keyboard nav - verify_feature_80.py: Feature verification script CSS Enhancements (globals.css): - High contrast mode support (prefers-contrast: high) - Windows forced-colors mode support (forced-colors: active) - More/less contrast preferences (prefers-contrast: more/less) - Enhanced reduced motion support (prefers-reduced-motion: reduce) - Status indicator patterns for color-blind accessibility All 43 tests pass for Feature AutoForgeAI#80. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…ands Per-Project Bash Command Allowlist System
The Problem
AutoCoder agents can only run hardcoded commands (npm, git, curl, etc.), blocking legitimate use cases:
swift,xcodebuild)cargo,rustc)Disclaimer: maybe there are ways of getting around it right now?
Also, I'm considering making updates to autocoder to not only create websites but other things.
Currently experimenting on APIs, CLI tool, and a flutter app
The Solution
Hierarchical, secure-by-default command allowlist system.
Project-Level Config
Organization-Level Config
What's Implemented
✅ Phase 1: Project YAML + Blocklist
swift*matches all Swift tools)✅ Phase 2: Org Config + Hierarchy
📋 Phase 3: Mid-Session Approval (Spec ready, not implemented)
PHASE3_SPEC.mdSecurity Hierarchy
Rule: Blocklist always wins.
Key Features
swift*matchesswift,swiftc,swiftlint./scripts/build.shworks from any directoryTesting
145 tests, all passing:
Documentation
examples/README.md- Comprehensive user guideexamples/project_allowed_commands.yaml- Template (all commented)examples/org_config.yaml- Org config templatePHASE3_SPEC.md- Mid-session approval spec (future)Files Changed
security.py(+361) - Core implementationtest_security.py(+488) - Unit teststest_security_integration.py(new) - Integration testsexamples/(new) - Documentation and templatesPHASE3_SPEC.md(new) - Future enhancement specTotal: +3,796 lines across 11 files
Why Phase 3 Isn't Included
Phase 3 (mid-session approval) is a UX enhancement, not a core requirement:
Try It
Impact
Enables AutoCoder to build ANY type of project (iOS, Rust, Python, Go) while maintaining security controls.
Summary by CodeRabbit
New Features
Documentation
Tests
✏️ Tip: You can customize this high-level summary in your review settings.