fix: add shlex fallback parser and heredoc warning#127
Conversation
- Add _extract_primary_command() fallback when shlex.split() fails on complex nested quotes (e.g., docker exec with PHP) - Returns primary command instead of empty list, allowing valid commands to proceed - Add heredoc warning to coding prompt - sandbox blocks /tmp access for here documents - All 162 security tests pass
📝 WalkthroughWalkthroughTwo focused updates improve commit message guidance and command extraction robustness. The template file now includes detailed git commit rules with examples, while the security module gains a fallback helper function to handle malformed command segments gracefully instead of failing. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@security.py`:
- Around line 211-217: In the except ValueError block that currently calls
_extract_primary_command(segment) and appends fallback_cmd to commands, add an
explicit security-level log entry (including timestamp, the original segment,
the fallback_cmd, and context like caller or request id if available) whenever
fallback extraction is used; ensure the logger used is the module/application
security/audit logger and that the message clearly flags this as a fallback
extraction for malformed input so these events are auditable and easy to search
later.
🧹 Nitpick comments (1)
security.py (1)
198-198: Minor note: consider removing the comment.The comment
# (re is already imported at module level)is helpful during review but may become stale or add noise over time. Consider removing it since the import at line 10 is self-evident.
| except ValueError: | ||
| # Malformed command (unclosed quotes, etc.) | ||
| # Return empty to trigger block (fail-safe) | ||
| return [] | ||
| # Try fallback extraction instead of blocking entirely | ||
| fallback_cmd = _extract_primary_command(segment) | ||
| if fallback_cmd: | ||
| commands.append(fallback_cmd) | ||
| continue |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, let's locate the security.py file and understand the structure
find . -name "security.py" -type f | head -5Repository: leonvanzyl/autocoder
Length of output: 77
🏁 Script executed:
# Find test files related to security
find . -name "*test*.py" -o -name "test_*" -type f | grep -i security | head -10Repository: leonvanzyl/autocoder
Length of output: 113
🏁 Script executed:
# Search for the _extract_primary_command function and related code
rg -n "_extract_primary_command|extract_commands" --type=py | head -20Repository: leonvanzyl/autocoder
Length of output: 637
🏁 Script executed:
# Search for the bash_security_hook and understand the fail-safe mechanism
rg -n "bash_security_hook|ValueError.*shlex" --type=py -A 3 -B 3Repository: leonvanzyl/autocoder
Length of output: 11785
🏁 Script executed:
# Read the security.py file to understand the fallback mechanism and the surrounding context
cat -n security.py | sed -n '143,220p'Repository: leonvanzyl/autocoder
Length of output: 2848
🏁 Script executed:
# Look at the blocklist and how it's enforced
rg -n "HARDCODED_BLOCKLIST|BLOCKLIST|sudo|dd|shutdown" security.py -A 2 -B 2 | head -50Repository: leonvanzyl/autocoder
Length of output: 1513
🏁 Script executed:
# Examine the test_extract_commands function to see what's tested
cat -n test_security.py | sed -n '97,150p'Repository: leonvanzyl/autocoder
Length of output: 2292
🏁 Script executed:
# Search for tests with malformed/unclosed commands
rg -n "unclosed|malformed|shlex|ValueError" test_security.py -A 3 -B 2Repository: leonvanzyl/autocoder
Length of output: 46
🏁 Script executed:
# Check if there are any tests with chained commands and malformed input
rg -n "&&|;|pipe.*unclosed|malformed.*&&" test_security.py -A 5 -B 5Repository: leonvanzyl/autocoder
Length of output: 2062
🏁 Script executed:
# See how the blocklist is enforced in bash_security_hook
cat -n security.py | sed -n '787,850p'Repository: leonvanzyl/autocoder
Length of output: 3020
🏁 Script executed:
# Look at the full validate_command function to understand blocklist enforcement
rg -n "def validate_command|BLOCKED_COMMANDS|DANGEROUS_COMMANDS" security.py -A 5 | head -100Repository: leonvanzyl/autocoder
Length of output: 1453
🏁 Script executed:
# Check if there are any tests with malformed input (unclosed quotes)
rg -n "unclosed|malformed|shlex.*fail|ValueError" test_security.pyRepository: leonvanzyl/autocoder
Length of output: 46
🏁 Script executed:
# Look for tests that specifically test the fallback mechanism
rg -n "fallback|_extract_primary_command" test_security.pyRepository: leonvanzyl/autocoder
Length of output: 46
🏁 Script executed:
# Search entire codebase for any tests with malformed commands containing chained operations
rg -n '".*&&.*".*unclosed|unclosed.*&&' test_security.py test_security_integration.pyRepository: leonvanzyl/autocoder
Length of output: 46
🏁 Script executed:
# Let me check if there are any tests with malformed commands in general
rg -n 'unclosed|malformed|".*\\$' test_security.py test_security_integration.py -A 3 -B 1Repository: leonvanzyl/autocoder
Length of output: 46
🏁 Script executed:
# Check if the fallback extraction is tested with any dangerous commands
rg -n 'sudo|dd|shutdown' test_security.py -A 2 -B 2Repository: leonvanzyl/autocoder
Length of output: 1638
🏁 Script executed:
# Let's trace through a specific test case with chained commands to understand the extraction
cat -n test_security.py | sed -n '1075,1130p'Repository: leonvanzyl/autocoder
Length of output: 660
🏁 Script executed:
# Check if there are any edge case tests with malformed input
rg -n "test.*malformed|test.*unclosed|test.*fallback" test_security.py -A 10Repository: leonvanzyl/autocoder
Length of output: 46
🏁 Script executed:
# Let me examine the complete extract_commands function to see all code paths
cat -n security.py | sed -n '182,280p'Repository: leonvanzyl/autocoder
Length of output: 3457
🏁 Script executed:
# Let me check what the code did BEFORE the fallback was introduced
# by looking at git history to see the original behavior
git log --oneline -20 -- security.pyRepository: leonvanzyl/autocoder
Length of output: 122
🏁 Script executed:
# Check if there are any tests that specifically test the behavior when extract_commands returns empty
rg -n "extract_commands.*\[\]|empty.*commands|extract.*fail" test_security.py -B 3 -A 3Repository: leonvanzyl/autocoder
Length of output: 46
🏁 Script executed:
# Let's create a test case to verify the concern - malformed input with chained dangerous commands
python3 << 'EOF'
import sys
sys.path.insert(0, '.')
from security import extract_commands, _extract_primary_command
# Test case 1: Malformed segment with chained dangerous command
test_cases = [
# Normal cases
('docker exec php -r "echo test"', "normal docker with unclosed quote simulation"),
# Malformed with chained command
('valid_cmd "unclosed && sudo rm -rf /', "chained dangerous after unclosed quote"),
('ls "unclosed && dd if=/dev/zero', "dd chained after unclosed quote"),
('echo "test && shutdown -h now', "shutdown chained after unclosed quote"),
]
print("Testing extract_commands with malformed + chained dangerous commands:\n")
for cmd, description in test_cases:
result = extract_commands(cmd)
print(f"Input: {cmd!r}")
print(f"Description: {description}")
print(f"Extracted commands: {result}")
print()
EOFRepository: leonvanzyl/autocoder
Length of output: 631
🏁 Script executed:
# Check if the fallback was recently added or if it was always there
git log -p -- security.py | grep -A 10 -B 10 "_extract_primary_command" | head -50Repository: leonvanzyl/autocoder
Length of output: 46
🏁 Script executed:
# See when the fallback logic was introduced
git log --oneline --all | head -5Repository: leonvanzyl/autocoder
Length of output: 402
🏁 Script executed:
# Check what happens when extract_commands returns empty list in bash_security_hook
cat -n security.py | sed -n '808,820p'Repository: leonvanzyl/autocoder
Length of output: 574
🏁 Script executed:
# Let me test what happens with a pure malformed command (no valid first command)
python3 << 'EOF'
import sys
sys.path.insert(0, '.')
from security import extract_commands
# Test cases where malformed has NO valid command
test_cases = [
('&&& malformed', "operators only"),
('"unclosed only', "just unclosed quote"),
('; ; ;', "only operators"),
('|||', "only pipes"),
]
print("Testing extract_commands with ONLY malformed syntax:\n")
for cmd, description in test_cases:
result = extract_commands(cmd)
print(f"Input: {cmd!r}")
print(f"Description: {description}")
print(f"Extracted commands: {result}")
print(f"Length: {len(result)}")
print()
EOFRepository: leonvanzyl/autocoder
Length of output: 451
🏁 Script executed:
# Check the commit that introduced the fallback to understand the original behavior
git show d47028d --statRepository: leonvanzyl/autocoder
Length of output: 11619
🏁 Script executed:
# Let's see the actual changes made in that commit
git show d47028dRepository: leonvanzyl/autocoder
Length of output: 50380
🏁 Script executed:
# Check what tests exist for malformed input and fallback
rg -n "test.*fallback|test.*malformed|test.*unclosed|test.*shlex" test_security.py -A 5 -B 2Repository: leonvanzyl/autocoder
Length of output: 46
🏁 Script executed:
# Verify that all BLOCKED_COMMANDS and DANGEROUS_COMMANDS are in the blocklist enforcement
cat -n security.py | sed -n '72,115p'Repository: leonvanzyl/autocoder
Length of output: 1268
🏁 Script executed:
# Verify the blocklist is actually enforced in bash_security_hook
cat -n security.py | sed -n '834,860p'Repository: leonvanzyl/autocoder
Length of output: 1541
🏁 Script executed:
# Test if a dangerous command can bypass when malformed
python3 << 'EOF'
import sys
sys.path.insert(0, '.')
from security import extract_commands, get_effective_commands
# Test if sudo/dd can bypass in malformed segments
malformed_tests = [
'valid_cmd "unclosed && sudo apt install',
'echo test; sudo rm -rf /',
'ls "unclosed; dd if=/dev/zero',
]
dangerous_set = {"sudo", "dd", "shutdown", "reboot"}
print("Testing if dangerous commands extracted from malformed segments:\n")
for cmd in malformed_tests:
extracted = extract_commands(cmd)
has_dangerous = any(c in dangerous_set for c in extracted)
print(f"Input: {cmd!r}")
print(f"Extracted: {extracted}")
print(f"Contains dangerous: {has_dangerous}")
print()
# Check the blocklist
allowed, blocked = get_effective_commands(None)
print("Dangerous commands in blocklist:")
for cmd in ["sudo", "dd", "shutdown", "reboot", "su"]:
in_blocked = cmd in blocked
print(f" {cmd}: {in_blocked}")
EOFRepository: leonvanzyl/autocoder
Length of output: 507
🏁 Script executed:
# Let me verify the actual security concern more thoroughly
# Test the specific example from the review comment
python3 << 'EOF'
import sys
sys.path.insert(0, '.')
import asyncio
from security import bash_security_hook
test_cases = [
# The example from the review comment
'docker exec php -r "unclosed && sudo rm -rf /',
# Variations
'valid_cmd "unclosed && sudo apt install package',
'echo test; sudo rm -rf /', # This should work (extracted)
'ls "unclosed && shutdown -h now',
]
async def test_command(cmd):
input_data = {"tool_name": "Bash", "tool_input": {"command": cmd}}
result = await bash_security_hook(input_data)
decision = result.get("decision", "allow")
reason = result.get("reason", "allowed by default")
return decision, reason
print("Testing bash_security_hook with malformed + dangerous commands:\n")
for cmd in test_cases:
decision, reason = asyncio.run(test_command(cmd))
print(f"Command: {cmd!r}")
print(f"Decision: {decision}")
if decision == "block":
print(f"Reason: {reason[:100]}...")
print()
EOFRepository: leonvanzyl/autocoder
Length of output: 586
🏁 Script executed:
# Check if there are any tests currently for the fallback behavior
grep -r "fallback\|_extract_primary_command" test_security.py test_security_integration.pyRepository: leonvanzyl/autocoder
Length of output: 46
🏁 Script executed:
# Look for any TODO or FIXME comments about the fallback
rg -n "TODO|FIXME|XXX" security.py | head -20Repository: leonvanzyl/autocoder
Length of output: 46
🏁 Script executed:
# Let me trace through the logic more carefully for the docker case
python3 << 'EOF'
import sys
sys.path.insert(0, '.')
from security import extract_commands, is_command_allowed, get_effective_commands
# The critical case
cmd = 'docker exec php -r "unclosed && sudo rm -rf /'
print(f"Testing: {cmd!r}\n")
# Step 1: What gets extracted?
extracted = extract_commands(cmd)
print(f"Step 1 - Extracted commands: {extracted}")
# Step 2: Is docker allowed?
allowed, blocked = get_effective_commands(None)
docker_allowed = is_command_allowed("docker", allowed)
print(f"Step 2 - Is 'docker' allowed? {docker_allowed}")
print(f" Docker in allowed: {'docker' in allowed}")
print(f" Docker in blocked: {'docker' in blocked}")
# Step 3: Was sudo extracted?
sudo_extracted = "sudo" in extracted
print(f"Step 3 - Was 'sudo' extracted? {sudo_extracted}")
print(f" Sudo in blocked: {'sudo' in blocked}")
print(f"\nResult: Command ALLOWED because docker is allowed and sudo was NOT extracted")
print(f"Risk: sudo is present in the command string but never validated")
EOFRepository: leonvanzyl/autocoder
Length of output: 473
🏁 Script executed:
# Check what's in the default allowlist for docker
rg -n "docker" security.py | head -20Repository: leonvanzyl/autocoder
Length of output: 168
🏁 Script executed:
# Verify the default allowed commands
cat -n security.py | sed -n '35,70p'Repository: leonvanzyl/autocoder
Length of output: 1116
Add logging when fallback command extraction is used to ensure security-critical misses are auditable.
The fallback extraction for malformed segments (e.g., docker exec php -r "unclosed && sudo rm -rf /) can allow dangerous commands to bypass validation. In the example above:
dockeris extracted and allowedsudoinside the malformed quote is never extracted or validated- The command passes the security check despite containing a blocked command
While such malformed input typically fails shell parsing as well (defense-in-depth), this creates an unauditable security blind spot. The original fail-safe behavior blocked unparseable commands entirely; the new behavior extracts only the primary command and allows it if safe.
Consider adding explicit logging (with timestamp and command context) when fallback extraction is triggered, so security-sensitive operations are auditable. Alternatively, document this tradeoff as an accepted risk in the project's security model.
🤖 Prompt for AI Agents
In `@security.py` around lines 211 - 217, In the except ValueError block that
currently calls _extract_primary_command(segment) and appends fallback_cmd to
commands, add an explicit security-level log entry (including timestamp, the
original segment, the fallback_cmd, and context like caller or request id if
available) whenever fallback extraction is used; ensure the logger used is the
module/application security/audit logger and that the message clearly flags this
as a fallback extraction for malformed input so these events are auditable and
easy to search later.
|
Thanks |
- Add debug logging when shlex fallback extraction is used, capturing both successful extractions and failures for security auditing - Add test case for docker nested quotes that trigger fallback parser - Remove redundant comment about re import (already at module level) Follow-up improvements from PR #127 code review: - Enables tracking of malformed command patterns in production logs - Verifies fallback parser handles the exact docker exec case reported Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…ture AutoForgeAI#127) - 63 tests verifying all 9 feature steps for FeatureCompiler.compile() - Validates task_type mapping, tool_policy, budgets, acceptance_spec, traceability - Confirms 5 distinct task_types across multi-feature DB run (coding, testing, audit, documentation, refactoring) - Zero regressions in existing 46-test E2E suite Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…shlex-fallback fix: add shlex fallback parser and heredoc warning
- Add debug logging when shlex fallback extraction is used, capturing both successful extractions and failures for security auditing - Add test case for docker nested quotes that trigger fallback parser - Remove redundant comment about re import (already at module level) Follow-up improvements from PR AutoForgeAI#127 code review: - Enables tracking of malformed command patterns in production logs - Verifies fallback parser handles the exact docker exec case reported Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Summary
Two focused bug fixes discovered from production log analysis:
shlex fallback parser - Complex docker commands with nested quotes (e.g.,
docker exec container php -r "echo \"test\";") causedshlex.split()to fail, which previously returned an empty list and blocked the entire command. Now extracts the primary command via fallback parsing.heredoc warning - Sandbox mode blocks
/tmpaccess, causing heredoc-style git commits to fail with "can't create temp file for here document". Added clear guidance to use multiple-mflags instead.Changes
security.py_extract_primary_command()fallback when shlex fails.claude/templates/coding_prompt.template.mdTest plan
extract_commands('docker exec container php -r "echo \"test\";"')returns['docker'](was[])ruff check security.pypasses🤖 Generated with Claude Code
Summary by CodeRabbit
Documentation
Bug Fixes
✏️ Tip: You can customize this high-level summary in your review settings.