From f8e6c60be7d26ec2e881c375bf234e8a7bc38014 Mon Sep 17 00:00:00 2001 From: Claude Date: Sun, 1 Mar 2026 06:02:20 +0000 Subject: [PATCH 1/3] chore: adopt ship command and shipping definition Add /ship command adapted from everruns for bashkit's tooling (cargo/just, no UI/server, bashkit-specific smoke testing). Add "Shipping" definition to AGENTS.md. https://claude.ai/code/session_01Lbmx1UUYMfUkV5pL8Bw2v4 --- .claude/commands/ship.md | 108 +++++++++++++++++++++++++++++++++++++++ AGENTS.md | 6 +++ 2 files changed, 114 insertions(+) create mode 100644 .claude/commands/ship.md diff --git a/.claude/commands/ship.md b/.claude/commands/ship.md new file mode 100644 index 00000000..cb0689fc --- /dev/null +++ b/.claude/commands/ship.md @@ -0,0 +1,108 @@ +# Ship + +Run the full ship flow: verify quality, ensure test coverage, update artifacts, smoke test, then push, create PR, and merge when CI is green. + +This command implements the complete "Shipping" definition and Pre-PR Checklist from AGENTS.md. When the user says "ship" or "fix and ship", execute ALL phases below — not just the push/merge steps. + +## Arguments + +- `$ARGUMENTS` - Optional: description of what is being shipped (used for PR title/body context and to scope the quality checks) + +## Instructions + +### Phase 1: Pre-flight + +1. Confirm we're NOT on `main` or `master` +2. Confirm there are no uncommitted changes (`git diff --quiet && git diff --cached --quiet`) +3. If uncommitted changes exist, stop and tell the user + +### Phase 2: Test Coverage + +Review the changes on this branch (use `git diff origin/main...HEAD` and `git log origin/main..HEAD`) and ensure comprehensive test coverage: + +1. **Identify all changed code paths** — every new/modified function, module, builtin, tool +2. **Verify existing tests cover the changes** — run `cargo test --all-features` and check for failures +3. **Write missing tests** for any uncovered code paths: + - **Positive tests**: happy path, valid inputs, expected state transitions + - **Negative tests**: invalid inputs, error conditions, boundary cases, permission failures, missing resources + - **Security tests**: if change touches parser, interpreter, VFS, network, git, or user input — add tests per `specs/005-security-testing.md` + - **Compatibility tests**: if change affects Bash behavior parity — add differential tests comparing against real Bash +4. **Run all tests** to confirm green: `just test` +5. If any test fails, fix the code or test until green + +### Phase 3: Artifact Updates + +Review the changes and update project artifacts where applicable. Skip items that aren't affected. + +1. **Specs** (`specs/`): if the change adds/modifies behavior covered by a spec, update the relevant spec file to stay in sync +2. **Threat model** (`specs/006-threat-model.md`): if the change introduces new attack surfaces, external inputs, authentication/authorization changes, or data handling — add or update threat entries using the `TM--` format and add `// THREAT[TM-XXX-NNN]` code comments at mitigation points +3. **AGENTS.md**: if the change adds new specs, commands, or modifies development workflows — update the relevant section +4. **Implementation status** (`specs/009-implementation-status.md`): if feature status changed, update the status table +5. **Documentation** (`crates/bashkit/docs/`): if the change affects public APIs, tools, or features — update the relevant guide markdown files + +### Phase 4: Smoke Testing + +Smoke test impacted functionality to verify it works end-to-end: + +1. **CLI changes**: run `just run` with relevant commands, verify output +2. **Builtin/interpreter changes**: run example scripts via `just run-script ` to verify behavior +3. **Tool changes**: if LLM tool interface changed, run a quick tool invocation test +4. **Python bindings**: if Python code changed, run `ruff check crates/bashkit-python && ruff format --check crates/bashkit-python` + +If smoke testing reveals issues, fix them and loop back to Phase 2 (tests must still pass). + +### Phase 5: Quality Gates + +```bash +git fetch origin main && git rebase origin/main +``` + +- If rebase fails with conflicts, abort and tell the user to resolve manually + +```bash +just pre-pr +``` + +- If it fails, run `just fmt` to auto-fix, then retry once +- If still failing, stop and report + +### Phase 6: Push and PR + +```bash +git push -u origin +``` + +Check for existing PR: + +```bash +gh pr view --json url 2>/dev/null +``` + +If no PR exists, create one: + +- **Title**: conventional commit style from the branch commits +- **Body**: summary of What, Why, How, and what tests were added/verified +- Use `gh pr create` + +If a PR already exists, update it if needed and report its URL. + +### Phase 7: Wait for CI and Merge + +- Check CI status with `gh pr checks` (poll every 30s, up to 15 minutes) +- If CI is green, merge with `gh pr merge --squash --auto` +- If CI fails, report the failing checks and stop +- **NEVER** merge when CI is red + +### Phase 8: Post-merge + +After successful merge: + +- Report the merged PR URL +- Done + +## Notes + +- This is the canonical shipping workflow. It implements the full "Shipping" definition and Pre-PR Checklist from AGENTS.md. +- Phases 2-4 (tests, artifacts, smoke testing) are the quality core — do NOT skip them. +- The `$ARGUMENTS` context helps scope which tests, specs, and smoke tests are relevant. +- For "fix and ship" requests: implement the fix first, then run `/ship` to validate and merge. diff --git a/AGENTS.md b/AGENTS.md index 5ba0c8c5..b176fd37 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -94,6 +94,12 @@ just pre-pr # Pre-PR checks - Tests: `pytest crates/bashkit-python/tests/ -v` (requires `maturin develop` first) - CI: `.github/workflows/python.yml` (lint, test on 3.9/3.12/3.13, build wheel) +### Shipping + +"Ship" means: implement with extensive test coverage (positive and negative paths), then complete the full Pre-PR Checklist (especially smoke testing impacted functionality), create PR, and merge when CI is green. + +Use the `/ship` command (`.claude/commands/ship.md`) to execute the complete shipping workflow. For "fix and ship" requests: implement the fix first, then run `/ship`. + ### Pre-PR Checklist 1. `just pre-pr` (runs 2-4 automatically) From 81c0926f21397cc77aa6efe5772369096e15bf29 Mon Sep 17 00:00:00 2001 From: Claude Date: Sun, 1 Mar 2026 06:09:35 +0000 Subject: [PATCH 2/3] fix(curl): add missing cfg gate on resolve_redirect_url Function only used under http_client feature but lacked the gate, causing dead-code warning. https://claude.ai/code/session_01Lbmx1UUYMfUkV5pL8Bw2v4 --- crates/bashkit/src/builtins/curl.rs | 1 + 1 file changed, 1 insertion(+) diff --git a/crates/bashkit/src/builtins/curl.rs b/crates/bashkit/src/builtins/curl.rs index ae4f83dd..e2513732 100644 --- a/crates/bashkit/src/builtins/curl.rs +++ b/crates/bashkit/src/builtins/curl.rs @@ -570,6 +570,7 @@ fn guess_mime(path: &str) -> String { } /// Resolve a redirect URL which may be relative. +#[cfg(feature = "http_client")] fn resolve_redirect_url(base: &str, location: &str) -> String { if location.starts_with("http://") || location.starts_with("https://") { location.to_string() From 26ddb381ac4fb23eeef3b8e24244352ef5411a94 Mon Sep 17 00:00:00 2001 From: Claude Date: Sun, 1 Mar 2026 06:20:01 +0000 Subject: [PATCH 3/3] chore: add simplification and security review phases to ship command Add Phase 3b (code simplification) and Phase 3c (security review) to the shipping workflow, ensuring changed code is reviewed for duplication, complexity, and security vulnerabilities before merge. https://claude.ai/code/session_01Lbmx1UUYMfUkV5pL8Bw2v4 --- .claude/commands/ship.md | 27 ++++++++++++++++++++++++++- 1 file changed, 26 insertions(+), 1 deletion(-) diff --git a/.claude/commands/ship.md b/.claude/commands/ship.md index cb0689fc..11aef0e5 100644 --- a/.claude/commands/ship.md +++ b/.claude/commands/ship.md @@ -40,6 +40,31 @@ Review the changes and update project artifacts where applicable. Skip items tha 4. **Implementation status** (`specs/009-implementation-status.md`): if feature status changed, update the status table 5. **Documentation** (`crates/bashkit/docs/`): if the change affects public APIs, tools, or features — update the relevant guide markdown files +### Phase 3b: Code Simplification + +Review all changed code for opportunities to simplify: + +1. **Identify duplication** — look for repeated patterns that could share a helper or be consolidated +2. **Reduce complexity** — simplify nested logic, long match arms, deeply indented blocks +3. **Remove dead code** — unused functions, unreachable branches, commented-out code +4. **Check naming** — ensure functions, variables, and types have clear, descriptive names +5. **Verify no over-engineering** — remove unnecessary abstractions, feature flags, or indirection that don't serve the current change + +If simplification changes are made, loop back to Phase 2 to verify tests still pass. + +### Phase 3c: Security Review + +Analyze all changed code for security vulnerabilities: + +1. **Input validation** — check that user-supplied data (script input, file paths, environment variables, command arguments) is validated before use +2. **Injection risks** — look for command injection, path traversal, environment variable injection, or shell metacharacter issues +3. **Sandbox escapes** — if changes touch VFS, builtins, or process execution, verify they cannot escape the sandbox (see `specs/006-threat-model.md`) +4. **Resource exhaustion** — check for unbounded loops, unbounded allocations, or missing limits on user-controlled sizes +5. **Error handling** — ensure errors don't leak internal state, file paths, or sensitive information +6. **Unsafe code** — review any `unsafe` blocks for soundness; prefer safe alternatives + +If security issues are found, fix them, add regression tests, and update `specs/006-threat-model.md` if a new threat category is identified. + ### Phase 4: Smoke Testing Smoke test impacted functionality to verify it works end-to-end: @@ -103,6 +128,6 @@ After successful merge: ## Notes - This is the canonical shipping workflow. It implements the full "Shipping" definition and Pre-PR Checklist from AGENTS.md. -- Phases 2-4 (tests, artifacts, smoke testing) are the quality core — do NOT skip them. +- Phases 2-4 (tests, artifacts, simplification, security review, smoke testing) are the quality core — do NOT skip them. - The `$ARGUMENTS` context helps scope which tests, specs, and smoke tests are relevant. - For "fix and ship" requests: implement the fix first, then run `/ship` to validate and merge.