test(skills): integration tests for real-world skills.sh scripts by chaliy · Pull Request #292 · everruns/bashkit

chaliy · 2026-02-26T04:41:36Z

Summary

Add 10 real bash scripts from top skills.sh repos as test fixtures
Parse + execute them through bashkit with stubbed external binaries (az, helm, npm, curl, python3)
10/10 parse, 6/10 execute, 4 ignored with tracked bugs

What's tested

| Script | Source | Bash features |
|---|---|---|
| azure_discover_rank.sh | microsoft/github-copilot-for-azure | `declare -A`, `${!MAP[@]}`, jq pipes, `set -euo pipefail` |
| helm_validate_chart.sh | wshobson/agents | functions, `command -v`, `grep -q`, `awk`, echo -e ANSI |
| jwt_test_setup.sh | giuseppe-trisciuoglio/developer-kit | `${var: -3}` substring, `local`, `trap EXIT`, `curl -w` |
| stitch_fetch.sh | nichochar/stitch-skills | curl wrapper, `$?` check |
| stitch_download_asset.sh | nichochar/stitch-skills | `dirname`, `mkdir -p`, `command -v`, `stat` |
| find_polluter.sh | nichochar/superpowers | `for`/`$(find)`, `$(( ))`, `wc

Bugs found

#289: backslash line continuation (backslash followed by newline) fails in some parser contexts
#290: while/case arg parsing loop hits MaxLoopIterations on a 5-iteration loop
#291: [ -f ] doesn't see VFS files after cd in script execution

Test plan

cargo test --test skills_tests — 16 pass, 0 fail, 4 ignored
cargo test --test spec_tests — existing tests unaffected
cargo clippy -- -D warnings clean
cargo fmt --check clean
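
The "4 ignored" count in the test plan is what Rust's built-in `#[ignore]` attribute produces. The sketch below shows the general pattern of parking a known-failing test behind a reason string that names its tracking issue. `run_fixture` and both test names are hypothetical stand-ins, not code from this PR, and the "continuation fails" behavior is simulated purely for illustration.

```rust
/// Hypothetical stand-in for executing a fixture; returns an exit code.
/// ASSUMPTION: a real test would pass the script to the interpreter under test.
fn run_fixture(script: &str) -> i32 {
    // Pretend that backslash line continuations are unsupported,
    // so any script containing one "fails" with status 1.
    if script.lines().any(|line| line.ends_with('\\')) {
        1
    } else {
        0
    }
}

#[cfg(test)]
mod tests {
    use super::run_fixture;

    #[test]
    fn exec_simple_script() {
        assert_eq!(run_fixture("echo hello"), 0);
    }

    // Skipped by default and reported as "ignored"; the reason string
    // points at the tracking issue so the gap stays visible.
    #[test]
    #[ignore = "blocked by a tracked bug (e.g. #289)"]
    fn exec_script_with_line_continuation() {
        assert_eq!(run_fixture("curl \\\n  -s http://example.invalid"), 0);
    }
}

fn main() {
    println!("simple: {}", run_fixture("echo hello"));
    println!("continuation: {}", run_fixture("echo a \\\n  b"));
}
```

Once the underlying bug is fixed, `cargo test -- --ignored` runs only the ignored tests, which is a quick way to check whether the attribute can be dropped.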

Analyze skills from skills.sh leaderboard across 12 repos to assess bash feature coverage.

Key findings:
- 66% are pure markdown (no scripts needed)
- 97%+ of bash features used are supported by bashkit
- Main gap is external binaries (LibreOffice, az CLI, etc.)
- Only missing builtins: base64, curl -F multipart

https://claude.ai/code/session_01CVF1zwHgALVKQnDrTBie9o

Key discoveries:
- 250 leaderboard entries map to ~80 unique skills from ~25 repos (google-stitch: 72 entries → 6 skills; baoyu: 75 entries → 16 skills)
- 63% pure markdown, 18% bash scripts, 14% TypeScript, 15% Python
- Bash feature coverage: effectively 100% for all scripts observed
- New pattern: TypeScript via `npx -y bun` (baoyu-skills, 97 .ts files)
- New pattern: SKILL.md lifecycle hooks with bash (planning-with-files)
- Missing builtins: base64, curl -F multipart, sed -i

https://claude.ai/code/session_01CVF1zwHgALVKQnDrTBie9o

sed -i is fully implemented (sed.rs:216-217, all 75 tests pass). Removed from gaps list. Added note clarifying this.

Issues filed:
- #287: base64 builtin missing
- #288: curl -F multipart support missing

https://claude.ai/code/session_01CVF1zwHgALVKQnDrTBie9o

Extract 10 bash scripts from top skills.sh repos and run them through bashkit parser + interpreter with stubbed external binaries (az, helm, npm, curl, python3).

Results: 10/10 parse, 6/10 execute, 4 ignored with tracked bugs.

Parse tests verify every fixture parses cleanly. Execution tests use custom builtins (BashBuilder::builtin) to mock az CLI, helm, npm, curl etc. so we test bash feature coverage without real infrastructure.

Bugs found and filed:
- #289: backslash line continuation fails in some parser contexts
- #290: while/case arg parsing loop hits MaxLoopIterations
- #291: [ -f ] doesn't see VFS files after cd in script execution

Scripts sourced from:
- microsoft/github-copilot-for-azure (azure_*.sh)
- vercel-labs/agent-skills (vercel_deploy.sh)
- google-labs-code/stitch-skills (stitch_*.sh)
- obra/superpowers (find_polluter.sh)
- wshobson/agents (helm_validate_chart.sh)
- giuseppe-trisciuoglio/developer-kit (jwt_test_setup.sh)

https://claude.ai/code/session_01CVF1zwHgALVKQnDrTBie9o
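
The commit above mocks external binaries through `BashBuilder::builtin`, but the signature of that method is not shown here. The sketch below therefore illustrates only the general technique (dispatching a command name to a canned handler instead of a real process) using a hypothetical `StubRegistry`. Every type, function name, and canned output in it is an assumption for illustration, not bashkit's API.

```rust
use std::collections::HashMap;

/// Result of running a stubbed command.
#[derive(Debug, PartialEq)]
struct StubOutput {
    exit_code: i32,
    stdout: String,
}

/// A stub receives the command's arguments and returns canned output.
type Stub = fn(&[&str]) -> StubOutput;

/// Hypothetical registry keyed by binary name (NOT bashkit's API).
struct StubRegistry {
    stubs: HashMap<&'static str, Stub>,
}

impl StubRegistry {
    fn new() -> Self {
        Self { stubs: HashMap::new() }
    }

    /// Register a stub under the name the script will invoke.
    fn register(mut self, name: &'static str, stub: Stub) -> Self {
        self.stubs.insert(name, stub);
        self
    }

    /// Dispatch by name; unknown commands get 127, the shell's
    /// conventional "command not found" status.
    fn run(&self, name: &str, args: &[&str]) -> StubOutput {
        match self.stubs.get(name) {
            Some(stub) => stub(args),
            None => StubOutput { exit_code: 127, stdout: String::new() },
        }
    }
}

/// Fake `helm`: only `helm lint ...` succeeds.
fn fake_helm(args: &[&str]) -> StubOutput {
    match args.first() {
        Some(&"lint") => StubOutput { exit_code: 0, stdout: "lint ok\n".to_string() },
        _ => StubOutput { exit_code: 1, stdout: String::new() },
    }
}

/// Fake `curl`: always returns a fixed JSON body.
fn fake_curl(_args: &[&str]) -> StubOutput {
    StubOutput { exit_code: 0, stdout: "{\"status\":\"ok\"}\n".to_string() }
}

fn demo_registry() -> StubRegistry {
    StubRegistry::new()
        .register("helm", fake_helm)
        .register("curl", fake_curl)
}

fn main() {
    let registry = demo_registry();
    let out = registry.run("helm", &["lint", "./chart"]);
    println!("helm exit={} stdout={:?}", out.exit_code, out.stdout);
    let missing = registry.run("kubectl", &["get", "pods"]);
    println!("kubectl exit={}", missing.exit_code);
}
```

Keeping stubs as plain functions makes each fake binary easy to read next to the fixture that calls it, and the 127 fallback means a script that invokes an unstubbed tool fails loudly rather than silently passing.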

Drop specs/015-skills-analysis.md (pure analysis doc). The value lives in the tests themselves — skills_tests.rs now has a full source table linking each fixture to its upstream repo. Also: fix clippy unused import, apply cargo fmt. https://claude.ai/code/session_01CVF1zwHgALVKQnDrTBie9o

claude added 5 commits February 26, 2026 01:54

chaliy merged commit d770547 into main Feb 26, 2026
16 checks passed

chaliy deleted the claude/analyze-skills-bash-MvtoF branch February 26, 2026 04:48
