Feat/golden file acceptance testing #2

Merged
K1-R1 merged 3 commits into main from feat/golden-file-acceptance-testing
Mar 15, 2026

@K1-R1 K1-R1 commented Mar 15, 2026

This pull request introduces a comprehensive acceptance test runbook for smoosh, expands the golden test fixture repository to cover more scenarios, and improves test coverage and documentation around output correctness and secrets detection. The most important changes are grouped below.

Documentation and Testing Improvements

  • Added a detailed manual acceptance test runbook in test/ACCEPTANCE.md, covering interactive mode, remote repo processing, AI tool uploads, secrets detection, CI usage, and golden file regression scenarios.
  • Updated the test count in README.md from 198 to 228, reflecting the new tests and improved file inclusion verification.
  • Added documentation in CONTRIBUTING.md for golden file tests, including instructions for regenerating expected outputs and reviewing diffs for intentional changes.
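
The regenerate-then-review workflow mentioned above can be sketched roughly as follows. This is a minimal illustration, not the code in test/smoosh_golden.bats: the helper name `check_golden` and the `GOLDEN_DIR` variable are assumptions made for the example. Only `UPDATE_GOLDEN=1` and the test/golden/expected/ directory come from this PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from smoosh): compare actual output against a
# committed golden file, or overwrite the golden file when UPDATE_GOLDEN=1.
GOLDEN_DIR="${GOLDEN_DIR:-test/golden/expected}"

check_golden() {
  local name="$1" actual="$2"
  local expected="$GOLDEN_DIR/$name"

  if [ "${UPDATE_GOLDEN:-0}" = "1" ]; then
    mkdir -p "$GOLDEN_DIR"
    cp "$actual" "$expected"   # regenerate; review with git diff before committing
    return 0
  fi

  # cmp -s is silent and byte-exact; print a diff only on mismatch.
  if ! cmp -s "$expected" "$actual"; then
    diff -u "$expected" "$actual" || true
    return 1
  fi
}
```

With a helper of this shape, a normal run fails on any byte difference, while a run with `UPDATE_GOLDEN=1` rewrites the expected files so that the change shows up as a reviewable diff in version control.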

Golden Test Fixture Repository Expansion

  • Added diverse files to test/fixtures/golden-repo/ including code (app.py, main.go, index.js), docs (README.md, guide.rst, manual.adoc, journal.org, doc.md, it's-fine.md, cdata-break.md), config (Makefile, .env.example, .github/ci.yml), binary (logo.png), and secrets (aws-creds.py) to exercise all smoosh capabilities and edge cases.

Output Correctness and Cross-Platform Consistency

  • Improved add_line_numbers in smoosh to produce identical output with GNU and BSD nl: nl output is captured via $() (which strips trailing newlines) and emitted with printf '%s', so the caller's printf '\n' adds exactly one newline.
  • Changed file content processing in smoosh to use cat | sed instead of sed -- file, fixing compatibility with BSD sed, which does not support -- as end-of-options.
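
The newline fix relies on a standard shell property: command substitution strips all trailing newlines from the captured output. The sketch below illustrates the idea only. The function body and the `nl -ba` flags are assumptions, and the real add_line_numbers in smoosh may differ.

```shell
#!/usr/bin/env bash
# Sketch only: the real add_line_numbers in smoosh may use different nl flags.
add_line_numbers() {
  local numbered
  # $(...) strips every trailing newline, so the captured text is the same
  # whether or not nl appended a newline to the last line.
  numbered="$(nl -ba)"
  # '%s' prints the text without adding a newline; the caller adds exactly one.
  printf '%s' "$numbered"
}

# Content arrives on stdin (as with the cat | sed approach), so no filename
# has to be passed as an argument.
printf 'first\nsecond' | add_line_numbers   # input has no trailing newline
printf '\n'                                 # the caller's single newline
```

Because the function never emits a trailing newline itself, input with and without a final newline produces the same bytes once the caller appends its own.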

These changes collectively enhance the reliability, coverage, and maintainability of smoosh's test infrastructure and documentation.

K1-R1 added 3 commits March 15, 2026 12:25
Introduce a comprehensive golden file test suite that verifies smoosh
output is byte-for-byte correct across all modes, formats, and feature
combinations. Adds 17 new @test blocks (215 total, up from 198).

- test/fixtures/golden-repo/ — 19 fixture files (no .git/) covering
  docs, code, config, hidden files, edge cases (empty, no-newline,
  CDATA, deeply nested paths, secrets pattern, unicode-safe content)
- test/smoosh_golden.bats — golden test runner with normalise() to
  strip timestamps and temp paths, UPDATE_GOLDEN=1 to regenerate, and
  Bash 3.2-compatible chunked test using find + while loop
- test/golden/expected/ — 17 committed expected output files covering
  docs/code/all modes, text/xml/md formats, TOC, line numbers, filter
  flags (--only, --exclude, --include), chunking, hidden files, and
  the stdout/stderr outputs (--dry-run, --json, --quiet)
- test/ACCEPTANCE.md — manual test runbook for interactive mode,
  remote repo, NotebookLM, Claude Projects, ChatGPT, secrets
  detection, agent/CI usage, and golden file regression
- CONTRIBUTING.md — UPDATE_GOLDEN=1 workflow documented
- README.md — test count updated from 198 to 215
- .gitignore — negation rule for test/fixtures/golden-repo/.env.example

Signed-off-by: K1-R1 <77465250+K1-R1@users.noreply.github.com>

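
The commit above describes a normalise() step that strips timestamps and temp paths before comparison. The PR page does not show its body, so the following is only a guess at the general shape: the timestamp format, the placeholder strings, and the optional temp-root argument are all assumptions for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical normalise(): replaces volatile values with fixed placeholders
# so golden comparisons are stable across runs and machines.
# Assumed formats: "YYYY-MM-DD HH:MM:SS" or ISO 8601 timestamps, and paths
# under a temp root given as $1 (default: $TMPDIR or /tmp).
normalise() {
  local tmp_root="${1:-${TMPDIR:-/tmp}}"
  tmp_root="${tmp_root%/}"   # drop a trailing slash if present
  # Note: tmp_root is used as a regex, so unusual characters in it would
  # need escaping in a real implementation.
  sed -E \
    -e 's/[0-9]{4}-[0-9]{2}-[0-9]{2}[T ][0-9]{2}:[0-9]{2}:[0-9]{2}Z?/<TIMESTAMP>/g' \
    -e "s|${tmp_root}/[^[:space:]\"]*|<TMPPATH>|g"
}
```

A filter like this would sit between the smoosh output and the golden comparison, reading from stdin and writing the normalised text to stdout.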
Add 13 new golden tests (30 total) and 3 new fixture files to achieve
complete byte-level coverage of every deterministic smoosh code path.

Format × modifier matrix — fill every cell:
- docs-text-toc, docs-text-line-numbers, docs-text-toc-line-numbers
- docs-xml-line-numbers, docs-xml-toc-line-numbers

Output channel / special mode gaps:
- json-dry-run: distinct JSON structure (dry_run:true, files[], no chunks)
- code-json: secrets_excluded array populated (aws-creds.py flagged)
- no-check-secrets-code-md: aws-creds.py appears when scan is disabled
- chunked-text, chunked-xml: header/footer cycling in non-md formats
- chunked-quiet: multi-chunk quiet mode outputs one path per chunk
- chunked-json: chunks[] array with multiple entries

Filter edge case:
- exclude-multi-md: comma-separated --exclude patterns

New fixture files exercising edge cases:
- it's-fine.md: apostrophe in filename (BSD xargs regression guard)
- my notes.md: space in filename (path handling through pipeline)
- logo.png: 8-byte PNG header (MIME filter rejects in --all mode)
- Symlink (link-to-readme) created in setup_file() to test -L exclusion

All existing golden files regenerated to include the new fixture content.
228 tests, all passing.

Signed-off-by: K1-R1 <77465250+K1-R1@users.noreply.github.com>

Two BSD-vs-GNU portability bugs caused golden file tests to fail on
Linux while passing on macOS:

1. BSD sed does not support -- as end-of-options. It treats -- as a
   literal filename, fails to open it, then processes the real file.
   The || fallback then also runs, doubling file content in XML CDATA
   sections. GNU sed supports -- correctly but the doubled-content
   golden files (generated on macOS) mismatched. Fix: pipe through
   sed via cat instead of using sed -- file.

2. GNU nl adds a trailing newline to files without one; BSD nl does
   not. Combined with the unconditional printf '\n' in
   write_file_entry, this produced an extra blank line on Linux for
   files like no-newline.md. Fix: capture nl output via $() (which
   strips trailing newlines) and output with printf '%s', letting the
   caller's printf '\n' add exactly one.

Regenerated all 9 affected golden files.

Signed-off-by: K1-R1 <77465250+K1-R1@users.noreply.github.com>
@K1-R1 K1-R1 merged commit ca90edc into main Mar 15, 2026
5 checks passed
@K1-R1 K1-R1 deleted the feat/golden-file-acceptance-testing branch March 15, 2026 14:04