
fix: stabilize E2E tests against Bun pipe buffering and LLM non-determinism #605

Merged
FL4TLiN3 merged 3 commits into main from fix/e2e-test-stabilization
Feb 23, 2026

@FL4TLiN3
Contributor

Summary

  • Fix stdout capture in E2E test runner by redirecting to temp file instead of pipe, working around Bun's pipe buffering that truncated large outputs (6.3MB → 1.5MB)
  • Rewrite NDJSON event parser with boundary grouping to handle base64 data containing literal newlines inside JSON strings
  • Relax event sequence assertions in PDF/image tests to tolerate the LLM batching tool calls into fewer turns
  • Fix dynamic skills test: use the local binary (bun ./apps/base/dist/bin/server.js) instead of npx @perstack/base, which failed with "could not determine executable to run" because addSkill bypasses the bundled base optimization
  • Add runCliUntilToolCalled retry helper for tests sensitive to LLM non-determinism
  • Strengthen expert instructions in multi-modal and skills TOML configs
  • Add explicit preservation requirement for existing experts in create-expert definition-writer
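
The boundary-grouping idea from the NDJSON bullet above can be sketched roughly as follows. This is not the parser from this PR: the CliEvent shape, the parseEvents name, and the assumption that every event line begins with {"type": are all hypothetical, chosen only to illustrate the technique. Lines that do not start a new event are treated as continuations of the previous one and re-joined with an escaped newline, so the group becomes valid JSON again.

```typescript
// Hypothetical event shape; the real event types are not shown in this PR.
interface CliEvent {
  type: string;
  [key: string]: unknown;
}

// Assumption: every event is serialized as an object whose first key is "type".
// Base64 text never contains '{' or '"', so a continuation line of base64
// cannot be mistaken for a boundary.
const EVENT_START = /^\{"type":/;

function parseEvents(raw: string): CliEvent[] {
  // Pass 1: group physical lines by event boundary.
  const groups: string[][] = [];
  let current: string[] | null = null;
  for (const line of raw.split("\n")) {
    if (EVENT_START.test(line)) {
      current = [line];
      groups.push(current);
    } else if (current !== null && line.trim() !== "") {
      // Continuation of the previous event (a literal newline split it).
      // Blank lines are dropped, which is lossy but harmless for base64.
      current.push(line);
    }
    // Lines before the first boundary (e.g. log output) are ignored.
  }

  // Pass 2: re-join each group with an escaped newline and parse it.
  const events: CliEvent[] = [];
  for (const group of groups) {
    try {
      events.push(JSON.parse(group.join("\\n")) as CliEvent);
    } catch {
      // Skip groups that still fail to parse, e.g. a truncated final record.
    }
  }
  return events;
}
```

Joining with the two-character sequence backslash-n keeps the newline in the decoded string value. A parser that only needs the base64 payload could join with an empty string instead.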

Test plan

  • run.test.ts — 4/4 pass, 3 consecutive green runs
  • skills.test.ts — 6/6 pass, 3 consecutive green runs
  • create-expert.test.ts — 2/2 pass, 3 consecutive green runs

🤖 Generated with Claude Code

FL4TLiN3 and others added 3 commits February 23, 2026 15:34
…minism

Fix stdout capture in test runner by redirecting to temp file instead of
pipe, working around Bun's pipe buffering that truncated large outputs
(6.3MB → 1.5MB). Rewrite event parser with boundary grouping to handle
base64 data containing literal newlines. Relax event sequence assertions
in PDF/image tests, strengthen expert instructions, and fix dynamic
skills test to use local binary instead of npx (which failed to resolve
the monorepo package).
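
As an illustration of the temp-file redirection described above, here is a minimal sketch using the node:child_process API (which Bun also provides). The runWithFileStdout name and the CliResult shape are hypothetical and not taken from this PR. The child's stdout is bound to a file descriptor rather than a pipe, and the file is read back only after the process has closed.

```typescript
import { spawn } from "node:child_process";
import { closeSync, mkdtempSync, openSync, readFileSync, rmSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical result shape for this sketch.
interface CliResult {
  exitCode: number;
  stdout: string;
}

function runWithFileStdout(command: string, args: string[]): Promise<CliResult> {
  // Each run gets its own temp directory so parallel tests do not collide.
  const dir = mkdtempSync(join(tmpdir(), "e2e-stdout-"));
  const outPath = join(dir, "stdout.log");
  const fd = openSync(outPath, "w");

  return new Promise<CliResult>((resolve, reject) => {
    let settled = false;
    // "error" and "close" can both fire for one child, so settle only once.
    const finish = (err: Error | null, code: number | null): void => {
      if (settled) return;
      settled = true;
      try {
        closeSync(fd);
        if (err !== null) {
          reject(err);
          return;
        }
        // The child has exited, so the file holds its complete output.
        resolve({ exitCode: code ?? -1, stdout: readFileSync(outPath, "utf8") });
      } catch (e) {
        reject(e);
      } finally {
        rmSync(dir, { recursive: true, force: true });
      }
    };

    // stdout goes straight to the file descriptor; no pipe is involved.
    const child = spawn(command, args, { stdio: ["ignore", fd, "inherit"] });
    child.on("error", (err) => finish(err, null));
    child.on("close", (code) => finish(null, code));
  });
}
```

Reading the file after the close event trades streaming access for completeness, which suits a test runner that only inspects output once the CLI has finished.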

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
FL4TLiN3 merged commit 46e511e into main Feb 23, 2026
11 checks passed
FL4TLiN3 mentioned this pull request Feb 23, 2026
FL4TLiN3 deleted the fix/e2e-test-stabilization branch February 25, 2026 13:37