test: mutation hardening cycle — 72.19% to 78.14% (+5.95pp) by cmbays · Pull Request #386 · cmbays/kata

cmbays · 2026-03-16T13:45:58Z

Summary

Mutation score: 72.19% -> 78.14% (+5.95 percentage points)
Added 5 test files to the mutation test group (vitest.test-groups.ts)
Wrote targeted tests to kill 64+ surviving mutants across 5 hotspot files
Zero new production code — all changes are test additions

Per-file improvements

File	Before	After	Delta
session-bridge.ts	74.68%	84.09%	+9.41pp
cooldown-session.ts	65.70%	74.88%	+9.18pp
execute.ts	69.21%	72.69%	+3.48pp
observability-aggregator.ts	96.43%	100.00%	+3.57pp
kata-agent (overall)	92.68%	95.12%	+2.44pp
workflow-runner.ts	83.91%	83.91%	+0.00pp

Key changes

vitest.test-groups.ts: Added unit and helper test files to mutation test group
session-bridge.unit.test.ts: 15 new tests targeting budget estimation, named kata stage resolution, prepareCycle dedup guards, formatDuration, readBridgeRunMeta null path, updateRunJson guards, completeCycle filter
cooldown-session.unit.test.ts: 8 new tests targeting diary guard, betDescription fallback, force defaults, observation collection, autoSync filtering, listJsonFiles filter
execute.test.ts: 10 new tests targeting plain-text context output, pipeline print content, gyo trim, cycle state filter, yolo flag, cycle completion plain-text, no-token output
workflow-runner.test.ts: 2 new tests for listRecentArtifacts non-json filtering and persistArtifact dir creation
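
As a rough illustration of what "targeting" a mutant means for the non-json filter tests listed above, the sketch below hand-writes a few mutants of a simple filter and checks that one mixed-input assertion rejects all of them. The `keepJsonFiles` function and the mutant names are hypothetical stand-ins, not code from this repository, and the real tests presumably use vitest rather than plain booleans.

```typescript
// Hypothetical stand-in for a helper such as listJsonFiles: keep only *.json names.
type NameFilter = (names: string[]) => string[];

const keepJsonFiles: NameFilter = (names) =>
  names.filter((name) => name.endsWith(".json"));

// Hand-written versions of mutants a mutation testing tool could generate.
const handWrittenMutants: Record<string, NameFilter> = {
  conditionAlwaysTrue: (names) => names.filter(() => true),
  emptyStringLiteral: (names) => names.filter((name) => name.endsWith("")),
  startsWithInsteadOfEndsWith: (names) =>
    names.filter((name) => name.startsWith(".json")),
};

// A targeted test: mixed input and an exact expected result.
function filterTestPasses(filter: NameFilter): boolean {
  const result = filter(["run.json", "notes.txt", ".json.bak"]);
  return result.length === 1 && result[0] === "run.json";
}

// The original must pass; every mutant that still passes is a survivor.
const originalPasses = filterTestPasses(keepJsonFiles);
const survivingMutants = Object.entries(handWrittenMutants)
  .filter(([, mutant]) => filterTestPasses(mutant))
  .map(([name]) => name);

console.log({ originalPasses, survivingMutants });
```

The `.json.bak` entry is what separates the `startsWith` mutant from the original; an input containing only `run.json` and `notes.txt` would reject the first two mutants by length alone but would not exercise that difference as directly.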

Remaining survivors (diminishing returns)

execute.ts: Mostly StringLiteral description mutations (~80 of 107)
cooldown-session.ts: Catch-block logger.warn NoCoverage (19)
session-bridge.ts: existsSync early-return guards (22)
kata-agent-confidence-calculator.ts: 2 BooleanLiteral on recursive mkdir
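
One plausible reason the StringLiteral description mutants fall under diminishing returns: Stryker's StringLiteral mutator replaces a non-empty string with an empty one, and a test that only checks command behaviour never observes the description. The sketch below uses a hypothetical `CommandSpec` shape, not the actual structure of execute.ts, to show the difference between the two kinds of assertion.

```typescript
// Hypothetical command shape for illustration; not the real execute.ts API.
interface CommandSpec {
  name: string;
  description: string;
  run: (args: string[]) => string;
}

const makeStatusCommand = (description: string): CommandSpec => ({
  name: "status",
  description,
  run: (args) => (args.includes("--json") ? '{"ok":true}' : "ok"),
});

const originalCmd = makeStatusCommand("Show cycle status");
// What a StringLiteral mutation of the description would look like.
const mutantCmd = makeStatusCommand("");

// Behavioural test: inspects output only, so it cannot see the description.
const behaviourTest = (cmd: CommandSpec): boolean =>
  cmd.run([]) === "ok" && cmd.run(["--json"]) === '{"ok":true}';

// Help-text test: asserts on the description itself, which kills the mutant.
const helpTextTest = (cmd: CommandSpec): boolean => cmd.description.length > 0;

console.log({
  mutantSurvivesBehaviourTest: behaviourTest(mutantCmd),
  mutantSurvivesHelpTextTest: helpTextTest(mutantCmd),
});
```

Killing each such mutant means one assertion per description string, which is why a large block of them tends to be left alone or excluded by configuration.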

Test plan

npm run typecheck passes
npm run lint passes
npm test (unit + integration) — 3284+ tests pass
npm run test:mutation — 78.14% (above 70% break threshold)
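
For readers unfamiliar with the break threshold in the last step, here is a minimal sketch of how a Stryker-style mutation score relates to it: detected mutants (killed plus timed out) divided by valid mutants (detected plus survived plus not covered), with the run failing when the score drops below the break value. The function names and the counts in the usage line are made up for illustration and are not taken from this PR's report.

```typescript
interface MutantCounts {
  killed: number;
  timeout: number;
  survived: number;
  noCoverage: number;
}

// Score as a percentage of valid mutants that were detected.
// Returning 0 when there are no valid mutants is a choice made for this sketch.
function mutationScore(counts: MutantCounts): number {
  const detected = counts.killed + counts.timeout;
  const valid = detected + counts.survived + counts.noCoverage;
  return valid === 0 ? 0 : (detected * 100) / valid;
}

// The run is considered failed when the score is below the break threshold.
function meetsBreakThreshold(score: number, breakAt: number): boolean {
  return score >= breakAt;
}

// Illustrative counts only.
const exampleScore = mutationScore({ killed: 7, timeout: 1, survived: 1, noCoverage: 1 });
console.log(exampleScore, meetsBreakThreshold(exampleScore, 70));
```

Because NoCoverage mutants count against the score in the same way as survivors, adding test files that merely execute previously uncovered code can raise the score only if those tests also assert on the affected behaviour.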

Summary by CodeRabbit

Tests
- Expanded test coverage for CLI command execution, cycle management, workflow execution, and session management
- Added tests for plain-text output rendering, whitespace handling, artifact filtering, and edge case scenarios
- Strengthened regression safeguards across core features

…Cycle guards) Add targeted tests for estimateBudgetUsage, countJsonlLines, countRunData, prepareCycle dedup and runId sync, resolveStages named kata, writeCycleNameIfChanged, formatDuration, updateCycleState no-op, and collectCycleCompletionTotals. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The mutation test config was only running integration tests, missing the unit and helpers test files that provide direct coverage of extracted pure functions. Adding them raises visibility for Stryker. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
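
A minimal sketch of what the change to vitest.test-groups.ts might look like, assuming the file holds a flat list of test paths named `mutationTestFiles`. The split into two arrays, the directories of the helpers files, and which files were already in the group before this PR are all assumptions; only the export name and the unit test paths appear elsewhere on this page. In the real file the list would be exported.

```typescript
// Assumed to have been in the mutation group before this change.
const previouslyIncluded: string[] = [
  "src/cli/commands/execute.test.ts",
  "src/features/execute/workflow-runner.test.ts",
];

// The five additions described in the PR; helpers paths are guessed from the unit test paths.
const addedUnitAndHelperTests: string[] = [
  "src/cli/commands/execute.helpers.test.ts",
  "src/features/cycle-management/cooldown-session.unit.test.ts",
  "src/features/cycle-management/cooldown-session.helpers.test.ts",
  "src/infrastructure/execution/session-bridge.unit.test.ts",
  "src/infrastructure/execution/session-bridge.helpers.test.ts",
];

// The combined list that the mutation test run would consume.
const mutationTestFiles: string[] = [
  ...previouslyIncluded,
  ...addedUnitAndHelperTests,
];

console.log(mutationTestFiles.length);
```

Listing the unit and helpers files matters because a mutation run only counts a mutant as killed when one of the tests it actually executes fails; tests that exist but are outside the group contribute nothing to the score.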

…c, observation filter) Target enrichBetOutcomesWithDescriptions fallback, force=false defaults, collectSynthesisObservations skip logic, autoSyncBetOutcomes filtering, and listJsonFiles non-json exclusion. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…ontent, gyo trim) Add plain-text context output test, single-stage and pipeline print content assertions, gyo whitespace trimming test, and listSavedKatas non-json filter test. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…nJson guards, completeCycle filter) Target existsSync early-return guards in updateRunJsonOnComplete, updateRunJsonAgentAttribution, readBridgeRunMeta, and the null filter in collectCycleCompletionTotals. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
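
Early-return guards of this kind can be hard to kill when a later catch block produces the same result as the guard. The sketch below shows one way a test could still tell the difference, by checking that no read is attempted for a missing file. `readRunMeta` and the `FileAccess` interface are hypothetical; the real functions call `existsSync` directly and may be tested differently.

```typescript
// Hypothetical file access seam so the example needs no real file system.
interface FileAccess {
  exists(path: string): boolean;
  read(path: string): string;
}

// Hypothetical reader with an early-return guard followed by a defensive catch.
function readRunMeta(path: string, files: FileAccess): { runId: string } | null {
  if (!files.exists(path)) return null; // the guard under test
  try {
    return JSON.parse(files.read(path)) as { runId: string };
  } catch {
    return null; // same result as the guard, which hides a removed guard
  }
}

// Fake that records every read so a test can assert the guard short-circuits.
function makeFakeFiles(present: Map<string, string>) {
  const reads: string[] = [];
  const files: FileAccess = {
    exists: (path) => present.has(path),
    read: (path) => {
      reads.push(path);
      const content = present.get(path);
      if (content === undefined) throw new Error("ENOENT: " + path);
      return content;
    },
  };
  return { files, reads };
}

const fake = makeFakeFiles(new Map([["run.json", '{"runId":"r1"}']]));
const missingResult = readRunMeta("missing.json", fake.files);
const readsAfterMissing = fake.reads.length;
const presentResult = readRunMeta("run.json", fake.files);

console.log({ missingResult, readsAfterMissing, presentResult });
```

Asserting only that the result is `null` would pass with or without the guard in this sketch; the assertion on the recorded reads is what distinguishes the two.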

…gory override) Add tests for active-vs-planning cycle selection, cycle name fallback, explicit category override with --next, and yolo flag propagation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…sistArtifact dir creation) Add tests for non-json file filtering in listRecentArtifacts and artifacts directory creation on demand in persistArtifact. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Kill NoCoverage mutants in the cycle --complete plain-text output path. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…oken output) Add test for complete subcommand plain-text output without token usage, verifying the no-token output path is exercised. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>


coderabbitai · 2026-03-16T13:46:15Z

Caution

Review failed

Pull request was closed or merged during review

📝 Walkthrough


This PR adds extensive test coverage across multiple test files without modifying any production code or public APIs. The tests cover plain-text rendering, cycle management, artifact filtering, JSON file handling, and various edge cases in the execute, cooldown-session, workflow-runner, and session-bridge features. The test-group configuration file is updated to include the new test files in mutation testing.

Changes

Cohort / File(s)	Summary
Execute Command Tests `src/cli/commands/execute.test.ts`	Adds tests for plain-text rendering (cycle completion, context, pipeline stages), yolo flag behavior, next-subcommand resolution, cycle selection logic, whitespace handling, and non-JSON file filtering in kata listings.
Cooldown Session Tests `src/features/cycle-management/cooldown-session.unit.test.ts`	Adds comprehensive tests for bet outcome enrichment, run/prepare warnings, synthesis observation collection, bridge-run synchronization, JSON file filtering, and various edge cases around incomplete runs and diary writing.
Workflow Runner Tests `src/features/execute/workflow-runner.test.ts`	Adds tests for JSON artifact filtering in listRecentArtifacts and dynamic artifacts directory creation during persistArtifact operations.
Session Bridge Tests `src/infrastructure/execution/session-bridge.unit.test.ts`	Adds tests for budget estimation, cycle preparation deduplication, stage resolution, bridge metadata handling, cycle completion aggregation, and error/warning scenarios with missing or invalid files.
Test Configuration `vitest.test-groups.ts`	Updates mutationTestFiles export to include five new test file paths (execute.helpers.test.ts, cooldown-session unit and helpers tests, session-bridge unit and helpers tests).

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

PR #91: Overlaps with test additions to execute.test.ts and WorkflowRunner artifact-listing/persisting behaviors.
PR #383: Directly related through test coverage of execute helpers, execute orchestration, and cooldown-session functionality changes.
PR #382: Shares focus on session-bridge cycle-completion logic and verification of cycle completion totals.


🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main objective: a mutation testing hardening effort that increased coverage from 72.19% to 78.14%, with all changes being test additions across multiple files.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.


cmbays and others added 9 commits March 16, 2026 08:06

test: add plain-text cycle completion output test

1966c63


test: kill execute.ts NoCoverage survivors (complete plain-text, no-t…

ae43cda


cmbays merged commit 05bd284 into main Mar 16, 2026
2 of 3 checks passed

cmbays deleted the worktree-rosy-twirling-petal branch March 16, 2026 13:50

coderabbitai bot mentioned this pull request Mar 16, 2026

test: mutation hardening cycle — 79.91% to 84.77% (+4.86pp) #388

Merged

4 tasks

cmbays mentioned this pull request Mar 16, 2026

test: final mutation hardening — 84.77% to 90.94% (+6.17pp) #389

Merged

6 tasks

coderabbitai bot mentioned this pull request Mar 17, 2026

refactor: staff engineer cleanup — inline thin wrappers, colocate tests #391

Merged

5 tasks
