fix: restore GHA workflow, fix failing tests, add AGENTS.md #1

Merged: ryan-circleci merged 4 commits into main from fix-gha on Mar 19, 2026

ryan-circleci commented Mar 19, 2026

Summary

  • Restores .github/workflows/go.yml (removed during CircleCI migration) so the README build badge works again
  • Fixes all pre-existing test failures exposed by the restored CI workflow:
    • Updates log formatter golden files (testdata/empty.log, testdata/results.log) to include the Score column that was added to the log output format
    • Makes TestRunnerRun order-insensitive — runs within a provider execute in parallel, so result ordering is non-deterministic; tests now sort by (Run, Task, Got, Want) before comparing
    • Corrects the NewDefaultRunner doc comment to state that runs within a provider also execute in parallel (not sequentially, as previously documented)
  • Adds AGENTS.md with build/test commands, architecture overview, and a reminder to always target CircleCI-Research/evalbench (not the upstream fork petmal/MindTrial)
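
For readers unfamiliar with the pattern, the order-insensitive comparison described for TestRunnerRun can be sketched roughly as below. This is a minimal illustration, not the repository's code: the `Result` struct and the helper names `sortResults` and `equalUnordered` are hypothetical, and the real result type may carry different fields. Only the sort key (Run, Task, Got, Want) is taken from the summary above.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
)

// Result is a hypothetical stand-in for one evaluation result.
// The actual type in the repository may differ.
type Result struct {
	Run  string
	Task string
	Got  string
	Want string
}

// sortResults orders results by (Run, Task, Got, Want) so that slices
// holding the same elements end up in the same order.
func sortResults(rs []Result) {
	sort.Slice(rs, func(i, j int) bool {
		a, b := rs[i], rs[j]
		if a.Run != b.Run {
			return a.Run < b.Run
		}
		if a.Task != b.Task {
			return a.Task < b.Task
		}
		if a.Got != b.Got {
			return a.Got < b.Got
		}
		return a.Want < b.Want
	})
}

// equalUnordered reports whether got and want contain the same results,
// ignoring order. It sorts copies so the callers' slices are not mutated.
func equalUnordered(got, want []Result) bool {
	g := append([]Result(nil), got...)
	w := append([]Result(nil), want...)
	sortResults(g)
	sortResults(w)
	return reflect.DeepEqual(g, w)
}

func main() {
	// Same results, delivered in a different order.
	got := []Result{
		{Run: "run-2", Task: "task-a", Got: "x", Want: "x"},
		{Run: "run-1", Task: "task-a", Got: "y", Want: "z"},
	}
	want := []Result{
		{Run: "run-1", Task: "task-a", Got: "y", Want: "z"},
		{Run: "run-2", Task: "task-a", Got: "x", Want: "x"},
	}
	fmt.Println(equalUnordered(got, want)) // prints "true"
}
```

Sorting both sides before comparing keeps the assertion strict about content while tolerating whatever order the parallel runs finish in.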

Test plan

  • All tests pass locally (go test -tags=test -race ./...)
  • Verify the GHA workflow runs green after merge to main
  • Confirm the build badge in the README turns green

🤖 Generated with Claude Code

ryan-circleci and others added 4 commits March 19, 2026 10:44
The workflow was removed during the CircleCI migration but the README
badge still references it, causing a broken badge on the repo.

Made-with: Cursor
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…golden files

Runs within a single provider were incorrectly launched in parallel
goroutines, causing non-deterministic result ordering that violated the
documented contract ("individual runs on a single provider are executed
sequentially") and broke TestRunnerRun assertions.

Also updates log formatter golden files to include the Score column
added to the log output format.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…sitive

Reverts the sequential run change — runs within a provider intentionally
execute in parallel for throughput. Updates the doc comment to match.

Fixes TestRunnerRun by sorting results by (Run, Task, Got, Want) before
comparing, since parallel execution produces non-deterministic ordering.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ryan-circleci changed the title from "fix: restore GitHub Actions workflow for build badge" to "fix: restore GHA workflow, fix failing tests, add AGENTS.md" on Mar 19, 2026
ryan-circleci merged commit 64b9f02 into main on Mar 19, 2026
1 check passed
ryan-circleci deleted the fix-gha branch on March 19, 2026 15:10