fix(eval): ground LLM judge with command reference to prevent false negatives #3284
ci.yml
on: pull_request
Annotations
1 warning
Unit Tests
Patch coverage defaulted to 100% because no changed files matched coverage data.
Unmatched diff files: .github/workflows/ci.yml, .github/workflows/eval-skill-fork.yml, AGENTS.md, script/eval-skill.ts, test/e2e/skill-eval.test.ts, test/skill-eval/helpers/judge.ts, test/skill-eval/helpers/verify.ts
Sample coverage paths: src/app.ts, src/cli.ts, src/commands/api.ts
This usually indicates a path format mismatch between your coverage tool and the repository.
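One way to check this warning by hand is to normalize both path lists and intersect them. The sketch below is illustrative only and is not the coverage action's actual logic: `normalizePath` and `partitionByCoverage` are hypothetical helpers, and the coverage list contains only the three sample paths quoted in the annotation, not the full coverage data.

```typescript
// Hypothetical helper: bring a path into a repo-relative, forward-slash form
// so that diff paths and coverage paths can be compared directly.
function normalizePath(p: string): string {
  return p.replace(/\\/g, "/").replace(/^\.\//, "").replace(/^\/+/, "");
}

// Hypothetical helper: split changed files into those that appear in the
// coverage data and those that do not.
function partitionByCoverage(
  diffFiles: string[],
  coveragePaths: string[],
): { matched: string[]; unmatched: string[] } {
  const covered = new Set(coveragePaths.map(normalizePath));
  const matched: string[] = [];
  const unmatched: string[] = [];
  for (const file of diffFiles) {
    (covered.has(normalizePath(file)) ? matched : unmatched).push(file);
  }
  return { matched, unmatched };
}

// Changed files, copied from the annotation above.
const diffFiles = [
  ".github/workflows/ci.yml",
  ".github/workflows/eval-skill-fork.yml",
  "AGENTS.md",
  "script/eval-skill.ts",
  "test/e2e/skill-eval.test.ts",
  "test/skill-eval/helpers/judge.ts",
  "test/skill-eval/helpers/verify.ts",
];

// Assumption: only the sample coverage paths from the annotation are known.
const coveragePaths = ["src/app.ts", "src/cli.ts", "src/commands/api.ts"];

const result = partitionByCoverage(diffFiles, coveragePaths);
console.log(result.matched.length, result.unmatched.length);
```

With these inputs every changed file lands in `unmatched`. The sample coverage paths all sit under `src/`, while the changed files are workflows, docs, scripts, and tests, so the warning may reflect files outside the covered tree rather than a path format problem.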
Artifacts
Produced during runtime
| Name | Size | Digest |
|---|---|---|
| codecov-coverage-results-fix-eval-skill-judge-context-test-unit | 169 KB | sha256:cfbb478af75a3b66aff7d69bccfcb87ba0704144987ce1ab2c5b0850385f7983 |
| codecov-test-results-fix-eval-skill-judge-context-test-unit | 227 Bytes | sha256:e4f0731f66ca0b7832b47206c64fba646899e25c1853da04d144824fef5f018b |
| gh-pages | 1.79 MB | sha256:0440bdd878467fd3b4735690e0e3ab2f4ab7f8e17facc6b6f27c0d0b89ed0eff |
| npm-package | 893 KB | sha256:c1ba028581da678f6fd713984e3911c7332bbb1c82ced1a3b54b47cc38c95f3f |
| sentry-linux-x64 | 30.1 MB | sha256:80adcce20ba40d248f779ae15152e4332f61d0d4f31ea16c90e197a83d4eb16f |