fix(eval): ground LLM judge with command reference to prevent false negatives · getsentry/cli@5977323

Patch coverage defaulted to 100% because no changed files matched coverage data.

Unmatched diff files: .github/workflows/ci.yml, .github/workflows/eval-skill-fork.yml, AGENTS.md, script/eval-skill.ts, test/e2e/skill-eval.test.ts, test/skill-eval/helpers/judge.ts, test/skill-eval/helpers/verify.ts

Sample coverage paths: src/app.ts, src/cli.ts, src/commands/api.ts

This usually indicates a path format mismatch between your coverage tool and the repository.
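To illustrate the check that this message describes, here is a minimal sketch of comparing changed files against the paths in a coverage report. The function names (`normalize`, `unmatchedDiffFiles`) and the matching logic are hypothetical, written only for illustration; they are not taken from the coverage tool or from this repository. The file lists are the ones quoted in the message above, shortened.

```typescript
// Hypothetical sketch: not the coverage tool's actual implementation.

// Normalize separators and drop a leading "./" so equivalent paths compare equal.
function normalize(path: string): string {
  return path.replace(/\\/g, "/").replace(/^\.\//, "");
}

// Return the changed files that have no entry in the coverage data.
function unmatchedDiffFiles(diffFiles: string[], coveragePaths: string[]): string[] {
  const covered = new Set(coveragePaths.map(normalize));
  return diffFiles.filter((file) => !covered.has(normalize(file)));
}

// A subset of the paths listed in the message.
const diffFiles: string[] = [
  ".github/workflows/ci.yml",
  "script/eval-skill.ts",
  "test/skill-eval/helpers/judge.ts",
];
const coveragePaths: string[] = ["src/app.ts", "src/cli.ts", "src/commands/api.ts"];

const unmatched = unmatchedDiffFiles(diffFiles, coveragePaths);

// When every changed file is unmatched, there are no lines to measure,
// which is the situation in which the message reports a default of 100%.
const noPatchLinesToMeasure = unmatched.length === diffFiles.length;
console.log(unmatched, noPatchLinesToMeasure);
```

In this run every changed file sits outside `src/`, while the sample coverage paths are all under `src/`, so under the sketch's assumptions none of the changed files would match.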

Artifacts

Produced during runtime

Name	Size	Digest
codecov-coverage-results-fix-eval-skill-judge-context-test-unit	169 KB	`sha256:2ecc97df550dd5a3d75f91352ade6c84cdd5ad233c16a5b3b0bbaddf87038226`
codecov-test-results-fix-eval-skill-judge-context-test-unit	227 Bytes	`sha256:dee043434cf08fc52116b8ada461b35f1224f6b9b9fa85ccfc4e377518d07795`
gh-pages	1.79 MB	`sha256:f44e66c4b8a8eea6d1bc8a24e20b9d2f6d8666142c2f0c46f128dec9bc6e7992`
npm-package	893 KB	`sha256:8d968444a4271ae7169a6d8076deb2b5ea5a718127b2397aefabf7bc3f96536f`
sentry-linux-x64	30.1 MB	`sha256:140ea717e003202386d21b6ee1d09766ff416ecaeac76233d31f4927aeb979e5`

fix(eval): ground LLM judge with command reference to prevent false negatives #3283