fix(eval): ground LLM judge with command reference to prevent false negatives #339


Triggered via pull request April 10, 2026 10:28
@BYKBYK
synchronize #712
Status Success
Total duration 5s

eval-skill-fork.yml

on: pull_request_target
Reset eval labels: 2s
Run skill eval: 0s