fix(eval): ground LLM judge with command reference to prevent false negatives #342
eval-skill-fork.yml
on: pull_request_target
Reset eval labels (4s)
Run skill eval (0s)