1000 Go tasks generated from 20 open-source GitHub repos using SWE-gen.
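Each task:
- is a merged GitHub PR with 2-10 source files edited
- has Fail-to-Pass unit tests (tests that fail before the PR's change and pass after it)
- passes both validation runs: NOP (with no changes applied, the tests fail) and Oracle (with the reference fix applied, the tests pass)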
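
The criteria above can be sketched as a small Go check. This is only an illustration, not SWE-gen's actual validation code: the `TestOutcome` type and the `IsFailToPass` and `ValidTask` functions are hypothetical names, and the sketch assumes that every listed test is one of the task's designated Fail-to-Pass tests.

```go
package main

import "fmt"

// TestOutcome is a hypothetical record of one unit test's result
// on the baseline code and on the code with the reference fix.
type TestOutcome struct {
	Name         string
	PassesBefore bool // result with no changes applied (NOP run)
	PassesAfter  bool // result with the reference fix applied (Oracle run)
}

// IsFailToPass reports whether a test fails on the baseline
// and passes once the fix is applied.
func IsFailToPass(t TestOutcome) bool {
	return !t.PassesBefore && t.PassesAfter
}

// ValidTask applies the listed criteria: 2-10 source files edited,
// at least one test, and every listed test is Fail-to-Pass.
// Assumption: tests holds only the designated Fail-to-Pass tests.
func ValidTask(filesEdited int, tests []TestOutcome) bool {
	if filesEdited < 2 || filesEdited > 10 {
		return false
	}
	if len(tests) == 0 {
		return false
	}
	for _, t := range tests {
		if !IsFailToPass(t) {
			return false
		}
	}
	return true
}

func main() {
	// Made-up example data, for illustration only.
	tests := []TestOutcome{
		{Name: "TestParse", PassesBefore: false, PassesAfter: true},
	}
	fmt.Println(ValidTask(3, tests))
}
```

A test that already passes on the baseline would make the NOP run succeed, so under these assumptions the task would be rejected.
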
Install Harbor:
uv tool install harbor

Run with Codex:
export OPENAI_API_KEY=<YOUR-KEY>
harbor run --dataset swe-gen-go \
--agent codex \
--model openai/gpt-5.2-codex \
--n-concurrent 4

This command automatically downloads the tasks.
