codex-native yolo dataset + training agent skills for any unlabeled data. codex skills help determine how to properly label and splice your data.
```mermaid
flowchart LR
    A["data source (any unlabeled data)"] --> B["collect"]
    B --> C["label"]
    C --> D["augment"]
    D --> E["train"]
    E --> F["eval"]
    C -. "parallel subagents in git worktrees" .-> C
    F --> G{"meets target?"}
    G -- "no" --> C
    G -- "yes" --> H["complete"]
```
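the loop above can be sketched in plain python. this is an illustration only: the stub functions are hypothetical stand-ins for the skill scripts, and the real orchestration lives in `yolodex.sh` and `pipeline/main.py`.

```python
# minimal sketch of the yolodex loop; these functions are made-up
# stand-ins for the real skill scripts under .agents/skills/<name>/
def collect(cfg):
    return [f"frame_{i:04d}.jpg" for i in range(10)]  # pretend frames

def label(frames, cfg, round_):
    return {f: [] for f in frames}                    # pretend yolo boxes

def augment(frames, labels, cfg):
    return (frames, labels)

def train(dataset, cfg):
    return "best.pt"

def evaluate(weights, cfg, round_):
    return 0.5 + 0.1 * round_                         # fake improving mAP@50

def run_pipeline(cfg, max_rounds=5):
    frames = collect(cfg)
    weights, score = None, 0.0
    for r in range(max_rounds):
        labels = label(frames, cfg, r)       # may fan out to parallel subagents
        dataset = augment(frames, labels, cfg)
        weights = train(dataset, cfg)
        score = evaluate(weights, cfg, r)
        if score >= cfg["target_accuracy"]:  # meets target? -> complete
            break
    return weights, score

weights, score = run_pipeline({"target_accuracy": 0.75})
print(weights, round(score, 2))  # best.pt 0.8
```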
- skills runtime: codex skills
- pipeline skills: collect, label, augment, train, eval, yolodex
- orchestration loop: `yolodex.sh` + `AGENTS.md`
- shared helpers: `shared/utils.py`
```shell
git clone https://github.com/qtzx06/yolodex && cd yolodex
bash setup.sh
```

requirements:
- macos, linux, or windows
- python 3.11+
- codex cli (latest — supports forked skill contexts and per-skill model overrides)
- start codex in repo root
- use the yolodex skill to gather config
- run labeling with subagents when needed
- iterate until eval target is met
example:

```shell
$ codex
> use the yolodex skill to train a detector from this video: https://youtube.com/...
> classes: player, weapon, vehicle
```
codex determines how many subagents to spawn for labeling based on the dataset size:

```shell
bash .agents/skills/label/scripts/dispatch.sh <n>
```
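one plausible sizing heuristic, shown only as an assumption: the actual policy is decided by codex at runtime, and `frames_per_agent` / `max_agents` here are made-up parameters.

```python
def subagent_count(num_frames, frames_per_agent=200, max_agents=8):
    """hypothetical heuristic: one labeling subagent per ~200 frames, capped."""
    return max(1, min(max_agents, -(-num_frames // frames_per_agent)))  # ceil div

print(subagent_count(150))   # 1 (small dataset, single worker)
print(subagent_count(900))   # 5
print(subagent_count(5000))  # 8 (hits the cap)
```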
autonomous loop:

```shell
bash yolodex.sh
```

manual skills:
```shell
uv run .agents/skills/collect/scripts/run.py
bash .agents/skills/label/scripts/dispatch.sh 4
uv run .agents/skills/augment/scripts/run.py
uv run .agents/skills/train/scripts/run.py
uv run .agents/skills/eval/scripts/run.py
```

minimal config.json:
```json
{
  "project": "my-project",
  "video_url": "https://youtube.com/watch?v=YOUR_VIDEO",
  "classes": ["player", "weapon", "vehicle"],
  "label_mode": "codex",
  "target_accuracy": 0.75,
  "num_agents": 4,
  "fps": 1,
  "yolo_model": "yolov8n.pt",
  "epochs": 50,
  "train_split": 0.8,
  "seed": 42
}
```

key fields:
| field | default | description |
|---|---|---|
| `project` | `""` | output namespace under `runs/<project>/` |
| `video_url` | `""` | youtube url or local video path |
| `classes` | `[]` | target classes |
| `label_mode` | `"codex"` | label strategy |
| `target_accuracy` | `0.75` | mAP@50 stop threshold |
| `num_agents` | `4` | parallel label workers |
| `fps` | `1` | extraction fps |
| `yolo_model` | `"yolov8n.pt"` | base yolo checkpoint |
| `epochs` | `50` | train epochs |
| `train_split` | `0.8` | train/val split |
| `seed` | `42` | deterministic split seed |
- eval: `runs/<project>/eval_results.json`
- frames: `runs/<project>/frames/`
- label previews: `runs/<project>/frames/preview/`
- trained weights: `runs/<project>/weights/best.pt`
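the eval output is what drives the "meets target?" decision. a small sketch of checking it, assuming `eval_results.json` holds a top-level `map50` float (the real schema may differ):

```python
import json
import tempfile
from pathlib import Path

def meets_target(project, target_accuracy=0.75, runs_dir="runs"):
    """compare eval output against the stop threshold.

    assumes eval_results.json holds a top-level 'map50' float; the
    actual schema written by the eval skill may differ.
    """
    path = Path(runs_dir) / project / "eval_results.json"
    results = json.loads(path.read_text())
    return results["map50"] >= target_accuracy

# demo against a fabricated results file in a temp dir:
with tempfile.TemporaryDirectory() as d:
    proj = Path(d) / "demo"
    proj.mkdir()
    (proj / "eval_results.json").write_text(json.dumps({"map50": 0.81}))
    ok = meets_target("demo", runs_dir=d)
print(ok)  # True
```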
```
yolodex/
├── .agents/skills/
│   ├── yolodex/
│   ├── collect/
│   ├── label/
│   ├── augment/
│   ├── train/
│   └── eval/
├── shared/utils.py
├── pipeline/main.py
├── docs/
├── yolodex.sh
├── setup.sh
├── AGENTS.md
├── config.json
├── progress.txt
└── pyproject.toml
```
contributions are welcome.
standard flow:
- fork the repo and create a branch from main
- keep changes scoped and explain the why in your pr
- run relevant checks before opening the pr
- open a pull request with:
- clear summary
- test/validation notes
- screenshots/log snippets when relevant
if you’re planning a bigger change, open an issue first so we can align on direction.
mit
made with love at openai ♡