models reference

vision models (for labeling)

used in the label skill to detect objects in game screenshots and return bounding boxes.

model	input cost	output cost	speed	quality	notes
gpt-5-nano	cheapest	cheapest	fastest	good	default — best for high-volume labeling
gpt-5-mini	low	low	fast	better	reasoning model, slight thinking overhead
gpt-5.3-codex-spark	low	low	1000+ tok/s	good	optimized for codex CLI, ideal for rapid labeling loops
gpt-4.1-nano	$0.10/1M	$0.40/1M	very fast	good	non-reasoning, excellent for structured output
gpt-4.1-mini	$0.40/1M	$1.60/1M	fast	very good	best accuracy/cost balance
gpt-5.4	$2.00/1M	$12.00/1M	medium	best	frontier reasoning + native computer use
gpt-4o	$2.50/1M	$10.00/1M	medium	excellent	legacy

how to change

edit config.json:

{
  "model": "gpt-4.1-mini"
}

structured outputs

the label skill uses the responses API with text=RESPONSE_SCHEMA (strict JSON schema enforcement). this means:

the model is guaranteed to return valid JSON matching our bounding box schema
no more regex fallback parsing needed
works with all models listed above

cost estimate

for a 5-minute gameplay video at 1 fps = ~300 frames:

model	est. cost (300 frames)
gpt-5-nano	~$0.10-0.30
gpt-5.3-codex-spark	~$0.15-0.35
gpt-4.1-nano	~$0.15-0.40
gpt-4.1-mini	~$0.50-1.50
gpt-5.4	~$3-8
gpt-4o	~$5-15

costs depend on image resolution and number of detected objects.

batch API (for large jobs)

openai's batch API gives 50% off with 24-hour async processing. good for re-labeling large datasets where speed isn't critical. not currently integrated but could be added to the label skill.

YOLO models (for training)

used in the train skill as the base model for transfer learning.

model	size	mAP@50 (coco)	speed	use case
yolov8n.pt	3.2M params	37.3	fastest	default — quick iteration, small datasets
yolov8s.pt	11.2M params	44.9	fast	better accuracy, still fast
yolov8m.pt	25.9M params	50.2	medium	good balance for larger datasets
yolov8l.pt	43.7M params	52.9	slow	high accuracy, needs GPU
yolov8x.pt	68.2M params	53.9	slowest	maximum accuracy

how to change

{
  "yolo_model": "yolov8s.pt"
}

training config

param	default	what it does
epochs	50	training iterations over the full dataset
train_split	0.8	80% train / 20% validation
imgsz	640	image size for training (set in train skill)

when to use larger models

yolov8n (default): <500 training images, quick experiments
yolov8s: 500-2000 images, production use on fast hardware
yolov8m+: 2000+ images, GPU available, accuracy is priority

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

models reference

vision models (for labeling)

how to change

structured outputs

cost estimate

batch API (for large jobs)

YOLO models (for training)

how to change

training config

when to use larger models

FilesExpand file tree

models.md

Latest commit

History

models.md

File metadata and controls

models reference

vision models (for labeling)

how to change

structured outputs

cost estimate

batch API (for large jobs)

YOLO models (for training)

how to change

training config

when to use larger models