models reference

vision models (for labeling)

used in the label skill to detect objects in game screenshots and return bounding boxes.

| model | input cost | output cost | speed | quality | notes |
|---|---|---|---|---|---|
| gpt-5-nano | cheapest | cheapest | fastest | good | default: best for high-volume labeling |
| gpt-5-mini | low | low | fast | better | reasoning model, slight thinking overhead |
| gpt-5.3-codex-spark | low | low | 1000+ tok/s | good | optimized for codex CLI, ideal for rapid labeling loops |
| gpt-4.1-nano | $0.10/1M | $0.40/1M | very fast | good | non-reasoning, excellent for structured output |
| gpt-4.1-mini | $0.40/1M | $1.60/1M | fast | very good | best accuracy/cost balance |
| gpt-5.4 | $2.00/1M | $12.00/1M | medium | best | frontier reasoning + native computer use |
| gpt-4o | $2.50/1M | $10.00/1M | medium | excellent | legacy |

how to change

edit config.json:

{
  "model": "gpt-4.1-mini"
}
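
a minimal sketch of how a skill might read this setting, assuming config.json sits in the working directory and that a missing file or key falls back to the default from the table above. `load_model_name` and `DEFAULT_MODEL` are hypothetical names, not taken from the project:

```python
import json
from pathlib import Path

# assumption: gpt-5-nano is the fallback because the table marks it as the default
DEFAULT_MODEL = "gpt-5-nano"


def load_model_name(config_path="config.json"):
    """Return the "model" value from config.json, or the default if absent."""
    path = Path(config_path)
    if not path.exists():
        return DEFAULT_MODEL
    with path.open(encoding="utf-8") as f:
        config = json.load(f)
    return config.get("model", DEFAULT_MODEL)
```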

structured outputs

the label skill uses the responses API with text=RESPONSE_SCHEMA (strict JSON schema enforcement). this means:

  • with strict mode, the model's output conforms to our bounding box JSON schema (barring a refusal or a truncated response)
  • regex fallback parsing is no longer needed
  • works with all models listed above
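
the project's actual schema is not shown here, so the following is only a sketch of what a strict bounding box schema could look like. the field names (`objects`, `label`, `x1`, `y1`, `x2`, `y2`), the schema name `bounding_boxes`, and the `parse_boxes` helper are assumptions made for illustration:

```python
import json

# assumption: each detection is a label plus pixel coordinates of the box corners
BBOX_SCHEMA = {
    "type": "object",
    "properties": {
        "objects": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "label": {"type": "string"},
                    "x1": {"type": "number"},
                    "y1": {"type": "number"},
                    "x2": {"type": "number"},
                    "y2": {"type": "number"},
                },
                # strict mode requires every property to be listed as required
                "required": ["label", "x1", "y1", "x2", "y2"],
                "additionalProperties": False,
            },
        }
    },
    "required": ["objects"],
    "additionalProperties": False,
}

# shape of the `text` argument that requests a strict JSON schema
RESPONSE_SCHEMA = {
    "format": {
        "type": "json_schema",
        "name": "bounding_boxes",
        "schema": BBOX_SCHEMA,
        "strict": True,
    }
}


def parse_boxes(output_text):
    """Decode the model's JSON text into a list of (label, x1, y1, x2, y2)."""
    data = json.loads(output_text)
    return [(o["label"], o["x1"], o["y1"], o["x2"], o["y2"]) for o in data["objects"]]
```

because the schema is enforced on the model side, `parse_boxes` can index the fields directly instead of guarding against missing keys.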

cost estimate

for a 5-minute gameplay video sampled at 1 fps (300 frames):

| model | est. cost (300 frames) |
|---|---|
| gpt-5-nano | ~$0.10-0.30 |
| gpt-5.3-codex-spark | ~$0.15-0.35 |
| gpt-4.1-nano | ~$0.15-0.40 |
| gpt-4.1-mini | ~$0.50-1.50 |
| gpt-5.4 | ~$3-8 |
| gpt-4o | ~$5-15 |

costs depend on image resolution (higher resolution means more input tokens) and on the number of detected objects (more boxes means more output tokens).
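
the arithmetic behind an estimate like this can be sketched as follows. the per-frame token counts are assumptions chosen for illustration (real counts vary with resolution and scene content), and `estimate_cost` is a hypothetical helper; the prices are the gpt-4.1-mini rates from the table above:

```python
def estimate_cost(frames, input_tokens, output_tokens, input_price, output_price):
    """Estimated cost in dollars; prices are dollars per 1M tokens."""
    per_frame = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
    return frames * per_frame


frames = 5 * 60 * 1  # 5 minutes at 1 fps

# assumption: roughly 3000 input tokens (image + prompt) and 300 output tokens per frame
cost = estimate_cost(frames, 3000, 300, input_price=0.40, output_price=1.60)
print(f"gpt-4.1-mini: ~${cost:.2f} for {frames} frames")
```

under these assumed token counts the result is about $0.50, the low end of the gpt-4.1-mini range.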

batch API (for large jobs)

openai's batch API gives a 50% discount in exchange for asynchronous processing, with results returned within 24 hours. it suits re-labeling large datasets where turnaround time isn't critical. it is not currently integrated, but could be added to the label skill.
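
since this is not integrated, the following is only a sketch of what preparing a batch job might involve. the batch API takes a JSONL file with one request per line; `build_batch_lines`, the `frame-<id>` naming, and the placeholder request body are hypothetical:

```python
import json


def build_batch_lines(frame_ids, model="gpt-5-nano"):
    """Return one JSONL line per frame in the batch API input format."""
    lines = []
    for frame_id in frame_ids:
        request = {
            "custom_id": f"frame-{frame_id}",  # used to match results back to frames
            "method": "POST",
            "url": "/v1/responses",
            # assumption: a real body would also carry the image and the schema
            "body": {"model": model, "input": f"label frame {frame_id}"},
        }
        lines.append(json.dumps(request))
    return lines
```

the lines would then be written to a `.jsonl` file, uploaded, and submitted as a batch with a 24-hour completion window.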

YOLO models (for training)

used in the train skill as the base model for transfer learning.

| model | size | mAP@50-95 (COCO val) | speed | use case |
|---|---|---|---|---|
| yolov8n.pt | 3.2M params | 37.3 | fastest | default: quick iteration, small datasets |
| yolov8s.pt | 11.2M params | 44.9 | fast | better accuracy, still fast |
| yolov8m.pt | 25.9M params | 50.2 | medium | good balance for larger datasets |
| yolov8l.pt | 43.7M params | 52.9 | slow | high accuracy, needs GPU |
| yolov8x.pt | 68.2M params | 53.9 | slowest | maximum accuracy |

how to change

{
  "yolo_model": "yolov8s.pt"
}

training config

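| param | default | what it does |
|---|---|---|
| epochs | 50 | number of full passes over the training dataset |
| train_split | 0.8 | fraction of images used for training (80% train / 20% validation) |
| imgsz | 640 | image size in pixels for training (set in train skill) |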
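
a rough sketch of how these parameters could fit together, assuming the train skill uses the ultralytics package. `split_dataset` and `train` are hypothetical helpers, and the fixed shuffle seed is an assumption; the actual skill may split and train differently:

```python
import random


def split_dataset(image_paths, train_split=0.8, seed=0):
    """Shuffle and split image paths into (train, validation) lists."""
    paths = list(image_paths)
    random.Random(seed).shuffle(paths)
    cut = int(len(paths) * train_split)
    return paths[:cut], paths[cut:]


def train(data_yaml, yolo_model="yolov8n.pt", epochs=50, imgsz=640):
    """Fine-tune a pretrained YOLOv8 checkpoint (requires the ultralytics package)."""
    from ultralytics import YOLO  # imported lazily so the split helper has no dependency

    model = YOLO(yolo_model)  # loads pretrained weights as the starting point
    return model.train(data=data_yaml, epochs=epochs, imgsz=imgsz)
```

with the default `train_split` of 0.8, a set of 10 images yields 8 for training and 2 for validation.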

when to use larger models

  • yolov8n (default): <500 training images, quick experiments
  • yolov8s: 500-2000 images, production use on fast hardware
  • yolov8m+: 2000+ images, GPU available, accuracy is the priority
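
the guidance above can be written as a small helper. `pick_yolo_model` is a hypothetical function, and two details are assumptions: a dataset of exactly 2000 images is treated as the larger tier, and without a GPU the suggestion stays at yolov8s:

```python
def pick_yolo_model(num_images, has_gpu):
    """Suggest a base checkpoint from dataset size and hardware."""
    if num_images < 500:
        return "yolov8n.pt"
    # assumption: exactly 2000 images counts as the larger tier, and
    # without a GPU we stay on yolov8s rather than moving to yolov8m
    if num_images < 2000 or not has_gpu:
        return "yolov8s.pt"
    return "yolov8m.pt"
```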