Autonomous vision research based on Karpathy's original autoresearch repo for LLM pretraining, adapted here to CIFAR-100 image classification on a single NVIDIA GTX 1080 Ti.
The repo is intentionally tiny. An agent edits one file, trains for a fixed 5-minute budget, checks whether validation top-1 improved on CIFAR-100, and keeps or discards the experiment.
This snapshot is rendered from the local results.tsv, excludes the Axis-Orbit exploration runs, and labels only the runs that improved the best-so-far result.
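The "improved the best-so-far result" labeling can be sketched as a simple scan over results.tsv. This is an illustrative helper, not the repo's actual code; the column name `val_top1` is an assumption about the TSV schema.

```python
# Hypothetical sketch of labeling best-so-far runs from results.tsv.
# The metric column name ("val_top1") is an assumption, not the repo's schema.
import csv

def best_so_far_runs(path="results.tsv", metric="val_top1"):
    """Return only the rows whose validation top-1 beat every earlier run."""
    best = -1.0
    improved = []
    with open(path, newline="") as f:
        for row in csv.DictReader(f, delimiter="\t"):
            acc = float(row[metric])
            if acc > best:
                best = acc
                improved.append(row)
    return improved
```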
Only four files matter for day-to-day autonomous runs:
- `prepare.py` - fixed data prep, cached tensors, dataloaders, and evaluation. Do not modify during autonomous runs.
- `train.py` - the single file the agent edits. Model architecture, optimizer, augmentations, and training loop all live here.
- `program.md` - the human-written instructions that define the autonomous research loop.
- `run.md` - the human-controlled stop flag. `True` allows the next run to start, `False` stops after the current run is logged.
By design, training runs for a fixed 300 second wall-clock budget, regardless of what the agent changes. The metric is validation top-1 accuracy on the full CIFAR-100 validation split, so higher is better.
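Conceptually, the fixed wall-clock budget looks like the loop below. This is a minimal sketch, not the code in train.py; the name `train_step` is a stand-in for whatever one optimization step looks like in a given experiment.

```python
# Minimal sketch of a fixed wall-clock training budget.
# TIME_BUDGET_S and train_step are illustrative names, not train.py's API.
import time

TIME_BUDGET_S = 300  # 5 minutes, regardless of what the agent changes

def train_with_budget(train_step, budget_s=TIME_BUDGET_S):
    """Run train_step() repeatedly until the wall-clock budget expires."""
    start = time.monotonic()
    steps = 0
    while time.monotonic() - start < budget_s:
        train_step()
        steps += 1
    return steps
```

Because the budget is wall-clock time rather than a step count, any change that slows down a single step (a bigger model, heavier augmentation) directly trades away optimization steps.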
Requirements:
- NVIDIA GPU with CUDA support
- Python 3.11+
uv
On this Windows machine the default Python is 3.8, so prefer explicit 3.11 commands:
```shell
# install uv if needed
py -3.11 -m pip install uv

# install dependencies with Python 3.11
uv sync --python 3.11

# download and cache CIFAR-100
uv run --python 3.11 prepare.py

# run one baseline experiment
uv run --python 3.11 train.py
```

If those commands work, the repo is ready for autonomous experimentation.
- `prepare.py` - fixed data prep, loaders, evaluation
- `train.py` - editable model and training loop
- `program.md` - autonomous research instructions
- `run.md` - stop-after-current-run control file
- `render_results_graph.py` - snapshot chart helper
- `pyproject.toml` - dependencies
- `assets/` - checked-in README chart
- Single editable file. The agent only touches `train.py`.
- Fixed time budget. Runs optimize for what can be learned in 5 minutes on this exact GPU.
- Fixed evaluation harness. `prepare.py` owns the validation metric and cached dataset.
- Minimal dependencies. Only PyTorch and torchvision are required.
This fork keeps the core autoresearch structure from the LLM version: one mutable training file, a fixed preparation and evaluation harness, and a fixed time budget. The main swap is the task itself, from language-model pretraining to CIFAR-100 image classification.
run.md must contain exactly one control value on its first non-empty line:
- `True` means the agent may start another run.
- `False` means the agent should finish the current run, log it, and then stop before starting the next one.
If run.md is missing or contains anything else, the agent should fail closed and stop instead of guessing.
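The fail-closed rule above can be sketched as follows. This is an illustrative reading of run.md, not necessarily the agent's exact implementation.

```python
# Fail-closed reading of run.md: only an exact "True" on the first
# non-empty line permits another run. A sketch, not the repo's exact code.
from pathlib import Path

def may_start_next_run(path="run.md"):
    """Return True only if the first non-empty line is exactly 'True'.

    A missing file, an empty file, or any other content stops the loop.
    """
    try:
        text = Path(path).read_text()
    except OSError:
        return False
    for line in text.splitlines():
        line = line.strip()
        if line:
            return line == "True"
    return False
```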
To regenerate the chart from the current local results.tsv:
```shell
.venv\Scripts\python.exe render_results_graph.py
```

This fork is scoped to a single CUDA GPU and tuned for the local GTX 1080 Ti. If you move to a different GPU, the first knobs to revisit in `train.py` are:
- `DEVICE_BATCH_SIZE` and `TOTAL_BATCH_SIZE`
- `EMBED_DIM`, `DEPTH`, and `NUM_HEADS`
- learning rate and weight decay
The first version is ViT-first. `train.py` includes a `MODEL_FAMILY` switch so CNN baselines can be added later without changing the repo layout.
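One way such a switch can be structured is a small registry of model builders; the sketch below uses placeholder values and hypothetical names, since the actual constructors live in train.py.

```python
# Illustrative MODEL_FAMILY switch; the builder contents are placeholders,
# not the actual ViT constructor from train.py.
MODEL_FAMILY = "vit"  # "cnn" baselines can be registered later

def build_model(family=MODEL_FAMILY):
    """Look up and invoke the builder for the requested model family."""
    builders = {
        "vit": lambda: "ViT model",  # placeholder for the real ViT constructor
        # "cnn": lambda: ...,        # future CNN baseline slots in here
    }
    if family not in builders:
        raise ValueError(f"unknown MODEL_FAMILY: {family!r}")
    return builders[family]()
```

Keeping the families behind one lookup means adding a CNN baseline is a one-line registration rather than a restructuring of the training loop.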
