From 7d893e5a44f2200ba17f4ec0c96d60f87a56235a Mon Sep 17 00:00:00 2001 From: get-colibri Date: Mon, 9 Mar 2026 10:46:02 +0000 Subject: [PATCH 01/15] =?UTF-8?q?=F0=9F=92=BE=20Change=20'README.md'?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- README.md | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/README.md b/README.md index 8459259a..0878b807 100644 --- a/README.md +++ b/README.md @@ -1,6 +1,6 @@ # autoresearch -![teaser](progress.png) +⁠![teaser](progress.png) *One day, frontier AI research used to be done by meat computers in between eating, sleeping, having other fun, and synchronizing once in a while using sound wave interconnect in the ritual of "group meeting". That era is long gone. Research is now entirely the domain of autonomous swarms of AI agents running across compute cluster megastructures in the skies. The agents claim that we are now in the 10,205th generation of the code base, in any case no one could tell if that's right or wrong as the "code" is now a self-modifying binary that has grown beyond human comprehension. This repo is the story of how it all began. -@karpathy, March 2026*. @@ -8,11 +8,11 @@ The idea: give an AI agent a small but real LLM training setup and let it experi ## How it works -The repo is deliberately kept small and only really has a three files that matter: +The repo is deliberately kept small and only really has three files that matter: -- **`prepare.py`** — fixed constants, one-time data prep (downloads training data, trains a BPE tokenizer), and runtime utilities (dataloader, evaluation). Not modified. -- **`train.py`** — the single file the agent edits. Contains the full GPT model, optimizer (Muon + AdamW), and training loop. Everything is fair game: architecture, hyperparameters, optimizer, batch size, etc. **This file is edited and iterated on by the agent**. -- **`program.md`** — baseline instructions for one agent. 
Point your agent here and let it go. **This file is edited and iterated on by the human**. +- `prepare.py` — fixed constants, one-time data prep (downloads training data, trains a BPE tokenizer), and runtime utilities (dataloader, evaluation). Not modified. +- `train.py` — the single file the agent edits. Contains the full GPT model, optimizer (Muon + AdamW), and training loop. Everything is fair game: architecture, hyperparameters, optimizer, batch size, etc. **This file is edited and iterated on by the agent**. +- `program.md` — baseline instructions for one agent. Point your agent here and let it go. **This file is edited and iterated on by the human**. By design, training runs for a **fixed 5-minute time budget** (wall clock, excluding startup/compilation), regardless of the details of your compute. The metric is **val_bpb** (validation bits per byte) — lower is better, and vocab-size-independent so architectural changes are fairly compared. @@ -20,6 +20,7 @@ By design, training runs for a **fixed 5-minute time budget** (wall clock, exclu **Requirements:** A single NVIDIA GPU (tested on H100), Python 3.10+, [uv](https://docs.astral.sh/uv/). + ```bash # 1. Install uv project manager (if you don't already have it) @@ -41,6 +42,7 @@ If the above commands all work ok, your setup is working and you can go into aut Simply spin up your Claude/Codex or whatever you want in this repo (and disable all permissions), then you can prompt something like: + ``` Hi have a look at program.md and let's kick off a new experiment! let's do the setup first. ``` @@ -49,6 +51,7 @@ The `program.md` file is essentially a super lightweight "skill". 
## Project structure + ``` prepare.py — constants, data prep + runtime utilities (do not modify) train.py — model, optimizer, training loop (agent modifies this) From bb3a28bfb1d19ea46156d4a59b00cb7d69d0f7e5 Mon Sep 17 00:00:00 2001 From: get-colibri Date: Mon, 9 Mar 2026 10:48:23 +0000 Subject: [PATCH 02/15] =?UTF-8?q?=F0=9F=92=BE=20Change=20'README.md'?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index 0878b807..b94de069 100644 --- a/README.md +++ b/README.md @@ -2,13 +2,13 @@ ⁠![teaser](progress.png) -*One day, frontier AI research used to be done by meat computers in between eating, sleeping, having other fun, and synchronizing once in a while using sound wave interconnect in the ritual of "group meeting". That era is long gone. Research is now entirely the domain of autonomous swarms of AI agents running across compute cluster megastructures in the skies. The agents claim that we are now in the 10,205th generation of the code base, in any case no one could tell if that's right or wrong as the "code" is now a self-modifying binary that has grown beyond human comprehension. This repo is the story of how it all began. -@karpathy, March 2026*. +
​*One day, frontier AI research used to be done by meat computers in between eating, sleeping, having other fun, and synchronizing once in a while using sound wave interconnect in the ritual of "group meeting". That era is long gone. Research is now entirely the domain of autonomous swarms of AI agents running across compute cluster megastructures in the skies. The agents claim that we are now in the 10,205th generation of the code base, in any case no one could tell if that's right or wrong as the "code" is now a self-modifying binary that has grown beyond human comprehension. This repo is the story of how it all began. -@karpathy, March 2026*. The idea: give an AI agent a small but real LLM training setup and let it experiment autonomously overnight. It modifies the code, trains for 5 minutes, checks if the result improved, keeps or discards, and repeats. You wake up in the morning to a log of experiments and (hopefully) a better model. The training code here is a simplified single-GPU implementation of [nanochat](https://github.com/karpathy/nanochat). The core idea is that you're not touching any of the Python files like you normally would as a researcher. Instead, you are programming the `program.md` Markdown files that provide context to the AI agents and set up your autonomous research org. The default `program.md` in this repo is intentionally kept as a bare bones baseline, though it's obvious how one would iterate on it over time to find the "research org code" that achieves the fastest research progress, how you'd add more agents to the mix, etc. A bit more context on this project is here in this [tweet](https://x.com/karpathy/status/2029701092347630069). 
## How it works -The repo is deliberately kept small and only really has three files that matter: +The repo is deliberately kept small and only really has three files that matter: - `prepare.py` — fixed constants, one-time data prep (downloads training data, trains a BPE tokenizer), and runtime utilities (dataloader, evaluation). Not modified. - `train.py` — the single file the agent edits. Contains the full GPT model, optimizer (Muon + AdamW), and training loop. Everything is fair game: architecture, hyperparameters, optimizer, batch size, etc. **This file is edited and iterated on by the agent**. From f4998d72415378b4c80f169cdb8d71cdf6fba708 Mon Sep 17 00:00:00 2001 From: get-colibri Date: Mon, 9 Mar 2026 10:48:28 +0000 Subject: [PATCH 03/15] =?UTF-8?q?=F0=9F=92=BE=20Change=20'README.md'?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- README.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index b94de069..7644f65f 100644 --- a/README.md +++ b/README.md @@ -2,7 +2,8 @@ ⁠![teaser](progress.png) -
​*One day, frontier AI research used to be done by meat computers in between eating, sleeping, having other fun, and synchronizing once in a while using sound wave interconnect in the ritual of "group meeting". That era is long gone. Research is now entirely the domain of autonomous swarms of AI agents running across compute cluster megastructures in the skies. The agents claim that we are now in the 10,205th generation of the code base, in any case no one could tell if that's right or wrong as the "code" is now a self-modifying binary that has grown beyond human comprehension. This repo is the story of how it all began. -@karpathy, March 2026*. +*Overview* +*One day, frontier AI research used to be done by meat computers in between eating, sleeping, having other fun, and synchronizing once in a while using sound wave interconnect in the ritual of "group meeting". That era is long gone. Research is now entirely the domain of autonomous swarms of AI agents running across compute cluster megastructures in the skies. The agents claim that we are now in the 10,205th generation of the code base, in any case no one could tell if that's right or wrong as the "code" is now a self-modifying binary that has grown beyond human comprehension. This repo is the story of how it all began. -@karpathy, March 2026*. The idea: give an AI agent a small but real LLM training setup and let it experiment autonomously overnight. It modifies the code, trains for 5 minutes, checks if the result improved, keeps or discards, and repeats. You wake up in the morning to a log of experiments and (hopefully) a better model. The training code here is a simplified single-GPU implementation of [nanochat](https://github.com/karpathy/nanochat). The core idea is that you're not touching any of the Python files like you normally would as a researcher. Instead, you are programming the `program.md` Markdown files that provide context to the AI agents and set up your autonomous research org. 
The default `program.md` in this repo is intentionally kept as a bare bones baseline, though it's obvious how one would iterate on it over time to find the "research org code" that achieves the fastest research progress, how you'd add more agents to the mix, etc. A bit more context on this project is here in this [tweet](https://x.com/karpathy/status/2029701092347630069). From b086a70631482b1de15c8c9effa091601d6318e5 Mon Sep 17 00:00:00 2001 From: get-colibri Date: Mon, 9 Mar 2026 10:48:34 +0000 Subject: [PATCH 04/15] =?UTF-8?q?=F0=9F=92=BE=20Change=20'README.md'?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 7644f65f..2df6adb5 100644 --- a/README.md +++ b/README.md @@ -2,7 +2,7 @@ ⁠![teaser](progress.png) -*Overview* +## *Overview* *One day, frontier AI research used to be done by meat computers in between eating, sleeping, having other fun, and synchronizing once in a while using sound wave interconnect in the ritual of "group meeting". That era is long gone. Research is now entirely the domain of autonomous swarms of AI agents running across compute cluster megastructures in the skies. The agents claim that we are now in the 10,205th generation of the code base, in any case no one could tell if that's right or wrong as the "code" is now a self-modifying binary that has grown beyond human comprehension. This repo is the story of how it all began. -@karpathy, March 2026*. The idea: give an AI agent a small but real LLM training setup and let it experiment autonomously overnight. It modifies the code, trains for 5 minutes, checks if the result improved, keeps or discards, and repeats. You wake up in the morning to a log of experiments and (hopefully) a better model. The training code here is a simplified single-GPU implementation of [nanochat](https://github.com/karpathy/nanochat). 
The core idea is that you're not touching any of the Python files like you normally would as a researcher. Instead, you are programming the `program.md` Markdown files that provide context to the AI agents and set up your autonomous research org. The default `program.md` in this repo is intentionally kept as a bare bones baseline, though it's obvious how one would iterate on it over time to find the "research org code" that achieves the fastest research progress, how you'd add more agents to the mix, etc. A bit more context on this project is here in this [tweet](https://x.com/karpathy/status/2029701092347630069). From bc8ba6938a1b76659ec9095acb4bba8baa32c694 Mon Sep 17 00:00:00 2001 From: get-colibri Date: Mon, 9 Mar 2026 10:48:39 +0000 Subject: [PATCH 05/15] =?UTF-8?q?=F0=9F=92=BE=20Change=20'README.md'?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- README.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 2df6adb5..92cfb24d 100644 --- a/README.md +++ b/README.md @@ -2,7 +2,10 @@ ⁠![teaser](progress.png) -## *Overview* +## *Overview*
​ + +​ + *One day, frontier AI research used to be done by meat computers in between eating, sleeping, having other fun, and synchronizing once in a while using sound wave interconnect in the ritual of "group meeting". That era is long gone. Research is now entirely the domain of autonomous swarms of AI agents running across compute cluster megastructures in the skies. The agents claim that we are now in the 10,205th generation of the code base, in any case no one could tell if that's right or wrong as the "code" is now a self-modifying binary that has grown beyond human comprehension. This repo is the story of how it all began. -@karpathy, March 2026*. The idea: give an AI agent a small but real LLM training setup and let it experiment autonomously overnight. It modifies the code, trains for 5 minutes, checks if the result improved, keeps or discards, and repeats. You wake up in the morning to a log of experiments and (hopefully) a better model. The training code here is a simplified single-GPU implementation of [nanochat](https://github.com/karpathy/nanochat). The core idea is that you're not touching any of the Python files like you normally would as a researcher. Instead, you are programming the `program.md` Markdown files that provide context to the AI agents and set up your autonomous research org. The default `program.md` in this repo is intentionally kept as a bare bones baseline, though it's obvious how one would iterate on it over time to find the "research org code" that achieves the fastest research progress, how you'd add more agents to the mix, etc. A bit more context on this project is here in this [tweet](https://x.com/karpathy/status/2029701092347630069). 
From 6df9f26c79d0450329cc8b95641c21ae41be6c88 Mon Sep 17 00:00:00 2001 From: get-colibri Date: Mon, 9 Mar 2026 10:48:44 +0000 Subject: [PATCH 06/15] =?UTF-8?q?=F0=9F=92=BE=20Change=20'README.md'?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 92cfb24d..e6b33fc0 100644 --- a/README.md +++ b/README.md @@ -4,7 +4,7 @@ ## *Overview*
​ -​ + autoresearch is an experiment in autonomous AI research. An AI agent iteratively modifies a simple GPT training script, runs 5-minute experiments, and keeps improvements that lower validation loss. You wake up to a log of experiments and (hopefully) a better model. *One day, frontier AI research used to be done by meat computers in between eating, sleeping, having other fun, and synchronizing once in a while using sound wave interconnect in the ritual of "group meeting". That era is long gone. Research is now entirely the domain of autonomous swarms of AI agents running across compute cluster megastructures in the skies. The agents claim that we are now in the 10,205th generation of the code base, in any case no one could tell if that's right or wrong as the "code" is now a self-modifying binary that has grown beyond human comprehension. This repo is the story of how it all began. -@karpathy, March 2026*. From 877722511b1733b1d1db02dda1d3a48c4a66e3b3 Mon Sep 17 00:00:00 2001 From: get-colibri Date: Mon, 9 Mar 2026 10:48:49 +0000 Subject: [PATCH 07/15] =?UTF-8?q?=F0=9F=92=BE=20Change=20'README.md'?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index e6b33fc0..8abe18cf 100644 --- a/README.md +++ b/README.md @@ -4,7 +4,7 @@ ## *Overview*
​ - autoresearch is an experiment in autonomous AI research. An AI agent iteratively modifies a simple GPT training script, runs 5-minute experiments, and keeps improvements that lower validation loss. You wake up to a log of experiments and (hopefully) a better model. +Autoresearch is an experiment in autonomous AI research. An AI agent iteratively modifies a simple GPT training script, runs 5-minute experiments, and keeps improvements that lower validation loss. You wake up to a log of experiments and (hopefully) a better model.
​ *One day, frontier AI research used to be done by meat computers in between eating, sleeping, having other fun, and synchronizing once in a while using sound wave interconnect in the ritual of "group meeting". That era is long gone. Research is now entirely the domain of autonomous swarms of AI agents running across compute cluster megastructures in the skies. The agents claim that we are now in the 10,205th generation of the code base, in any case no one could tell if that's right or wrong as the "code" is now a self-modifying binary that has grown beyond human comprehension. This repo is the story of how it all began. -@karpathy, March 2026*. From dd3b504e45f9beeddeb40f9f379b06ece1115fbc Mon Sep 17 00:00:00 2001 From: get-colibri Date: Mon, 9 Mar 2026 10:49:11 +0000 Subject: [PATCH 08/15] =?UTF-8?q?=F0=9F=92=BE=20Change=20'README.md'?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- README.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/README.md b/README.md index 8abe18cf..5cc4c01e 100644 --- a/README.md +++ b/README.md @@ -1,5 +1,9 @@ # autoresearch +​ + +​ + ⁠![teaser](progress.png) ## *Overview*
​ From f9dd497019357fc79161ee374d1d8c78e9f7a8b0 Mon Sep 17 00:00:00 2001 From: get-colibri Date: Mon, 9 Mar 2026 10:49:16 +0000 Subject: [PATCH 09/15] =?UTF-8?q?=F0=9F=92=BE=20Change=20'README.md'?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 5cc4c01e..9acc424f 100644 --- a/README.md +++ b/README.md @@ -1,6 +1,6 @@ # autoresearch -​ +## *Overview* ​ From 46522581ac2cdb8ab45eadca1af9e42b22d6c755 Mon Sep 17 00:00:00 2001 From: get-colibri Date: Mon, 9 Mar 2026 10:49:23 +0000 Subject: [PATCH 10/15] =?UTF-8?q?=F0=9F=92=BE=20Change=20'README.md'?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 9acc424f..f949d5c4 100644 --- a/README.md +++ b/README.md @@ -2,7 +2,7 @@ ## *Overview* -​ +Autoresearch is an experiment in autonomous AI research. An AI agent iteratively modifies a simple GPT training script, runs 5-minute experiments, and keeps improvements that lower validation loss. You wake up to a log of experiments and (hopefully) a better model.
​ ⁠![teaser](progress.png) From 6620afae55a8704ab31cf8492300f7e966327d4c Mon Sep 17 00:00:00 2001 From: get-colibri Date: Mon, 9 Mar 2026 10:49:28 +0000 Subject: [PATCH 11/15] =?UTF-8?q?=F0=9F=92=BE=20Change=20'README.md'?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- README.md | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/README.md b/README.md index f949d5c4..35cc5c6c 100644 --- a/README.md +++ b/README.md @@ -6,9 +6,8 @@ Autoresearch is an experiment in autonomous AI research. An AI agent iteratively ⁠![teaser](progress.png) -## *Overview*
​ - -Autoresearch is an experiment in autonomous AI research. An AI agent iteratively modifies a simple GPT training script, runs 5-minute experiments, and keeps improvements that lower validation loss. You wake up to a log of experiments and (hopefully) a better model.
​ +## *Overview* +
*One day, frontier AI research used to be done by meat computers in between eating, sleeping, having other fun, and synchronizing once in a while using sound wave interconnect in the ritual of "group meeting". That era is long gone. Research is now entirely the domain of autonomous swarms of AI agents running across compute cluster megastructures in the skies. The agents claim that we are now in the 10,205th generation of the code base, in any case no one could tell if that's right or wrong as the "code" is now a self-modifying binary that has grown beyond human comprehension. This repo is the story of how it all began. -@karpathy, March 2026*. From d771c6de9bba941992c964c7b064ea78e1a3db5f Mon Sep 17 00:00:00 2001 From: get-colibri Date: Mon, 9 Mar 2026 10:49:38 +0000 Subject: [PATCH 12/15] =?UTF-8?q?=F0=9F=92=BE=20Change=20'README.md'?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- README.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/README.md b/README.md index 35cc5c6c..6e6b2d91 100644 --- a/README.md +++ b/README.md @@ -6,8 +6,7 @@ Autoresearch is an experiment in autonomous AI research. An AI agent iteratively ⁠![teaser](progress.png) -## *Overview* -
+##

*One day, frontier AI research used to be done by meat computers in between eating, sleeping, having other fun, and synchronizing once in a while using sound wave interconnect in the ritual of "group meeting". That era is long gone. Research is now entirely the domain of autonomous swarms of AI agents running across compute cluster megastructures in the skies. The agents claim that we are now in the 10,205th generation of the code base, in any case no one could tell if that's right or wrong as the "code" is now a self-modifying binary that has grown beyond human comprehension. This repo is the story of how it all began. -@karpathy, March 2026*. From be8e80037cca94096dda4d42166f6600fb573303 Mon Sep 17 00:00:00 2001 From: get-colibri Date: Mon, 9 Mar 2026 10:49:43 +0000 Subject: [PATCH 13/15] =?UTF-8?q?=F0=9F=92=BE=20Change=20'README.md'?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- README.md | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/README.md b/README.md index 6e6b2d91..85e528f2 100644 --- a/README.md +++ b/README.md @@ -4,9 +4,7 @@ Autoresearch is an experiment in autonomous AI research. An AI agent iteratively modifies a simple GPT training script, runs 5-minute experiments, and keeps improvements that lower validation loss. You wake up to a log of experiments and (hopefully) a better model.
​ -⁠![teaser](progress.png) - -##

+## ⁠![teaser](progress.png)
​ *One day, frontier AI research used to be done by meat computers in between eating, sleeping, having other fun, and synchronizing once in a while using sound wave interconnect in the ritual of "group meeting". That era is long gone. Research is now entirely the domain of autonomous swarms of AI agents running across compute cluster megastructures in the skies. The agents claim that we are now in the 10,205th generation of the code base, in any case no one could tell if that's right or wrong as the "code" is now a self-modifying binary that has grown beyond human comprehension. This repo is the story of how it all began. -@karpathy, March 2026*. From ae399b9eec9c1fe876c0470ac09d54efc861e649 Mon Sep 17 00:00:00 2001 From: get-colibri Date: Mon, 9 Mar 2026 10:49:48 +0000 Subject: [PATCH 14/15] =?UTF-8?q?=F0=9F=92=BE=20Change=20'README.md'?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- README.md | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/README.md b/README.md index 85e528f2..68820c50 100644 --- a/README.md +++ b/README.md @@ -4,9 +4,7 @@ Autoresearch is an experiment in autonomous AI research. An AI agent iteratively modifies a simple GPT training script, runs 5-minute experiments, and keeps improvements that lower validation loss. You wake up to a log of experiments and (hopefully) a better model.
​ -## ⁠![teaser](progress.png)
​ - -*One day, frontier AI research used to be done by meat computers in between eating, sleeping, having other fun, and synchronizing once in a while using sound wave interconnect in the ritual of "group meeting". That era is long gone. Research is now entirely the domain of autonomous swarms of AI agents running across compute cluster megastructures in the skies. The agents claim that we are now in the 10,205th generation of the code base, in any case no one could tell if that's right or wrong as the "code" is now a self-modifying binary that has grown beyond human comprehension. This repo is the story of how it all began. -@karpathy, March 2026*. +⁠![teaser](progress.png)*One day, frontier AI research used to be done by meat computers in between eating, sleeping, having other fun, and synchronizing once in a while using sound wave interconnect in the ritual of "group meeting". That era is long gone. Research is now entirely the domain of autonomous swarms of AI agents running across compute cluster megastructures in the skies. The agents claim that we are now in the 10,205th generation of the code base, in any case no one could tell if that's right or wrong as the "code" is now a self-modifying binary that has grown beyond human comprehension. This repo is the story of how it all began. -@karpathy, March 2026*. The idea: give an AI agent a small but real LLM training setup and let it experiment autonomously overnight. It modifies the code, trains for 5 minutes, checks if the result improved, keeps or discards, and repeats. You wake up in the morning to a log of experiments and (hopefully) a better model. The training code here is a simplified single-GPU implementation of [nanochat](https://github.com/karpathy/nanochat). The core idea is that you're not touching any of the Python files like you normally would as a researcher. Instead, you are programming the `program.md` Markdown files that provide context to the AI agents and set up your autonomous research org. 
The default `program.md` in this repo is intentionally kept as a bare bones baseline, though it's obvious how one would iterate on it over time to find the "research org code" that achieves the fastest research progress, how you'd add more agents to the mix, etc. A bit more context on this project is here in this [tweet](https://x.com/karpathy/status/2029701092347630069). From 73ea6230d18565357d0d1632f05edc3a68250f1f Mon Sep 17 00:00:00 2001 From: get-colibri Date: Mon, 9 Mar 2026 10:49:54 +0000 Subject: [PATCH 15/15] =?UTF-8?q?=F0=9F=92=BE=20Change=20'README.md'?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- README.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 68820c50..e285728d 100644 --- a/README.md +++ b/README.md @@ -4,7 +4,8 @@ Autoresearch is an experiment in autonomous AI research. An AI agent iteratively modifies a simple GPT training script, runs 5-minute experiments, and keeps improvements that lower validation loss. You wake up to a log of experiments and (hopefully) a better model.
​ -⁠![teaser](progress.png)*One day, frontier AI research used to be done by meat computers in between eating, sleeping, having other fun, and synchronizing once in a while using sound wave interconnect in the ritual of "group meeting". That era is long gone. Research is now entirely the domain of autonomous swarms of AI agents running across compute cluster megastructures in the skies. The agents claim that we are now in the 10,205th generation of the code base, in any case no one could tell if that's right or wrong as the "code" is now a self-modifying binary that has grown beyond human comprehension. This repo is the story of how it all began. -@karpathy, March 2026*. +⁠![teaser](progress.png) +*One day, frontier AI research used to be done by meat computers in between eating, sleeping, having other fun, and synchronizing once in a while using sound wave interconnect in the ritual of "group meeting". That era is long gone. Research is now entirely the domain of autonomous swarms of AI agents running across compute cluster megastructures in the skies. The agents claim that we are now in the 10,205th generation of the code base, in any case no one could tell if that's right or wrong as the "code" is now a self-modifying binary that has grown beyond human comprehension. This repo is the story of how it all began. -@karpathy, March 2026*. The idea: give an AI agent a small but real LLM training setup and let it experiment autonomously overnight. It modifies the code, trains for 5 minutes, checks if the result improved, keeps or discards, and repeats. You wake up in the morning to a log of experiments and (hopefully) a better model. The training code here is a simplified single-GPU implementation of [nanochat](https://github.com/karpathy/nanochat). The core idea is that you're not touching any of the Python files like you normally would as a researcher. 
Instead, you are programming the `program.md` Markdown files that provide context to the AI agents and set up your autonomous research org. The default `program.md` in this repo is intentionally kept as a bare bones baseline, though it's obvious how one would iterate on it over time to find the "research org code" that achieves the fastest research progress, how you'd add more agents to the mix, etc. A bit more context on this project is here in this [tweet](https://x.com/karpathy/status/2029701092347630069).
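The README edited throughout this series measures progress in **val_bpb** (validation bits per byte), which it notes is vocab-size-independent: cross-entropy loss per *token* depends on the tokenizer, but normalizing total bits by raw *byte* count does not. A minimal sketch of that conversion (the function name and signature are illustrative, not from the repo; the repo's `prepare.py` presumably tracks the actual token/byte counts):

```python
import math

def bits_per_byte(mean_loss_nats: float, total_tokens: int, total_bytes: int) -> float:
    """Convert mean cross-entropy loss (in nats per token) to bits per byte.

    Dividing by the raw byte count of the validation text makes the metric
    comparable across tokenizers with different vocab sizes: a tokenizer with
    bigger tokens has higher loss per token but fewer tokens per byte.
    """
    total_bits = mean_loss_nats * total_tokens / math.log(2)  # nats -> bits
    return total_bits / total_bytes

# With ~4 bytes per token on average, a loss of 1.0 nat/token gives
# roughly 0.36 bits per byte: bits_per_byte(1.0, 1000, 4000)
```

This is why the README can claim architectural changes are "fairly compared": a model that switches tokenizers changes both its per-token loss and its tokens-per-byte ratio, and bits per byte folds both into one number.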