GitHub - google/ARC-GEN: A Mimetic Procedural Benchmark Generator for the Abstraction and Reasoning Corpus

This repository contains the source code for ARC-GEN, a mimetic procedural benchmark generator for the Abstraction and Reasoning Corpus.

For a more in-depth description of this work, see the corresponding paper on arXiv (https://arxiv.org/abs/2511.00162).

Installation

$ git clone --recurse-submodules https://github.com/google/ARC-GEN.git && cd ARC-GEN

Usage

For benchmark generation, use the generate command with two arguments: the task number and the desired number of example pairs.

$ python3 arc_gen.py generate 32 1000
[{'input': [[4, 0, 0, 0], [0, 0, 0, 0], [4, 0, 8, 0], [0, 3, 8, 0]], 'output': ...
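
The printed result above looks like a Python literal: a list of dictionaries, each holding an 'input' grid and an 'output' grid. Assuming that format holds for the full output, the following minimal sketch shows one way to load and sanity-check it. The names `parse_examples` and `grid_shape` are hypothetical helpers, not part of ARC-GEN, and the output grid in `sample_stdout` is an invented placeholder, since the real output is elided above.

```python
import ast

# Hypothetical stand-in for one captured line of generator output.
# The 'input' grid is copied from the README; the 'output' grid is an
# invented placeholder, because the real one is elided in the README.
sample_stdout = (
    "[{'input': [[4, 0, 0, 0], [0, 0, 0, 0], [4, 0, 8, 0], [0, 3, 8, 0]], "
    "'output': [[0, 0], [0, 0]]}]"
)


def parse_examples(text):
    """Parse a Python-literal list of {'input': grid, 'output': grid} dicts.

    Assumes the text is a plain literal (single-quoted keys), which is why
    ast.literal_eval is used rather than json.loads.
    """
    examples = ast.literal_eval(text.strip())
    for example in examples:
        for key in ("input", "output"):
            grid = example[key]
            width = len(grid[0])
            # Every row of a grid should have the same number of cells.
            if any(len(row) != width for row in grid):
                raise ValueError(f"ragged {key} grid: {grid!r}")
    return examples


def grid_shape(grid):
    """Return (rows, columns) for a rectangular grid."""
    return len(grid), len(grid[0])


examples = parse_examples(sample_stdout)
print(len(examples), grid_shape(examples[0]["input"]))
```

In practice the text would come from redirecting the command's output to a file and reading it back, rather than from a hard-coded string.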

For validation (i.e., to ensure that the ARC-GEN generators can collectively reproduce the original ARC-AGI-1 benchmark suite), use the validate command:

$ python3 arc_gen.py validate
A total of 400 generators passed.
A total of 0 generators failed.

For an example of customized variations, refer to arc_gen_variations.py, which produces two variations on Task #125:

  _, generator, _ = task_list.task_list().get(125)
  examples = []
  # Two examples of a "large" variation on Task #125.
  examples.extend([generator(boxes=8, size=28) for _ in range(2)])
  # Two examples of a "large + inverted" variation on Task #125.
  common.set_colors([0, 1, 2, 6, 8, 5, 3, 7, 4, 9])
  examples.extend([generator(boxes=8, size=28) for _ in range(2)])
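
The list passed to common.set_colors above is a permutation of the ten color indices 0 through 9. How ARC-GEN applies it internally is not shown here, so the sketch below is only an illustration of the general idea of recoloring a grid under such a permutation. It assumes that entry i of the list is the color that replaces color i; `apply_color_permutation` is a hypothetical helper, not an ARC-GEN function.

```python
# Permutation taken from the snippet above. Under the assumption that
# entry i replaces color i, it swaps 3 <-> 6 and 4 <-> 8 and leaves
# every other color unchanged.
PERMUTATION = [0, 1, 2, 6, 8, 5, 3, 7, 4, 9]


def apply_color_permutation(grid, permutation):
    """Return a new grid with each cell c replaced by permutation[c].

    Hypothetical helper for illustration only; it does not call or
    reproduce common.set_colors.
    """
    if sorted(permutation) != list(range(len(permutation))):
        raise ValueError("not a permutation of 0..n-1")
    return [[permutation[cell] for cell in row] for row in grid]


# Input grid copied from the generate example earlier in this README.
grid = [[4, 0, 0, 0], [0, 0, 0, 0], [4, 0, 8, 0], [0, 3, 8, 0]]
recolored = apply_color_permutation(grid, PERMUTATION)
print(recolored)
```

Because this particular permutation only swaps pairs of colors, applying it a second time restores the original grid.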

The ARC-GEN-100K Dataset

For those seeking a pre-generated dataset of sample pairs, the link below provides a static benchmark suite containing 100,000 examples produced by ARC-GEN (covering all 400 tasks):

https://www.kaggle.com/datasets/arcgen100k/the-arc-gen-100k-dataset

How to Cite?

@misc{Moffitt2025,
  title={ARC-GEN: A Mimetic Procedural Benchmark Generator for the Abstraction and Reasoning Corpus}, 
  author={Michael D. Moffitt},
  year={2025},
  eprint={2511.00162},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2511.00162}, 
}

Other Resources

RE-ARC: Reverse-Engineering the Abstraction and Reasoning Corpus by Michael Hodel
Bootstrapping ARC: Synthetic Problem Generation for ARC Visual Reasoning Tasks by Wen-Ding Li and others
