[Catalog]: DATASET & Determine how to store catalog - externally versioned repo? #746
Description
Initial discussion
- Games catalog could live outside gambit repo in a separate repo that generates the dataset
- Should have its own versioning
- Could be a submodule in Gambit
- Stick data releases on Zenodo with a DOI, Harvard Dataverse, or HuggingFace
- Data should output in Croissant format, or be convertible to it via an external utility script
- Gambit catalog could pull games from a pinned version of the catalog, e.g. v1.1.x, with the smallest (patch) version reserved for typo fixes
- Could be that the catalog does not ship with Gambit: there is a download function which caches it, and the cached dataset is what gets versioned
- Perhaps this is an issue for Gambit 17; that version would use published datasets
- Ultimately all games will be files and not code
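The "download function which caches it" idea above could be sketched roughly as follows. This is stdlib-only and entirely illustrative: the URL, archive name, and cache directory are placeholder assumptions, not decisions.

```python
# Rough sketch of a "download once, cache locally" catalog fetcher.
# The cache location and URL layout are placeholders, not a real design.
import hashlib
import urllib.request
from pathlib import Path

def cache_path(url: str, cache_dir: str = "~/.cache/gambit-catalog") -> Path:
    """Deterministic local path for a given catalog release URL."""
    digest = hashlib.sha256(url.encode()).hexdigest()[:16]
    name = url.rstrip("/").rsplit("/", 1)[-1] or "catalog"
    return Path(cache_dir).expanduser() / f"{digest}-{name}"

def fetch_catalog(url: str, cache_dir: str = "~/.cache/gambit-catalog") -> Path:
    """Download the catalog archive on first use; reuse the cached copy after."""
    dest = cache_path(url, cache_dir)
    if not dest.exists():
        dest.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(url, dest)  # network hit only on first call
    return dest
```

Pinning to a release tag (e.g. `v1.1.x`) would then just mean baking the tag into the URL, so each Gambit version knows exactly which catalog snapshot it consumes.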
NeurIPS Evaluation and Benchmarks
NeurIPS Evaluation and Benchmarks might have a specific platform to use:
- https://blog.neurips.cc/2026/03/23/introducing-the-evaluations-datasets-track-at-neurips-2026/
- https://neurips.cc/Conferences/2026/CallForEvaluationsDatasets
...authors should clearly explain what claims the dataset is intended to support (e.g., improved model performance, fairness, robustness, safety, or other model characteristics), under what assumptions those claims are valid, and what limitations constrain them.
data-centric and benchmarking submissions historically welcomed by the track remain fully in scope. These include, but are not limited to: New datasets and dataset collections
We strongly encourage all authors to release code whenever feasible to promote transparency and reproducibility. However, code release is required at submission when the primary contribution is a reusable executable artifact, such as a benchmark suite, evaluation environment, data generator, or software tool, whose functionality must be inspected in order to evaluate the scientific claims.
Ed's thoughts on where/how to host the catalog
- I think we should use HuggingFace to maximise visibility to ML community, no reason we can't also put on Harvard Dataverse for academic visibility
- HuggingFace will be a good place from which to consume the data, if the Gambit catalog module pulls from an external repo: https://huggingface.co/datasets
- Hosting: HuggingFace has 928K datasets, Harvard has 295K (and 8K "dataverses")
- Seems pretty easy to use the croissant package to generate metadata: https://github.com/mlcommons/croissant although there is a specific way they want us to generate it:
The dataset hosting process as part of submitting to the Evaluations and Datasets Track involves:
- Choosing among 4 options to host your dataset: Harvard Dataverse, Kaggle, Hugging Face, and OpenML
- Using platform tooling to download the automatically generated Croissant file
- Completing the Croissant file with Responsible AI (RAI) metadata. We aim to provide additional tooling for this
- Including a URL to your dataset and uploading the generated Croissant file in OpenReview
- If your submission is accepted: making your dataset public by the camera ready deadline
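For orientation, a minimal hand-written Croissant record might look like the sketch below. All field values are placeholders; per the process above, the hosting platform generates this JSON-LD automatically and we would mainly be completing the RAI fields, so this is only to show the rough shape of the format.

```python
# Illustrative sketch of minimal Croissant (JSON-LD) metadata for the
# games catalog. Names, URLs, and versions here are placeholders.
import json

metadata = {
    "@context": {
        "@vocab": "https://schema.org/",
        "cr": "http://mlcommons.org/croissant/",
    },
    "@type": "Dataset",
    "conformsTo": "http://mlcommons.org/croissant/1.0",
    "name": "gambit-games-catalog",  # placeholder dataset name
    "description": "Catalog of games in EFG/NFG format.",
    "version": "1.1.0",  # would track the catalog release tag
    "distribution": [
        {
            "@type": "cr:FileObject",
            "@id": "catalog-archive",
            "name": "catalog-archive",
            "contentUrl": "https://example.org/catalog-v1.1.0.tar.gz",  # placeholder
            "encodingFormat": "application/x-tar",
        }
    ],
}

print(json.dumps(metadata, indent=2))
```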
Questions
- What is the format of our data? Just the set of EFG/NFG files?
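If the answer is "just EFG/NFG files", each catalog entry would be a plain-text game file in Gambit's existing formats. From memory of the payoff version of the `.nfg` format (header, then payoffs with player 1's strategy varying fastest; worth checking against the Gambit manual before relying on it), a two-player entry might look like:

```
NFG 1 R "Prisoner's dilemma (illustrative)" { "Player 1" "Player 2" } { 2 2 }

3 3 4 0 0 4 1 1
```

If so, the dataset itself is trivially portable, and the Croissant metadata would describe a collection of such files rather than a tabular record set.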