[Catalog]: DATASET & Determine how to store catalog - externally versioned repo? #746
Description
Initial discussion
- Games catalog could live outside gambit repo in a separate repo that generates the dataset
- Should have its own versioning
- Could be a submodule in Gambit
- Stick data releases on Zenodo with a DOI, Harvard Dataverse, or HuggingFace
- Data should output in Croissant format, or be convertible to it via an external utility script
- Gambit catalog could pull games from a pinned version of the catalog, e.g. v1.1.x, with the smallest (patch) version reserved for typo fixes
- Could be that the catalog does not ship with Gambit: there is a download function which caches it, and the cached dataset is what gets versioned
- Perhaps this is an issue for Gambit 17; that version would use published datasets
- Ultimately all games will be files and not code
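The "download function which caches it" idea above could be sketched roughly as follows. This is stdlib-only and entirely illustrative: the URL, archive name, and cache directory are placeholder assumptions, not decisions.

```python
# Rough sketch of a "download once, cache locally" catalog fetcher.
# The cache location and URL layout are placeholders, not a real design.
import hashlib
import urllib.request
from pathlib import Path

def cache_path(url: str, cache_dir: str = "~/.cache/gambit-catalog") -> Path:
    """Deterministic local path for a given catalog release URL."""
    digest = hashlib.sha256(url.encode()).hexdigest()[:16]
    name = url.rstrip("/").rsplit("/", 1)[-1] or "catalog"
    return Path(cache_dir).expanduser() / f"{digest}-{name}"

def fetch_catalog(url: str, cache_dir: str = "~/.cache/gambit-catalog") -> Path:
    """Download the catalog archive on first use; reuse the cached copy after."""
    dest = cache_path(url, cache_dir)
    if not dest.exists():
        dest.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(url, dest)  # network hit only on first call
    return dest
```

Pinning to a release tag (e.g. `v1.1.x`) would then just mean baking the tag into the URL, so each Gambit version knows exactly which catalog snapshot it consumes.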
NeurIPS Evaluation and Benchmarks
NeurIPS Evaluation and Benchmarks might have a specific platform to use:
- https://blog.neurips.cc/2026/03/23/introducing-the-evaluations-datasets-track-at-neurips-2026/
- https://neurips.cc/Conferences/2026/CallForEvaluationsDatasets
...authors should clearly explain what claims the dataset is intended to support (e.g., improved model performance, fairness, robustness, safety, or other model characteristics), under what assumptions those claims are valid, and what limitations constrain them.
data-centric and benchmarking submissions historically welcomed by the track remain fully in scope. These include, but are not limited to: New datasets and dataset collections
We strongly encourage all authors to release code whenever feasible to promote transparency and reproducibility. However, code release is required at submission when the primary contribution is a reusable executable artifact, such as a benchmark suite, evaluation environment, data generator, or software tool, whose functionality must be inspected in order to evaluate the scientific claims.
Ed's thoughts on where/how to host the catalog
- I think we should use HuggingFace to maximise visibility to ML community, no reason we can't also put on Harvard Dataverse for academic visibility
- HuggingFace will be a good place from which to consume the data, if the Gambit catalog module pulls from an external repo: https://huggingface.co/datasets
- Hosting: HuggingFace has 928K datasets, Harvard has 295K (and 8K "dataverses")
- Seems pretty easy to use the croissant package to generate metadata: https://github.com/mlcommons/croissant although there is a specific way they want us to generate it:
The dataset hosting process as part of submitting to the Evaluations and Datasets Track involves:
- Choosing among 4 options to host your dataset: Harvard Dataverse, Kaggle, Hugging Face, and OpenML
- Using platform tooling to download the automatically generated Croissant file
- Completing the Croissant file with Responsible AI (RAI) metadata. We aim to provide additional tooling for this
- Including a URL to your dataset and uploading the generated Croissant file in OpenReview
- If your submission is accepted: making your dataset public by the camera ready deadline
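For orientation, a minimal hand-written Croissant record might look like the sketch below. All field values are placeholders; per the process above, the hosting platform generates this JSON-LD automatically and we would mainly be completing the RAI fields, so this is only to show the rough shape of the format.

```python
# Illustrative sketch of minimal Croissant (JSON-LD) metadata for the
# games catalog. Names, URLs, and versions here are placeholders.
import json

metadata = {
    "@context": {
        "@vocab": "https://schema.org/",
        "cr": "http://mlcommons.org/croissant/",
    },
    "@type": "Dataset",
    "conformsTo": "http://mlcommons.org/croissant/1.0",
    "name": "gambit-games-catalog",  # placeholder dataset name
    "description": "Catalog of games in EFG/NFG format.",
    "version": "1.1.0",  # would track the catalog release tag
    "distribution": [
        {
            "@type": "cr:FileObject",
            "@id": "catalog-archive",
            "name": "catalog-archive",
            "contentUrl": "https://example.org/catalog-v1.1.0.tar.gz",  # placeholder
            "encodingFormat": "application/x-tar",
        }
    ],
}

print(json.dumps(metadata, indent=2))
```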
Questions
- What is the format of our data? Just the set of EFG/NFG files?
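If the answer is "just EFG/NFG files", each catalog entry would be a plain-text game file in Gambit's existing formats. From memory of the payoff version of the `.nfg` format (header, then payoffs with player 1's strategy varying fastest; worth checking against the Gambit manual before relying on it), a two-player entry might look like:

```
NFG 1 R "Prisoner's dilemma (illustrative)" { "Player 1" "Player 2" } { 2 2 }

3 3 4 0 0 4 1 1
```

If so, the dataset itself is trivially portable, and the Croissant metadata would describe a collection of such files rather than a tabular record set.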