Add Matbench Discovery benchmark #651
Codecov Report

```diff
@@            Coverage Diff             @@
##             main     #651       +/-   ##
===========================================
- Coverage   39.30%   28.81%   -10.49%
===========================================
  Files          30       35        +5
  Lines        1743     2513      +770
===========================================
+ Hits          685      724       +39
- Misses       1058     1789      +731
```

View full report in Codecov by Sentry.
Overview
This PR adds Garden's first benchmarking feature: the Matbench Discovery benchmark.
Discussion
Everything lives in a new `garden_ai.benchmarks` module. Here is an example script running the full Matbench Discovery benchmark on MACE (more in the `garden_ai/benchmarks/matbench_discovery/examples/` folder):
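(A minimal sketch of such a script, reconstructed from the description below; the `run_full_benchmark` task name, the import path, and the MACE settings are illustrative assumptions, while `MatbenchDiscovery`, `model_factory`, `model_packages`, and `.local()` are from this PR.)

```python
# Sketch of an example script; `run_full_benchmark` is an illustrative
# task name, and the import path is assumed from the module layout
# described in this PR.
from garden_ai.benchmarks.matbench_discovery import MatbenchDiscovery


def model_factory():
    """Build and return an ASE-calculator-compatible MACE model."""
    # Imported inside the factory: `mace-torch` is installed into the
    # run's venv via `model_packages`, not into the local environment.
    from mace.calculators import mace_mp

    return mace_mp(model="medium", device="cuda")


benchmark = MatbenchDiscovery()

# Run the full benchmark on this machine; groundhog creates a venv and
# installs `model_packages` there before calling the task.
result = benchmark.run_full_benchmark.local(
    model_factory=model_factory,
    model_packages=["mace-torch"],
)
```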
The benchmark tasks are implemented as `@hog.method()`s on the `MatbenchDiscovery` class. This makes it easy to run on remote sites through Globus Compute, or for someone to use the Garden SDK directly on a system they have access to via a `.local()` call. The downside is that the `tasks.py` file is pretty huge, since all of the logic for running the benchmark and calculating the metrics has to live in one file; the benefit is that we don't need to add `matbench_discovery` as a dependency of the Garden SDK, since groundhog will install it in the venv it creates to run the functions.

I implemented the full list of tasks defined by the `matbench_discovery.enums.Task` enum. So in theory, any model that can be used as an ASE calculator can run the benchmark, no matter what it was trained to do. I have only tested MACE, SevenNet, MatterSim, and EquiformerV2.
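Since the tasks only need an ASE calculator, trying a different model is just a different factory; the `model_factory`/`model_packages` pattern is explained in the next paragraph. A hypothetical SevenNet variant (the import path and model name here are assumptions, not taken from this PR):

```python
def sevennet_factory():
    # Assumed import path and constructor within the `sevenn` package;
    # treat this as a sketch.
    from sevenn.sevennet_calculator import SevenNetCalculator

    return SevenNetCalculator(model="7net-0")


result = benchmark.run_full_benchmark.local(
    model_factory=sevennet_factory,
    model_packages=["sevenn"],
)
```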
The general idea is that you pass in a `model_factory`, which is a function that builds and returns a model instance, and `model_packages`, which is the list of Python packages the model factory needs to run. The tasks call these inside the venv to set up the model for benchmarking. It currently only supports models that are pip-installable, but it shouldn't be too hard to pull down and instantiate a model from git (or from the future project formerly known as Graft).

Since it takes ~20 GPU-hours to run the full benchmark, I implemented a checkpoint/resume system that writes the calculated energies and the index of each processed structure to a JSON file in `~/.garden/benchmarks/` on the system running the benchmark. If you give a `.submit()`, `.remote()`, or `.local()` call a `checkpoint_path` kwarg, it will look there for an existing checkpoint file, figure out which structures have already been processed, and resume from there. We print the checkpoint path to stdout when the job starts, and also attach it to the future we get back from a `.submit()` call. So you can grab the checkpoint path like this:
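(A sketch: the PR says the path is attached to the future from `.submit()` but doesn't show the attribute name, so `checkpoint_path` on the future is an assumption, as is `run_full_benchmark`.)

```python
# Kick off a run and grab the checkpoint path off the returned future.
future = benchmark.run_full_benchmark.submit(
    model_factory=model_factory,
    model_packages=["mace-torch"],
)
print(future.checkpoint_path)  # a JSON file under ~/.garden/benchmarks/

# If the run dies partway through ~20 GPU-hours, resume from the same
# checkpoint file instead of starting over:
resumed = benchmark.run_full_benchmark.submit(
    model_factory=model_factory,
    model_packages=["mace-torch"],
    checkpoint_path=future.checkpoint_path,
)
```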
Metrics Calculations

I reuse the metric calculation functions matbench uses internally, but I had issues importing them directly from matbench, so I elected to copy the implementation into the `tasks.py` file. We reproduce the key metrics from the official matbench leaderboard and add some of our own more 'meta' metrics. Here is an example blob of metrics from a MACE run:

TODO: add some metrics when the running job finishes
Publishing results

Super users can publish results to the official Garden leaderboard using the `publish_benchmark_result` helper function. Regular users can call the function, but the backend will reject non-super users' requests.
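(A sketch: the PR names the `publish_benchmark_result` helper but not its import path or signature, so both are assumptions here.)

```python
from garden_ai.benchmarks import publish_benchmark_result

# Succeeds for super users; the backend rejects everyone else's requests.
publish_benchmark_result(result)
```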
Testing

Manually tested a bunch of times using the scripts in `garden_ai/benchmarks/matbench_discovery/examples/`.

Documentation
No documentation updates yet, but I will need to write up some tutorials to help users figure out how to run and debug it.
📚 Documentation preview 📚: https://garden-ai--651.org.readthedocs.build/en/651/