This repo is a very basic example of fine-tuning an MLIP and testing it against a reaction barrier prediction task.
This was a heavily vibe-coded proof of concept. Almost all of the code in here was written by Claude 4.5, and I did not clean it up much afterwards.
Possible next steps:
- Build a bigger fine-tuning set (my filtered dataset had only ~100 reactions)
- Run on bigger MACE variants, not just small
- Do multi-head replay fine-tuning instead of this naive approach
- Check whether the fine-tuned model regressed on other tasks
- Try fine-tuning other model architectures
Step 1: Download part of the OMOL25 dataset locally, specifically the validation set val.tar.gz (~20GB). See DATASET.md in their Hugging Face repo for more details.
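The download-and-extract step can be sketched as below. To stay runnable without the ~20GB archive, the demo builds a tiny stand-in .tar.gz and extracts it the same way; for the real thing you would point `extract_archive` at the downloaded val.tar.gz (the actual fetch path is in DATASET.md on the Hugging Face repo, not reproduced here).

```python
import tarfile
from pathlib import Path

def extract_archive(archive_path: str, dest_dir: str) -> list[str]:
    """Extract a .tar.gz archive and return the extracted member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        members = tar.getnames()
        tar.extractall(dest)  # for val.tar.gz this needs ~20GB of free disk
    return members

# Stand-in for val.tar.gz: a one-file archive created on the fly.
demo_dir = Path("demo_omol")
demo_dir.mkdir(exist_ok=True)
(demo_dir / "sample.txt").write_text("placeholder record\n")
with tarfile.open("demo_val.tar.gz", "w:gz") as tar:
    tar.add(demo_dir / "sample.txt", arcname="sample.txt")

print(extract_archive("demo_val.tar.gz", "extracted"))
```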
Step 2: Extract data for fine-tuning: pull out a subset of the big dataset that is useful specifically for examining reactions with transition metals. I did this locally. This is in extract_reactions_from_omol.py.
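The core filtering idea in extract_reactions_from_omol.py looks roughly like this. The record format below (reactions as dicts holding lists of element symbols) is an assumption for illustration, not the script's actual data layout.

```python
# d-block elements, periods 4-6 (Sc-Zn, Y-Cd, La + Hf-Hg).
TRANSITION_METALS = {
    "Sc", "Ti", "V", "Cr", "Mn", "Fe", "Co", "Ni", "Cu", "Zn",
    "Y", "Zr", "Nb", "Mo", "Tc", "Ru", "Rh", "Pd", "Ag", "Cd",
    "La", "Hf", "Ta", "W", "Re", "Os", "Ir", "Pt", "Au", "Hg",
}

def has_transition_metal(symbols) -> bool:
    """True if any element symbol in the structure is a transition metal."""
    return any(s in TRANSITION_METALS for s in symbols)

def filter_reactions(reactions) -> list:
    """Keep reactions where every structure contains a transition metal."""
    return [
        rxn for rxn in reactions
        if all(has_transition_metal(struct) for struct in rxn["structures"])
    ]

# Hypothetical records: rxn-1 is an Fe-centered reaction, rxn-2 is organic-only.
reactions = [
    {"id": "rxn-1", "structures": [["Fe", "C", "O"], ["Fe", "C", "O"]]},
    {"id": "rxn-2", "structures": [["C", "H", "O"], ["C", "H", "O"]]},
]
print([r["id"] for r in filter_reactions(reactions)])  # → ['rxn-1']
```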
Step 3: Fine-tune MACE on the data. I did this on Modal. This is in finetune_mace_on_modal.py.
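The shape of the training invocation that finetune_mace_on_modal.py wraps is sketched below. Flag names follow the mace-torch `mace_run_train` CLI but may differ by version, the file paths and hyperparameters are placeholders, and the Modal wiring is only described in comments.

```python
def build_finetune_command(train_file: str, name: str) -> list[str]:
    """Assemble a mace_run_train call that starts from the small
    foundation model (naive fine-tuning, no multi-head replay)."""
    return [
        "mace_run_train",
        "--name", name,
        "--foundation_model", "small",  # start from MACE-small weights
        "--train_file", train_file,
        "--valid_fraction", "0.1",
        "--max_num_epochs", "50",
        "--batch_size", "4",
        "--lr", "1e-4",                 # low LR to limit forgetting
        "--default_dtype", "float64",
    ]

cmd = build_finetune_command("omol_tm_reactions.xyz", "mace_ft_demo")
print(" ".join(cmd))
# On Modal this command would run inside a GPU function, e.g. via
# subprocess.run(cmd, check=True) in an @app.function(gpu=...) body.
```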
Step 4: Evaluate the "plain" and fine-tuned small MACE variants side by side. I did this on Colab, and a copy of the notebook is in evaluation_on_rxn_barrier_prediction_task.ipynb.
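The side-by-side comparison in the notebook boils down to something like the sketch below: compute each reaction's barrier as E(transition state) minus E(reactant), then score each model by mean absolute error against the reference barriers. The energies here are made up for illustration, not results from the repo.

```python
def barrier(e_reactant: float, e_ts: float) -> float:
    """Reaction barrier: transition-state energy minus reactant energy."""
    return e_ts - e_reactant

def barrier_mae(predictions, references) -> float:
    """Mean absolute error between predicted and reference barriers (eV)."""
    errors = [abs(p - r) for p, r in zip(predictions, references)]
    return sum(errors) / len(errors)

# Hypothetical (reactant, TS) energies in eV for two reactions.
ref = [barrier(-10.0, -8.5), barrier(-20.0, -18.0)]        # [1.5, 2.0]
plain = [barrier(-10.0, -8.1), barrier(-20.0, -17.4)]      # [1.9, 2.6]
finetuned = [barrier(-10.0, -8.4), barrier(-20.0, -17.9)]  # [1.6, 2.1]

print(f"plain MAE:      {barrier_mae(plain, ref):.2f} eV")
print(f"fine-tuned MAE: {barrier_mae(finetuned, ref):.2f} eV")
```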
Plain model results
Fine-tuned model results

