Source code for the Nequix foundation model and Phonon Fine-Tuning (PFT).
```bash
pip install nequix
```
or for Torch:
```bash
pip install nequix[torch]
```
Using `nequix.calculator.NequixCalculator`, you can perform calculations in ASE with a pre-trained Nequix model:
```python
from nequix.calculator import NequixCalculator

atoms = ...
atoms.calc = NequixCalculator("nequix-mp-1", backend="jax")
```
or if you want to use the faster PyTorch + kernels backend:
```python
...
atoms.calc = NequixCalculator("nequix-mp-1", backend="torch")
...
```

Arguments:

- `model_name` (str, default `"nequix-mp-1"`): Pretrained model alias to load or download.
- `model_path` (str | Path, optional): Path to local checkpoint; overrides `model_name`.
- `backend` (`{"jax", "torch"}`, default `"jax"`): Compute backend.
- `capacity_multiplier` (float, default 1.1): JAX-only; padding factor to limit recompiles.
- `use_compile` (bool, default True): Torch-only; on GPU, use `torch.compile()`.
- `use_kernel` (bool, default True): Torch-only; on GPU, use OpenEquivariance kernels.
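As a worked sketch (the silicon structure and the property calls are plain ASE; only the calculator arguments come from the list above, with their default values shown):

```python
from ase.build import bulk
from nequix.calculator import NequixCalculator

# Any ASE Atoms object works; bulk silicon is just a convenient test case.
atoms = bulk("Si", "diamond", a=5.43)

# Attach the pre-trained model; backend and capacity_multiplier are the
# arguments documented above.
atoms.calc = NequixCalculator("nequix-mp-1", backend="jax", capacity_multiplier=1.1)

# Standard ASE property calls dispatch to the attached calculator.
energy = atoms.get_potential_energy()
forces = atoms.get_forces()
print(energy, forces.shape)
```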
Models are trained with the nequix_train command using a single .yml
configuration file:
```bash
nequix_train <config>.yml
```
or for Torch:
```bash
# Single GPU
uv sync --extra torch
uv run nequix/torch/train.py <config>.yml

# Multi-GPU
uv run torchrun --nproc_per_node=<gpus> nequix/torch/train.py <config>.yml
```
To reproduce the training of Nequix-MP-1, first clone the repo and sync the environment:
```bash
git clone https://github.com/atomicarchitects/nequix.git
cd nequix
uv sync
```
Then download the MPtrj data from https://figshare.com/files/43302033 into `data/` and run the following to extract it:
```bash
bash data/download_mptrj.sh
```
Preprocess the data into .aselmdb files:
```bash
uv run scripts/preprocess_data.py data/mptrj-gga-ggapu data/mptrj-aselmdb
```
Then start the training run:
```bash
nequix_train configs/nequix-mp-1.yml
```
This will take less than 125 hours on a single 4 x A100 node (<25 hours using the Torch + kernels backend). The `batch_size` in the config is per-device, so you should be able to run this on any number of GPUs (although hyperparameters such as the learning rate are often sensitive to the global batch size, so keep this in mind).
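For illustration only (the batch size and learning rate below are hypothetical, not the values in configs/nequix-mp-1.yml), the common linear-scaling heuristic for adjusting the learning rate with GPU count looks like this:

```python
# Hypothetical reference values -- check configs/nequix-mp-1.yml for the real ones.
per_device_batch = 32
reference_gpus, reference_lr = 4, 1e-3

new_gpus = 8
global_batch_ref = per_device_batch * reference_gpus  # 128
global_batch_new = per_device_batch * new_gpus        # 256

# Linear scaling: grow the learning rate in proportion to the global batch size.
scaled_lr = reference_lr * (global_batch_new / global_batch_ref)  # 2e-3
print(global_batch_new, scaled_lr)
```

Treat this as a starting point rather than a rule; re-validating on a short run is the safer option.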
First sync the extra dependencies with:
```bash
uv sync --extra pft
```
We provide pretrained model weights for the co-trained (better alignment with MPtrj) and non-co-trained models in `models/nequix-mp-1-pft.nqx` and `nequix-mp-1-pft-nocotrain.nqx` respectively. See nequix-examples for examples of how to use these models for phonon calculations with both finite displacements and analytical Hessians.
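As a rough finite-displacement sketch using ASE's generic `ase.phonons.Phonons` helper (this is not necessarily the workflow in nequix-examples; the structure, supercell, and displacement are illustrative, and loading the .nqx checkpoint via `model_path` follows the argument list above):

```python
from ase.build import bulk
from ase.phonons import Phonons
from nequix.calculator import NequixCalculator

# Load the fine-tuned checkpoint from a local path (overrides model_name).
calc = NequixCalculator(model_path="models/nequix-mp-1-pft.nqx")

atoms = bulk("Si", "diamond", a=5.43)  # illustrative structure

# Finite displacements: ASE displaces each atom in a supercell and collects
# forces from the attached calculator to build the force-constant matrix.
ph = Phonons(atoms, calc, supercell=(3, 3, 3), delta=0.03)
ph.run()
ph.read(acoustic=True)
ph.clean()

# Phonon band structure along a standard FCC path.
path = atoms.cell.bandpath("GXWKGL", npoints=100)
bands = ph.get_band_structure(path)
```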
Data for the PBE MDR phonon database was originally downloaded and preprocessed with:
```bash
bash data/download_pbe_mdr.sh
uv run data/split_pbe_mdr.py
uv run scripts/preprocess_data_phonopy.py data/pbe-mdr/train data/pbe-mdr/train-aselmdb
uv run scripts/preprocess_data_phonopy.py data/pbe-mdr/val data/pbe-mdr/val-aselmdb
```
However, we provide preprocessed data that can be downloaded with:
```bash
bash data/download_pbe_mdr_preprocessed.sh
```
To run PFT without co-training:
```bash
uv run nequix/pft/train.py configs/nequix-mp-1-pft-no-cotrain.yml
```
To run PFT with co-training (note this requires the preprocessed mptrj-aselmdb data):
```bash
uv run nequix/pft/train.py configs/nequix-mp-1-pft.yml
```
If you use this code, please cite:
```bibtex
@article{koker2026pft,
title={{PFT}: Phonon Fine-tuning for Machine Learned Interatomic Potentials},
author={Koker, Teddy and Gangan, Abhijeet and Kotak, Mit and Marian, Jaime and Smidt, Tess},
journal={arXiv preprint arXiv:2601.07742},
year={2026}
}
@article{koker2025training,
title={Training a foundation model for materials on a budget},
author={Koker, Teddy and Kotak, Mit and Smidt, Tess},
journal={arXiv preprint arXiv:2508.16067},
year={2025}
}
```