- 1. Introduction
- 2. Installation
- 3. Quickstart: Examples
- 4. Training
- 5. Predicting
- 6. Foundation Models
- 7. Parallelization
- 8. LAMMPS Interface
- 9. Modules
- Troubleshooting
- Issues
- Maintainer
- References
IANN (InterAtomic Neural Network framework) is an equivariant interatomic neural network potential framework package for materials science and computational chemistry. It implements state-of-the-art graph neural network models for periodic and non-periodic systems, including FastPot, PaiNN, Nequip, MACE, and EquiformerV2, focusing on predicting energies and forces with high accuracy.
Key features:
- Easy to use and to switch models
- Multiple equivariant interatomic neural network models implementation
- High-accuracy energy and force predictions
- Distributed training on multiple GPUs and multiple server nodes
- Integration with ASE and LAMMPS for molecular dynamics simulations
- Customizable model architectures
Documentation is available at: https://iann.readthedocs.io
- Python 3.7+
- PyTorch 1.9+
# Clone the repository
git clone https://github.com/changzhiai/IANN.git
cd IANN
# Install with pip
pip install -e .
For GPU acceleration, make sure you have CUDA installed and PyTorch with CUDA support:
# Check if PyTorch is using CUDA
python -c "import torch; print(torch.cuda.is_available())"The quickest way to get started with IANN is to run the example script:
# Run the quickstart example
python examples/quickstart.py
This script demonstrates:
- Loading a dataset
- Creating and training a model
- Using the model for predictions
Check out the examples/ directory for more sample scripts and tutorials.
IANN works with ASE database (.db) or trajectory (.traj) files. Ensure your data contains atomic structures with energy and force labels.
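For example, a small labeled trajectory can be assembled with ASE; here the EMT calculator merely stands in for your reference method (e.g. DFT single points):
from ase.build import bulk
from ase.calculators.emt import EMT
from ase.io import write
# Build a few structures and attach energy/force labels via a calculator
images = []
for a in (3.55, 3.60, 3.65):
    atoms = bulk("Cu", "fcc", a=a)
    atoms.calc = EMT()                # stand-in for your reference calculator
    atoms.get_potential_energy()      # computes and stores the energy label
    atoms.get_forces()                # computes and stores the force labels
    images.append(atoms)
write("dataset.traj", images)         # labeled structures ready for trainer.train()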
Create train.py:
from iann.trainer import Trainer
# Define the Trainer
trainer = Trainer(
model="painn",
config={"device": "cpu",
'output_dir': 'output',
'output_log': 'output.log',
'output_model': 'model.pt',
},
distributed=False
)
# Run the training
trainer.train("dataset.traj")Available models for model:
- fastpot
- painn
- nequip
- mace
- equiformerV2
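Switching architectures only requires changing the model argument; a minimal sketch (the config values here are illustrative):
from iann.trainer import Trainer
# Same workflow as above; only the model name changes.
trainer = Trainer(
    model="mace",  # any of: fastpot, painn, nequip, mace, equiformerV2
    config={"device": "cuda", "output_dir": "output_mace"},
    distributed=False,
)
trainer.train("dataset.traj")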
Default configurations for config:
config = {
# parameters for model
"num_channels": 128, # number of channels in the model
"num_layers": 3, # number of layers in the model
"cutoff": 5.5, # cutoff radius
# parameters for trainer
"device": None, # override device, e.g. 'cpu' or 'cuda'
"val_ratio": 0.1, # validation ratio
"batch_size": 12, # batch size
"learning_rate": 0.0001, # initial learning rate
"forces_weight": 0.9, # weight for forces
"load_model": False, # load model from checkpoint
"max_steps": 1000000, # maximum number of steps
"max_epochs": None, # None if setup max_steps, otherwise max_epochs
"optimizer_type": "adam", # optimizer type: "adam", "sgd", "rmsprop", "adagrad", "adadelta", "adamax", "adamw"
"max_grad_norm": None, # gradient clipping norm
"log_interval": 2000, # log interval
"stop_patience": 200, # patience for early stopping
"scheduler_type": "LambdaLR", # scheduler type: "ReduceLROnPlateau", "LambdaLR", "CosineAnnealingLR", "CosineAnnealingWarmRestarts", "StepLR", "MultiStepLR", "ExponentialLR"
# parameters for data
"random_seed": 666, # random seed for reproducibility
"save_split": False, # save split file name
"load_split": False, # load split file name
"norm_data": False, # normalize data
"norm_per_atom": False, # normalize data per atom
# parameters for DDP (Parallelization)
"dist_timeout": 600, # timeout (seconds) for distributed operations
"master_port": 12356, # port for distributed operations
# parameters for output
"output_dir": "output", # output directory
"output_log": "output.log", # log file
"output_model": "model.pt", # model file
"log_input": False, # log your costomized input config
"debug": False, # debug mode
}
Note
There are more parameters for each model; please refer to the documentation or source code for details.
Training logs will be saved in the specified output directory. You can monitor:
- Energy and force prediction errors
- Training and validation losses
- Model checkpoints
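Saved checkpoints can also be used to restart training via the load_model option; a minimal sketch (the checkpoint path is illustrative):
from iann.trainer import Trainer
# Resume training from a checkpoint written by an earlier run.
trainer = Trainer(
    model="painn",
    config={
        "load_model": "output/model.pt",  # checkpoint from the earlier run (illustrative path)
        "output_dir": "output_restart",
        "device": "cuda",
    },
    distributed=False,
)
trainer.train("dataset.traj")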
from iann.calculators import MLCalculator
from ase.io import read
# Create calculator with model path
calc = MLCalculator("model.pt")
# Read structures
images = read("test_structures.traj", ":")
# Make predictions
for atoms in images:
atoms.calc = calc
energy = atoms.get_potential_energy()
forces = atoms.get_forces()
print(f"Energy: {energy} eV")
print(f"Forces: {forces} eV/Å")Tip
EnsembleCalculator and AtomicEnsembleCalculator are available to obtain uncertainties per structure and per atom, respectively.
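A minimal sketch of per-structure uncertainty with EnsembleCalculator, assuming its constructor accepts a list of trained model paths (check the calculators module for the exact signature and result keys):
from iann.calculators import EnsembleCalculator  # constructor signature assumed below
from ase.io import read
# Ensemble of independently trained models; the spread of their predictions
# serves as an uncertainty estimate for each structure.
calc = EnsembleCalculator(["model_1.pt", "model_2.pt", "model_3.pt"])
atoms = read("test_structures.traj", 0)
atoms.calc = calc
energy = atoms.get_potential_energy()  # ensemble-averaged energy
print(f"Energy: {energy} eV")
# The per-structure variance is expected in calc.results; the exact key name
# depends on the implementation.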
IANN provides pre-trained foundation models (painn) that you can use out-of-the-box or fine-tune for your specific tasks.
To use a foundation model for predictions:
from iann.foundations import foundation_model
from iann.calculators import MLCalculator
from ase.build import fcc100
calc = MLCalculator(
model_path=foundation_model("painn_oc.pt"), # foundation model trained on OC20+OC22
compute_forces=True,
device='cpu') # use 'cuda' for GPU
atoms = fcc100("Pt", size=(4,4,3), a=5.5, vacuum=15.0)
atoms.calc = calc
nnp_energy = atoms.get_potential_energy()
nnp_forces = atoms.get_forces()
print(f"NNP Energy: {nnp_energy:.4f} eV")
print(f"NNP Forces: {nnp_forces}")You can fine-tune a foundation model on your own data:
from iann.trainer import Trainer
from iann.foundations import foundation_model
trainer = Trainer(model="painn",
config={"num_channels": 128, # number of channels in the model
"num_layers": 3, # number of layers in the model
"cutoff": 5.5, # cutoff radius
"batch_size": 16, # batch size
"learning_rate": 0.0001, # initial learning rate
"forces_weight": 0.9, # weight for forces
"load_model": foundation_model("painn_oc.pt"), # load the foundation model
"max_steps": 10000000, # maximum number of steps
"random_seed": 888, # random seed for reproducibility
"val_ratio": 0.003, # validation ratio
"stop_patience": 500, # patience for early stopping
'device': 'cuda',
'output_dir': 'output',
'output_log': 'output.log',
'output_model': 'model.pt'},
distributed=False)
trainer.train("dataset.traj")IANN supports distributed training using PyTorch's Distributed Data Parallel (DDP).
Submit to multiple GPUs (in SLURM Workload Manager)
# Run on multiple GPUs and multiple nodes
#!/bin/bash
#SBATCH -N 2 # Number of nodes
#SBATCH -C gpu # Use GPU nodes
#SBATCH -q debug # Use regular/debug queue
#SBATCH -t 00:30:00 # Time limit
#SBATCH -A m2997 # Your account
#SBATCH --gpus-per-node=4 # GPUs per node
#SBATCH --ntasks-per-node=4 # Number of tasks per node
#SBATCH --cpus-per-task=1 # Number of CPUs per task
module load your_modules
export GPUS_PER_NODE=$SLURM_GPUS_ON_NODE
export NNODES=$SLURM_NNODES
srun -N $NNODES -n $((NNODES*GPUS_PER_NODE)) python train.py
Submit to multiple CPUs (in SLURM Workload Manager)
# Run on multiple CPUs and multiple nodes
#!/bin/bash
#SBATCH -N 2 # Number of nodes
#SBATCH -C cpu # Use CPU nodes
#SBATCH -q debug # Use regular/debug queue
#SBATCH -t 00:30:00 # Time limit
#SBATCH -A m2997 # Your account
#SBATCH --ntasks-per-node=1 # Number of tasks per node
#SBATCH --cpus-per-task=128 # Number of CPUs per task
module load your_modules
export NNODES=$SLURM_NNODES
export TASKS_PER_NODE=$SLURM_NTASKS_PER_NODE
srun -N $NNODES -n $((NNODES*TASKS_PER_NODE)) python train.py
A complete example GPU submission script (as used on a NERSC system) is shown below:
#!/bin/bash
#SBATCH -N 2 # Number of nodes
#SBATCH -C gpu # Use GPU nodes
#SBATCH -q debug # Use regular/debug queue
#SBATCH -t 00:20:00 # Time limit
#SBATCH -A m2997 # Your account
#SBATCH --gpus-per-node=4 # GPUs per node
#SBATCH --ntasks-per-node=4
#SBATCH --cpus-per-task=1
export PYTHONPATH=/pscratch/sd/c/changzhi/softwares/IANN_v2/IANN/:$PYTHONPATH
module purge
module load PrgEnv-nvidia; module load openmpi;
export GPUS_PER_NODE=$SLURM_GPUS_ON_NODE
export NNODES=$SLURM_NNODES
export FI_CXI_RDZV_GET_MIN=0 # workaround for vendor bug on NERSC for multi-node runs
export FI_CXI_SAFE_DEVMEM_COPY_THRESHOLD=16777216 # workaround for vendor bug on NERSC
srun -N $NNODES -n $((NNODES*GPUS_PER_NODE)) \
python train.py
Note
The parallelization parameters are obtained automatically from the SLURM environment variables.
- Use the largest batch size that fits in your GPU memory (see the memory check sketch after this list)
- Enable mixed precision training for faster performance
- Monitor GPU utilization to ensure efficient resource use
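To find the largest batch size that fits, it can help to watch PyTorch's memory counters during a short trial run; a minimal sketch using the standard torch.cuda API:
import torch
# Report peak GPU memory after a short trial run to guide the batch_size choice.
if torch.cuda.is_available():
    torch.cuda.reset_peak_memory_stats()
    # ... run a few training steps here ...
    peak_gb = torch.cuda.max_memory_allocated() / 1024**3
    total_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
    print(f"Peak GPU memory: {peak_gb:.2f} / {total_gb:.2f} GB")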
IANN models can be used as interatomic potentials in LAMMPS molecular dynamics simulations (GPU supported).
Warning
You must install the IANN plugins for LAMMPS before using IANN models with LAMMPS. Please see the LAMMPS interface section of the documentation.
First, you need a trained model in PyTorch format, which can be obtained by running the training script. Then convert the model to TorchScript format as follows:
from iann.plugins.converter import convert_model_for_lammps
convert_model_for_lammps(model_path='best_model.pt',
model_type='painn',
output_path='model_lmp.pt')
To run the LAMMPS simulation with IANN, you can use the following script:
# LAMMPS input script example
# Define the units and the atom style
units metal
atom_style atomic
# Define the boundary conditions
boundary p p p
# Read the initial structure
read_data initial.data
# Define the IANN pair style
pair_style iann painn model_lmp.pt 5.5
pair_coeff * *
# Define the mass of the atoms
mass 1 1.0079999997406976 # H
mass 2 195.08399994981576 # Pt
# Define the neighbor list
neighbor 0.5 bin
neigh_modify every 1 delay 0 check yes
# Thermodynamic settings
thermo 10
# Initial minimization to relax the system before dynamics
minimize 1.0e-4 1.0e-6 100 1000
# Define the timestep and the thermostat
timestep 0.001
fix 1 all nvt temp 300.0 300.0 0.1
# Define the dump frequency and the dump file
dump 1 all custom 10 dump.xyz id type x y z
# Run the simulation
run 5000
Note
Multi-GPU prediction (inference) is supported via the pair_style iann/multi_gpu command, which automatically detects the number of GPUs per node and uses them to run the model.
First, you need several trained models in PyTorch format, obtained from separate training runs. Then convert the models to TorchScript format as follows:
from iann.plugins.converter import convert_models_for_lammps
# Give a list of models
model_paths = ["model_1.pt", "model_2.pt"]
# Convert the models to a torchscript model
output_path = convert_models_for_lammps(
model_paths=model_paths,
model_type="painn", # if not specified, the model type will be inferred from the model file
output_path="model_ensemble_lmp.pt"
)
To run the ensemble LAMMPS simulation with IANN, you can use the following script:
# LAMMPS input script example
# Define the units and the atom style
units metal
atom_style atomic
# Define the boundary conditions
boundary p p p
# Read the initial structure
read_data initial.data
# Define the IANN pair style
pair_style iann painn model_ensemble_lmp.pt 5.5
pair_coeff * *
# Define the mass of the atoms
mass 1 1.0079999997406976 # H
mass 2 195.08399994981576 # Pt
# Define the neighbor list
neighbor 0.5 bin
neigh_modify every 1 delay 0 check yes
# Compute the variance mode of the energy and force of the ensemble model
compute variance all iann/variance
# Define the thermodynamic style
thermo_style custom step pe ke etotal temp press c_variance[1] c_variance[2] c_variance[3] c_variance[4]
# Define the thermodynamic modify
thermo_modify colname c_variance[1] energy_var
thermo_modify colname c_variance[2] force_var
thermo_modify colname c_variance[3] max_energy_var
thermo_modify colname c_variance[4] max_force_var
thermo_modify flush yes
# Thermodynamic settings
thermo 100
# Initial minimization to relax the system before dynamics
minimize 1.0e-4 1.0e-6 100 1000
# Define the timestep and the thermostat
timestep 0.001
fix 1 all nvt temp 300.0 300.0 0.1
# Define the dump frequency and the dump file
dump 1 all custom 10 dump.xyz id type x y z
# Run the simulation
run 5000
IANN is organized into several key modules:
Data handling utilities:
- AtomsData: Data object for each atoms object
- AseDataset: Dataset class for handling atomic structures
Contains neural network model implementations:
- FastPot: FastPot model implementation for energy and force prediction
- PaiNN: PaiNN model implementation for energy and force prediction
- Nequip: Nequip model implementation for energy and force prediction
- MACE: MACE model implementation for energy and force prediction
- EquiformerV2: EquiformerV2 model implementation for energy and force prediction
ASE calculator implementations:
- MLCalculator: ASE calculator interface for models
- EnsembleCalculator: ASE ensemble calculator interface for models
- AtomicEnsembleCalculator: ASE atomic ensemble calculator interface for models
Tools for converting models and LAMMPS integration:
- converter: Model conversion utilities for LAMMPS integration
- EnsembleLAMMPSModelWrapper: Wrapper class for adapting ensemble model inputs/outputs for LAMMPS
- LAMMPSModelWrapper: Wrapper class for adapting model inputs/outputs for LAMMPS
- convert_model_for_lammps: Function to convert a trained model to TorchScript format
- convert_models_for_lammps: Function to convert trained ensemble models to TorchScript format
C++ plugins for LAMMPS molecular dynamics simulations:
- PairIANN: Single-GPU pair style for IANN potentials
- PairIANNMultiGPU: Multi-GPU pair style for IANN potentials
- ComputeIANNVariance: Compute style for variance calculations
- Memory Issues: Reduce batch size or model size if you encounter out-of-memory (OOM) errors (see the config sketch after this list)
- Training Instability: Try reducing learning rate or using gradient clipping
- Poor Performance: Try increasing model capacity
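These remedies map directly onto config keys documented in the Training section; a sketch of adjusted settings (the values are illustrative):
# Illustrative config tweaks for the issues listed above.
config = {
    "batch_size": 6,           # smaller batches to avoid OOM errors
    "learning_rate": 0.00005,  # lower learning rate for unstable training
    "max_grad_norm": 10.0,     # enable gradient clipping
    "num_channels": 256,       # larger model capacity if accuracy is poor
}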
For questions, issues, and contributions, please use the GitHub issue tracker.
Maintainer: Dr. Changzhi Ai (changzhi@stanford.edu) at the SUNCAT Center, Stanford University and SLAC, supervised by Dr. Johannes Voss and Dr. Frank Abild-Pedersen.
[1] K. T. Schütt, et al. "Equivariant message passing for the prediction of tensorial properties and molecular spectra", arXiv:2102.03150 (2021). Link
[2] S. Batzner, et al. "E(3)-equivariant graph neural networks for data-efficient and accurate interatomic potentials", Nature Communications, 13, 2453 (2022). Link
[3] I. Batatia, et al. "MACE: Higher Order Equivariant Message Passing Neural Networks for Fast and Accurate Force Fields", arXiv:2206.07697 (2022). Link
[4] Y. L. Liao, et al. "EquiformerV2: Improved Equivariant Transformer for Scaling to Higher-Degree Representations", arXiv:2306.12059 (2023). Link
[5] X. Yang, et al. "CURATOR: Building Robust Machine Learning Potentials for Atomistic Simulations Autonomously with Batch Active Learning", ChemRxiv (2024). Link