Implementation of TEMPO, accepted at NeurIPS 2025.
TEMPO: Temporal Multi-scale Autoregressive Generation of Protein Conformational Ensembles
Yaoyao Xu, Di Wang, Zihan Zhou, Tianshu Yu, Mingchen Chen
School of Data Science, The Chinese University of Hong Kong, Shenzhen & Changping Laboratory, Beijing
Understanding the dynamic behavior of proteins is critical to elucidating their functional mechanisms, yet generating realistic, temporally coherent trajectories of protein ensembles remains a significant challenge. TEMPO introduces a novel hierarchical autoregressive framework for modeling protein dynamics that leverages the intrinsic multi-scale organization of molecular motions.
Key ideas:
- Multi-scale SDE formulation: Protein dynamics are modeled as coupled stochastic differential equations at two temporal scales — slow collective motions and fast local fluctuations.
- Hierarchical generation: A low-resolution model captures major conformational transitions (20 ns intervals), while a high-resolution model fills in detailed local dynamics (1 ns intervals).
- Spatiotemporal encoder: An architecture combining Invariant Point Attention (IPA), multi-head attention, and GRU modules to capture both inter-residue interactions and frame-to-frame temporal dependencies.
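To make the two-timescale idea concrete, here is a minimal toy sketch of a slow/fast pair of stochastic differential equations integrated with Euler-Maruyama. It is not the formulation from the TEMPO paper: the drift terms, the scale separation `eps`, the noise levels, and the function name `simulate_two_scale` are all assumptions chosen for illustration. The slow variable stands in for a collective motion, and the fast variable fluctuates rapidly around it.

```python
import numpy as np


def simulate_two_scale(n_steps=2000, dt=0.01, eps=0.05,
                       sigma_slow=0.3, sigma_fast=0.3, seed=0):
    """Euler-Maruyama integration of a toy slow/fast SDE pair.

    Hypothetical equations (illustration only, not TEMPO's model):
        slow:  dx = -x dt                + sigma_slow dW1
        fast:  dy = -(y - x) / eps dt    + sigma_fast / sqrt(eps) dW2
    A small eps makes y relax toward x much faster than x itself evolves.
    """
    rng = np.random.default_rng(seed)
    x = np.empty(n_steps + 1)
    y = np.empty(n_steps + 1)
    x[0], y[0] = 1.0, 0.0
    sqrt_dt = np.sqrt(dt)
    for t in range(n_steps):
        # Independent Brownian increments for the two equations.
        dw1, dw2 = rng.standard_normal(2) * sqrt_dt
        x[t + 1] = x[t] - x[t] * dt + sigma_slow * dw1
        y[t + 1] = (y[t] - (y[t] - x[t]) / eps * dt
                    + sigma_fast / np.sqrt(eps) * dw2)
    return x, y


x, y = simulate_two_scale()
# The fast variable changes far more per step than the slow one.
print(np.std(np.diff(x)), np.std(np.diff(y)))
```

In this toy the coupling is one-way (the fast variable tracks the slow one). A hierarchical generator in the same spirit would first produce the slow path, then sample the fast detail conditioned on it.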
Python 3.9
# Core dependencies
pip install numpy==1.21.2 pandas==1.5.3
# PyTorch (CUDA 11.3)
pip install torch==1.12.1+cu113 -f https://download.pytorch.org/whl/torch_stable.html
# Training & structure processing
pip install pytorch_lightning==2.0.4 mdtraj==1.9.9 biopython==1.79
# Additional utilities
pip install wandb dm-tree einops torchdiffeq fair-esm
# Visualization
pip install matplotlib==3.7.2
Download the datasets from: https://huggingface.co/Camille1054/datasets
- mdCATH: 5,398 proteins in total, with 320 K trajectories (400 frames at 1 ns intervals); only 1,000 proteins are used for training.
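As a rough illustration of how a 400-frame, 1 ns trajectory relates to the two resolutions, the sketch below takes every 20th frame as a low-resolution anchor and cuts the 1 ns frames between consecutive anchors into high-resolution windows. The function name `split_scales`, the window convention (both endpoints included), and the placeholder coordinates are assumptions; the repository's actual data pipeline may differ. With mdtraj, a coordinate array of this shape is available as the `xyz` attribute of a loaded trajectory.

```python
import numpy as np


def split_scales(coords, coarse_stride=20):
    """Split a 1 ns trajectory into low-res anchors and high-res windows.

    coords: array of shape (n_frames, n_atoms, 3), one frame per ns.
    Returns the anchor frame indices, the anchor frames, and a list of
    windows, each spanning two consecutive anchors inclusive.
    """
    n_frames = coords.shape[0]
    anchor_idx = np.arange(0, n_frames, coarse_stride)
    low_res = coords[anchor_idx]
    windows = [coords[a:b + 1]
               for a, b in zip(anchor_idx[:-1], anchor_idx[1:])]
    return anchor_idx, low_res, windows


# Placeholder coordinates: 400 frames, 50 atoms (hypothetical size).
coords = np.zeros((400, 50, 3))
anchor_idx, low_res, windows = split_scales(coords)
print(low_res.shape)      # anchors at 0, 20, ..., 380 ns
print(len(windows), windows[0].shape)
```

Including both endpoints in each window gives the fine-scale segment its starting and ending anchor, which is one natural way to condition fill-in generation on the coarse frames.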
Train the low-resolution (slow-scale) and high-resolution (fast-scale) models separately:
# Train the low-resolution model (20 ns intervals, captures slow collective motions)
bash script/train_low.sh
# Train the high-resolution model (1 ns intervals, captures fast local fluctuations)
bash script/train_high.sh
Download our pretrained mdCATH checkpoints from: https://huggingface.co/Camille1054/TEMPO/tree/main
Generate protein dynamics trajectories using the pretrained models:
bash script/inference.sh
Compute evaluation metrics on generated trajectories:
# Run analysis (outputs a pickle file with all metrics)
python analysis.py \
--data_dir path_to_generated_xtc \
--num_workers 4 \
--output result.pkl \
--ca_only
# Print the results
python print_result.py result.pkl
@inproceedings{xu2025tempo,
title={TEMPO: Temporal Multi-scale Autoregressive Generation of Protein Conformational Ensembles},
author={Xu, Yaoyao and Wang, Di and Zhou, Zihan and Yu, Tianshu and Chen, Mingchen},
booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
year={2025}
}