LOGO-CUHKSZ/TEMPO

TEMPO: Temporal Multi-scale Autoregressive Generation of Protein Conformational Ensembles

NeurIPS 2025 | Python 3.9 | License: MIT

Implementation of TEMPO, accepted at NeurIPS 2025.

Yaoyao Xu, Di Wang, Zihan Zhou, Tianshu Yu, Mingchen Chen

School of Data Science, The Chinese University of Hong Kong, Shenzhen & Changping Laboratory, Beijing

Overview

Understanding the dynamic behavior of proteins is critical to elucidating their functional mechanisms, yet generating realistic, temporally coherent trajectories of protein ensembles remains a significant challenge. TEMPO introduces a novel hierarchical autoregressive framework for modeling protein dynamics that leverages the intrinsic multi-scale organization of molecular motions.

Key ideas:

  • Multi-scale SDE formulation: Protein dynamics are modeled as coupled stochastic differential equations at two temporal scales — slow collective motions and fast local fluctuations.
  • Hierarchical generation: A low-resolution model captures major conformational transitions (20 ns intervals), while a high-resolution model fills in detailed local dynamics (1 ns intervals).
  • Spatiotemporal encoder: An architecture combining Invariant Point Attention (IPA), multi-head attention, and GRU modules to capture both inter-residue interactions and frame-to-frame temporal dependencies.
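The two time scales in the list above can be made concrete with a small index calculation. The sketch below is not TEMPO's code. It only illustrates, assuming a 400-frame trajectory sampled every 1 ns (the mdCATH setting listed under Dataset) and coarse frames taken every 20 ns, which frame indices a low-resolution model would cover and which would be left for a high-resolution model to fill in. The function name `hierarchical_schedule` and its parameters are hypothetical.

```python
from typing import List, Tuple


def hierarchical_schedule(
    n_frames: int = 400, fine_dt_ns: int = 1, coarse_dt_ns: int = 20
) -> Tuple[List[int], List[List[int]]]:
    """Split frame indices into coarse anchors and the fine frames between them."""
    if coarse_dt_ns % fine_dt_ns != 0:
        raise ValueError("coarse_dt_ns must be a multiple of fine_dt_ns")
    stride = coarse_dt_ns // fine_dt_ns
    # Coarse anchors: one frame every `stride` fine steps (slow scale).
    coarse = list(range(0, n_frames, stride))
    # Fine frames: everything strictly between one anchor and the next (fast scale).
    fine = [list(range(start + 1, min(start + stride, n_frames))) for start in coarse]
    return coarse, fine


coarse, fine = hierarchical_schedule()
print(len(coarse), coarse[:3])    # 20 [0, 20, 40]
print(len(fine[0]), fine[0][:3])  # 19 [1, 2, 3]
```

Under these assumptions, 20 coarse anchors plus 19 fine frames per anchor account for all 400 frames. How TEMPO actually conditions the high-resolution model on the low-resolution output is defined by the repository code, not by this sketch.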

Installation

The code targets Python 3.9. Install the dependencies with pip:

# Core dependencies
pip install numpy==1.21.2 pandas==1.5.3

# PyTorch (CUDA 11.3)
pip install torch==1.12.1+cu113 -f https://download.pytorch.org/whl/torch_stable.html

# Training & structure processing
pip install pytorch_lightning==2.0.4 mdtraj==1.9.9 biopython==1.79

# Additional utilities
pip install wandb dm-tree einops torchdiffeq fair-esm

# Visualization
pip install matplotlib==3.7.2
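After installing, it can help to confirm that the pinned versions are the ones actually present in the environment. The following is a minimal sketch using only the standard library; the `PINNED` table is copied from the pip commands above, and the helper name `check_environment` is hypothetical, not part of the repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins copied from the pip commands above; unpinned packages are omitted.
PINNED = {
    "numpy": "1.21.2",
    "pandas": "1.5.3",
    "torch": "1.12.1+cu113",
    "pytorch-lightning": "2.0.4",
    "mdtraj": "1.9.9",
    "biopython": "1.79",
    "matplotlib": "3.7.2",
}


def check_environment(pinned=PINNED, get_version=version):
    """Return {package: (wanted, found, matches)}; found is None if not installed."""
    report = {}
    for name, wanted in pinned.items():
        try:
            found = get_version(name)
        except PackageNotFoundError:
            found = None
        report[name] = (wanted, found, found == wanted)
    return report


if __name__ == "__main__":
    for name, (wanted, found, ok) in check_environment().items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name:18s} wanted={wanted:14s} found={found} [{status}]")
```

A mismatch is not necessarily fatal, but the versions listed above are the only ones this README states.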

Dataset

Download the datasets from: https://huggingface.co/Camille1054/datasets

  • mdCATH: 5,398 proteins in total, with trajectories simulated at 320 K (400 frames at 1 ns intervals). Only 1,000 of these proteins are used for training.

Training

Train the low-resolution (slow-scale) and high-resolution (fast-scale) models separately:

# Train the low-resolution model (20 ns intervals, captures slow collective motions)
bash script/train_low.sh

# Train the high-resolution model (1 ns intervals, captures fast local fluctuations)
bash script/train_high.sh

Pretrained Checkpoints

Download our pretrained mdCATH checkpoints from: https://huggingface.co/Camille1054/TEMPO/tree/main
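The checkpoints can also be fetched programmatically. This is a sketch under assumptions: it relies on the `huggingface_hub` package, which is not in the install list above, and on a reasonably recent version that supports the `local_dir` argument of `snapshot_download`. The helper names and the `checkpoints` target directory are hypothetical; this README does not say where the inference script expects the files.

```python
from urllib.parse import urlparse


def parse_hub_url(url: str):
    """Split a Hugging Face repository or 'tree' URL into (repo_id, revision)."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"not a repository URL: {url}")
    repo_id = "/".join(parts[:2])
    # URLs of the form .../tree/<revision> name a branch; default to "main" otherwise.
    revision = parts[3] if len(parts) >= 4 and parts[2] == "tree" else "main"
    return repo_id, revision


def download_checkpoints(url: str, local_dir: str = "checkpoints") -> str:
    """Download the whole repository snapshot and return its local path."""
    # Imported lazily: huggingface_hub is an extra dependency (assumption).
    from huggingface_hub import snapshot_download

    repo_id, revision = parse_hub_url(url)
    return snapshot_download(repo_id=repo_id, revision=revision, local_dir=local_dir)


if __name__ == "__main__":
    print(parse_hub_url("https://huggingface.co/Camille1054/TEMPO/tree/main"))
```

Calling `download_checkpoints("https://huggingface.co/Camille1054/TEMPO/tree/main")` would then place the files under `./checkpoints`; adjust the path to whatever `script/inference.sh` is configured to read.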

Inference

Generate protein dynamics trajectories using the pretrained models:

bash script/inference.sh

Analysis & Evaluation

Compute evaluation metrics on generated trajectories:

# Run analysis (outputs a pickle file with all metrics)
python analysis.py \
    --data_dir path_to_generated_xtc \
    --num_workers 4 \
    --output result.pkl \
    --ca_only

# Print the results
python print_result.py result.pkl
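To inspect the pickle from your own code rather than through `print_result.py`, something like the following may be enough. The layout of `result.pkl` is not documented in this README, so the sketch assumes only that it is a standard pickle and, if it holds a dictionary, lists its keys; the helper name `summarize_results` is hypothetical. Only load pickle files you generated yourself, since unpickling can execute arbitrary code.

```python
import pickle
from pathlib import Path


def summarize_results(path):
    """Load a pickle and return {key: short description} for a dict payload."""
    with Path(path).open("rb") as handle:
        results = pickle.load(handle)
    if not isinstance(results, dict):
        # Unknown layout: report only the top-level type.
        return {"<root>": type(results).__name__}
    summary = {}
    for key, value in results.items():
        if isinstance(value, (int, float)):
            summary[str(key)] = f"{value:.4g}"  # scalar metric, 4 significant digits
        else:
            summary[str(key)] = type(value).__name__  # e.g. list, dict, ndarray
    return summary


if __name__ == "__main__" and Path("result.pkl").exists():
    for key, description in summarize_results("result.pkl").items():
        print(f"{key}: {description}")
```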

Citation

@inproceedings{xu2025tempo,
  title={TEMPO: Temporal Multi-scale Autoregressive Generation of Protein Conformational Ensembles},
  author={Xu, Yaoyao and Wang, Di and Zhou, Zihan and Yu, Tianshu and Chen, Mingchen},
  booktitle={Advances in Neural Information Processing Systems (NeurIPS)},
  year={2025}
}
