ProStab: Prediction of protein stability change upon mutations by inverse folding and protein language models
⭐ Coming soon: protein stability prediction for multiple mutations and indels (insertions and deletions). ⭐
We sincerely thank the SPURS team for open-sourcing their code and data to the community, and for providing invaluable constructive feedback and guidance throughout this work.
The ProStab project builds heavily on SPURS: the training pipeline, test pipeline, datasets, configs, baselines, and metrics implementations were all adapted from SPURS.
ProStab is a deep learning framework that integrates sequence-derived and structure-informed features to accurately predict the ∆∆G of protein point mutations given an initial structure. ProStab combines representations from a protein language model applied to both the wild-type and mutant sequences with representations from the inverse folding model ProteinMPNN applied to the wild-type structure. It jointly models two sources of information: mutation-specific effects, captured as embedding differences at the substitution site between the wild-type and mutant sequences; and site-specific priors, derived from the wild-type sequence and structure, which reflect the local context and substitutional tolerance of the site.
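To make the two feature sources concrete, here is a minimal, purely illustrative sketch (not the actual ProStab code) of how a per-site embedding difference and a site-specific prior could be concatenated and fed to a small regression head. All names, dimensions, and the head architecture are assumptions for illustration only:

```python
import torch
import torch.nn as nn

class IllustrativeDDGHead(nn.Module):
    """Toy ∆∆G regression head combining the two feature sources described above.
    d_plm / d_mpnn are hypothetical embedding sizes, not ProStab's real ones."""

    def __init__(self, d_plm=1280, d_mpnn=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * d_plm + d_mpnn, 256),
            nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, wt_seq_emb, mut_seq_emb, wt_struct_emb, site):
        # Mutation-specific effect: embedding difference at the substituted site.
        delta = mut_seq_emb[site] - wt_seq_emb[site]
        # Site-specific prior: wild-type sequence + structure context at the site.
        prior = torch.cat([wt_seq_emb[site], wt_struct_emb[site]])
        return self.mlp(torch.cat([delta, prior]))  # predicted ∆∆G (scalar)
```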
⭐ Ongoing updates will continue to make this web server easier and more convenient to use. ⭐
⭐ We encourage users of our web server to cite both ProStab and SPURS, in recognition of the foundational framework and the valuable collaborative insights provided by the SPURS team. ⭐
```bash
git clone https://github.com/xtanh/ProStab.git
cd ProStab
conda env create -f environment.yml
conda activate ProStab
```

- Download the pre-trained weights from: https://drive.google.com/file/d/1xZOG3wkn6UGJS_j533laRbZLWV5DU13T/view?usp=share_link
- Extract the archive and place the checkpoint in `model_weight/checkpoints/`:

```bash
mkdir -p model_weight/checkpoints
# Place the downloaded best.ckpt in model_weight/checkpoints/
```

ProteinMPNN weights:
- Download the ProteinMPNN weights file `v_48_020.pt` from the ProteinMPNN GitHub repository.
- Create the directory and place the file:

```bash
mkdir -p ./data/checkpoints/ThermoMPNN/vanilla_model_weights
# Place v_48_020.pt in ./data/checkpoints/ThermoMPNN/vanilla_model_weights/
```

To evaluate the pre-trained checkpoint on the Megascale test split:

```bash
python test.py experiment_path=model_weight datamodule._target_=megascale data_split=test ckpt_path=model_weight/checkpoints/best.ckpt mode=predict
```

To train the model:

```bash
python train.py
```

To run inference on a single point mutation:

```python
from prostab.inference import parse_pdb, get_prostab

# Load the model and its config from the extracted weight directory
model, cfg = get_prostab('./model_weight')

# Input structure, chain, and mutation (wild-type AA, position, mutant AA)
pdb_name = 'example'
pdb_path = './data/inference_example/1A0N.pdb'  # adjust to your local path
chain = 'A'
mutation = "V1A"

# Build the mutant input and predict the stability change
pdb_mut = parse_pdb(pdb_path, pdb_name, chain, cfg, mutation=mutation)
result_mutant = model(pdb_mut, return_logist=True)
print(f"mutation {mutation} score: {result_mutant.item()}")Required data files (not included in repository):
This project builds heavily upon SPURS. Please refer to their original license for more details.
We gratefully acknowledge the following projects and contributions that made ProStab possible:
Important Note: This project is significantly based on the SPURS framework. We gratefully acknowledge that:
- Benchmark Datasets: All evaluation datasets originate from SPURS
- Training Strategy: Our training methodology follows SPURS protocols
- Evaluation Framework: Assessment metrics and procedures are consistent with SPURS
- Code Foundation: The training and evaluation framework is built upon SPURS
We sincerely thank the SPURS team for providing an extensible training, evaluation, and modeling framework that enabled this research.
- SPURS: Foundational framework from Luo Group
- ProMEP: Transformer encoder fusing structural and sequence information, from Cheng et al.
- ESM: Protein language models from Meta AI
- ProteinMPNN: Inverse folding model from Dauparas et al.
If you use ProStab in your research, please cite our work and the foundational papers:
```bibtex
@article{prostab2025,
title={ProStab: Prediction of protein stability change upon mutations by inverse folding and protein language models},
author={Tan, Hong and Wei, Xiaowei and Lin, Shenggeng and Mao, Xueying and Chen, Junwei and Sun, Heqi and Zhang, Yufang and Zhou, Zhenghong and Wei, Dong-Qing and Lin, Shuangjun and Xiong, Yi},
journal={bioRxiv},
year={2025},
doi={10.1101/2025.08.11.669595},
url={https://doi.org/10.1101/2025.08.11.669595}
}
@article{li2025generalizable,
title={Generalizable and scalable protein stability prediction with rewired protein generative models},
author={Li, Ziang and Luo, Yunan},
journal={Nature Communications},
year={2025},
publisher={Nature Publishing Group}
}
@article{thermompnn2024,
title={Transfer learning to leverage larger datasets for improved prediction of protein stability changes},
author={Dieckhaus, Henry and Brocidiacono, Michael and Randolph, Nicholas Z. and Kuhlman, Brian},
journal={Proceedings of the National Academy of Sciences},
volume={121},
number={6},
pages={e2314853121},
year={2024},
doi={10.1073/pnas.2314853121},
url={https://www.pnas.org/doi/abs/10.1073/pnas.2314853121}
}
```

Our work builds on the following papers, in addition to SPURS and ThermoMPNN cited above:

```bibtex
@article{cheng2024zero,
title={Zero-shot prediction of mutation effects with multimodal deep representation learning guides protein engineering},
author={Cheng, Peng and Mao, Cong and Tang, Jin and Yang, Sen and Cheng, Yu and Wang, Wuke and Gu, Qiuxi and Han, Wei and Chen, Hao and Li, Sihan and others},
journal={Cell Research},
volume={34},
number={9},
pages={630--647},
year={2024},
publisher={Springer Nature}
}
@article{rives2021biological,
title={Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences},
author={Rives, Alexander and Meier, Joshua and Sercu, Tom and Goyal, Siddharth and Lin, Zeming and Liu, Jason and Guo, Demi and Ott, Myle and Zitnick, C Lawrence and Ma, Jerry and others},
journal={Proceedings of the National Academy of Sciences},
volume={118},
number={15},
pages={e2016239118},
year={2021},
publisher={National Academy of Sciences},
note={bioRxiv 10.1101/622803},
doi={10.1073/pnas.2016239118},
url={https://www.pnas.org/doi/full/10.1073/pnas.2016239118},
}
@inproceedings{zheng2023lm_design,
title={Structure-informed Language Models Are Protein Designers},
author={Zheng, Zaixiang and Deng, Yifan and Xue, Dongyu and Zhou, Yi and Ye, Fei and Gu, Quanquan},
booktitle={International Conference on Machine Learning},
year={2023}
}
@article{dauparas2022robust,
title={Robust deep learning--based protein sequence design using ProteinMPNN},
author={Dauparas, Justas and Anishchenko, Ivan and Bennett, Nathaniel and Bai, Hua and Ragotte, Robert J and Milles, Lukas F and Wicky, Basile IM and Courbet, Alexis and de Haas, Rob J and Bethel, Neville and others},
journal={Science},
volume={378},
number={6615},
pages={49--56},
year={2022},
publisher={American Association for the Advancement of Science}
}
```

⭐ If you find ProStab useful, please star this repository! ⭐
