This repository contains the implementation code for our research paper, [DPO-LLPS: Biologically-informed hierarchical transfer learning Strategy for Designing Phase Separation–Driving Proteins].

The framework combines hierarchical transfer learning and generative modeling to design proteins that drive liquid-liquid phase separation (LLPS). Localization fine-tuning encodes compartment-specific “chemical grammar,” while direct preference optimization (DPO) captures LLPS-driving “molecular grammar.” The model generates novel proteins with targeted localization, tunable phase behavior, and validated condensate stability, enabling programmable and mechanistically interpretable LLPS design.
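
For readers unfamiliar with DPO, the following is a minimal sketch of the standard DPO loss for a single preference pair. It is not taken from this repository's training script. The function name `dpo_loss`, the default `beta=0.1`, and the reading of "chosen" as an LLPS-driving sequence and "rejected" as a non-driving one are assumptions made for illustration only.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    Inputs are sequence log-probabilities under the trainable policy and
    under a frozen reference model. `beta` (assumed value) scales how
    strongly the policy is pushed away from the reference.
    """
    # Log-ratio of policy to reference for each sequence.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # Scaled preference margin: positive when the policy favors the
    # chosen sequence more than the reference does.
    margin = beta * (chosen_logratio - rejected_logratio)

    # Loss is -log(sigmoid(margin)), computed in a numerically stable way.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


if __name__ == "__main__":
    # Hypothetical log-probabilities, for illustration only.
    print(dpo_loss(-40.0, -55.0, -45.0, -50.0))
```

When the policy and the reference assign identical log-probabilities, the margin is zero and the loss equals log 2. The loss falls as the policy raises the chosen sequence relative to the rejected one.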

To set up the conda environment, run:
conda env create -f env.yml
conda activate DPO-LLPS

Model checkpoints and complete datasets are available on Zenodo.
To train a model with the default configuration (e.g., DPO-LLPS), simply run:
python train_llps_dpo_multigpu.py

To perform inference with the default configuration (e.g., DPO-LLPS), run:
python LLPS-DPO-inference.py

The processing procedures for all datasets can be found in the data_process folder.
If you find the models useful in your research, please cite our paper.
If you have any questions, please feel free to email us (yangyangzhang@zju.edu.cn).