🧬 Heterogeneous Biomedical Entity Representation Learning for Gene–Disease Association Prediction

The FusionGDA model introduces an attention-based fusion module to enrich the semantic representations of genes and diseases encoded by pre-trained language models, enabling more accurate gene–disease association (GDA) prediction.
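To make the idea of attention-based fusion concrete, here is a toy sketch in plain Python. It is not the FusionGDA architecture: the function name `attention_fuse`, the two-dimensional example vectors, and the choice of scaled dot-product attention are all assumptions made for illustration. See the paper and the repository source for the actual module.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attention_fuse(query, candidates):
    """Fuse candidate vectors into one vector, weighted by their
    scaled dot-product attention scores against the query.

    Hypothetical helper for illustration only.
    """
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, c) / scale for c in candidates])
    fused = [
        sum(w * c[i] for w, c in zip(weights, candidates))
        for i in range(len(query))
    ]
    return fused, weights


# Toy stand-ins for embeddings produced by pre-trained language models.
gene_vec = [1.0, 0.0]
disease_vec = [0.0, 1.0]

# The gene representation attends over itself and the disease representation.
fused, weights = attention_fuse(gene_vec, [gene_vec, disease_vec])
```

The fused vector is a convex combination of the candidates, so the gene representation is enriched with disease-side information in proportion to the attention weights.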

🧩 Framework

⚙️ Installation

# Download and install Anaconda
wget https://repo.anaconda.com/archive/Anaconda3-latest-Linux-x86_64.sh
bash Anaconda3-latest-Linux-x86_64.sh -b
rm Anaconda3-latest-Linux-x86_64.sh
# The batch (-b) install places Anaconda in $HOME/anaconda3
export PATH="$HOME/anaconda3/bin:$PATH"

# Update Anaconda packages
conda update --all

# Install PyTorch with CUDA support (adjust CUDA version if needed)
conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch

# Install dependencies
pip install wandb PyTDC lightgbm pytorch-metric-learning
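After installing, a quick check along the following lines can confirm that the packages are visible to Python. This is a minimal sketch: the helper name `missing_modules` is hypothetical, and the check only verifies that each module can be located, not that CUDA or the package itself works correctly. Note that the import names for PyTDC and pytorch-metric-learning differ from their pip names.

```python
import importlib.util

# Import names for the packages installed above. PyTDC is imported as `tdc`
# and pytorch-metric-learning as `pytorch_metric_learning`.
REQUIRED_MODULES = ["torch", "wandb", "tdc", "lightgbm", "pytorch_metric_learning"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```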

🚀 Execution

Run the commands below from the scripts directory, for example:

~/dpa_pretrain/scripts

Before running a script, adjust its parameters as required.

🔹 Pre-training Phase

bash run_pretrain_gda_ml_adapter_infoNCE.sh

🔹 Fine-tuning Phase

TDC Dataset

bash run_finetune_gda_lightgbm_infoNCE_tdc.sh

DisGeNET Dataset

bash run_finetune_gda_lightgbm_infoNCE.sh

Results can be tracked through your Weights & Biases account.
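If you want to run without a Weights & Biases account or without network access, one option is to launch the scripts with `WANDB_MODE=offline`, an environment variable read by the wandb client. The sketch below assumes the training scripts do not set the wandb mode explicitly; the helper names `offline_env` and `run_script` are hypothetical.

```python
import os
import subprocess


def offline_env(base_env):
    """Return a copy of base_env with wandb logging switched to offline mode."""
    env = dict(base_env)
    env["WANDB_MODE"] = "offline"  # runs are written locally instead of uploaded
    return env


def run_script(script_name, scripts_dir):
    """Run one of the repository's shell scripts with offline wandb logging."""
    return subprocess.run(
        ["bash", script_name],
        cwd=os.path.expanduser(scripts_dir),
        env=offline_env(os.environ),
        check=True,  # raise CalledProcessError if the script fails
    )


# Example call (not executed here):
# run_script("run_finetune_gda_lightgbm_infoNCE_tdc.sh", "~/dpa_pretrain/scripts")
```

Runs recorded offline can be uploaded later with the `wandb sync` command.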

📊 Datasets

All datasets are obtained from the following public biomedical repositories:

TDC: https://tdcommons.ai/
DisGeNET: https://www.disgenet.org/

The specific versions used in our experiments are stored in the shared Drive:
👉 Shared Drive Link

📝 Citation

If you find FusionGDA useful for your research, please cite:

@article{meng2024heterogeneous,
  title={Heterogeneous biomedical entity representation learning for gene-disease association prediction},
  author={Meng, Zhaohan and Liu, Siwei and Liang, Shangsong and Jani, Bhautesh and Meng, Zaiqiao},
  journal={Briefings in Bioinformatics},
  volume={25},
  number={5},
  pages={bbae380},
  year={2024},
  publisher={Oxford University Press}
}

🧠 Developed by the AI4BioMed Lab,
School of Computing Science, University of Glasgow, UK 🇬🇧
