GPCRact

This repository serves as the official implementation and reproducibility package for the paper "GPCRact: a hierarchical framework for predicting ligand-induced GPCR activity via allosteric communication modeling".

We provide the complete source code, preprocessed datasets, training scripts, and analysis notebooks required to reproduce the findings presented in the manuscript.

📁 Repository Structure

All resources are collected in a single repository, organized as follows, to facilitate full reproducibility.

GPCRact/
├── analysis/           # Jupyter Notebooks for reproducing figures and statistical analyses
├── benchmarks/         # Implementation of baseline models (DeepREAL, AiGPro, 3D-GNN)
├── configs/            # Configuration files (YAML) for training and HPO
├── data/               # Datasets
│   ├── raw/            # Raw data files (GPCRactDB v1)
│   ├── resources/      # Auxiliary bio-info files (PDB info, MSA, etc.)
│   └── splits/         # Exact Train/Val/Test scaffold splits used in the paper
├── preprocessing/      # Scripts to reconstruct the dataset from scratch
├── scripts/            # Executable scripts for Training, Inference, and HPO
├── src/                # Core library code (Model architecture, Layers, Dataloaders)
├── environment.yml     # Conda environment file
└── README.md           # Master documentation
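
After cloning, it can be useful to confirm that the checkout matches the layout above before running anything. The following is a minimal sketch, not part of the repository: the helper name `missing_entries` is hypothetical, and the path lists are simply copied from the tree shown above.

```python
from pathlib import Path

# Paths copied from the repository tree in this README.
EXPECTED_DIRS = [
    "analysis", "benchmarks", "configs",
    "data/raw", "data/resources", "data/splits",
    "preprocessing", "scripts", "src",
]
EXPECTED_FILES = ["environment.yml", "README.md"]


def missing_entries(repo_root):
    """Return the expected directories and files that are absent under repo_root."""
    root = Path(repo_root)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing
```

Calling `missing_entries(".")` from the repository root should return an empty list for a complete checkout.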

⚙️ Installation

We recommend using Conda to manage the environment for full reproducibility.

Clone the repository:

git clone https://github.com/hyojin0912/HJ-GPCRact.git
cd HJ-GPCRact

Create and activate the Conda environment:
```
conda env create -f environment.yml
conda activate gpcract
```
Alternatively, you can install packages using pip:
```
pip install -r requirements.txt
```
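
If you use the Conda route, a quick way to confirm that the `gpcract` environment is the active one is to inspect the `CONDA_DEFAULT_ENV` variable that Conda sets on activation. This is a small sketch with hypothetical helper names; it only checks the environment name, not the installed packages.

```python
import os


def active_conda_env(environ=None):
    """Return the name of the active Conda environment, or None if none is active."""
    environ = os.environ if environ is None else environ
    return environ.get("CONDA_DEFAULT_ENV")


def is_gpcract_env_active(environ=None, expected="gpcract"):
    """True when the active Conda environment matches the name used in this README."""
    return active_conda_env(environ) == expected
```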

🔬 Reproducibility Workflow

This section lists the steps needed to reproduce the results reported in our study.

Step 1: Data Construction

Users can reconstruct the GPCRactDB from raw public data or use the pre-generated splits provided in data/splits/. To build from scratch, follow the pipeline in the preprocessing/ directory:

# Example: Running the final dataset creation step
jupyter notebook preprocessing/04_create_final_dataset.ipynb

Note: The exact scaffold-based split files (scaffold_train.csv, scaffold_val.csv, scaffold_test.csv) used in our study are already provided in data/splits/ to ensure fair benchmarking.
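
Since the splits are scaffold-based, one sanity check is that no scaffold appears in more than one split file. The sketch below uses only the standard library. The file names come from the note above, but the column that holds the scaffold identifier is an assumption: pass whatever column name the CSV files actually use.

```python
import csv
from pathlib import Path

# File names as listed in this README.
SPLIT_FILES = {
    "train": "scaffold_train.csv",
    "val": "scaffold_val.csv",
    "test": "scaffold_test.csv",
}


def read_column(csv_path, column):
    """Collect the distinct values of one column from a CSV file with a header row."""
    with open(csv_path, newline="") as handle:
        return {row[column] for row in csv.DictReader(handle)}


def split_overlaps(split_dir, column):
    """Return, for each pair of splits, the set of values they share in `column`.

    `column` is an assumption supplied by the caller; this README does not
    document the CSV schema.
    """
    values = {
        name: read_column(Path(split_dir) / fname, column)
        for name, fname in SPLIT_FILES.items()
    }
    names = list(values)
    overlaps = {}
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            overlaps[(first, second)] = values[first] & values[second]
    return overlaps
```

For a leak-free split, every set returned by `split_overlaps("data/splits", column)` should be empty.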

Step 2: Training the Model 🏋️‍♂️

To train the GPCRact model from scratch using the provided splits:

Configure: Modify configs/training_config.yaml if necessary.
Run: Execute the training script.

python scripts/train.py \
    --data_dir data/splits \
    --save_dir checkpoints/ \
    --epochs 100

For detailed arguments, see scripts/README.md.
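
When launching several training runs (for example, one output directory per run), it may be convenient to build the command programmatically. This is a sketch under the assumption that `scripts/train.py` accepts exactly the three flags shown above; the function names are hypothetical and no other flags are implied.

```python
import subprocess
import sys


def build_train_command(data_dir="data/splits", save_dir="checkpoints/", epochs=100):
    """Build the argument list for the training command shown in this README."""
    return [
        sys.executable, "scripts/train.py",
        "--data_dir", str(data_dir),
        "--save_dir", str(save_dir),
        "--epochs", str(epochs),
    ]


def run_training(**kwargs):
    """Run training from the repository root.

    check=True raises CalledProcessError if train.py exits with a non-zero status.
    """
    return subprocess.run(build_train_command(**kwargs), check=True)
```

For example, `run_training(save_dir="checkpoints/run1", epochs=100)` would start one run with its own checkpoint directory.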

Step 3: Inference 🚀

To predict the activity (Agonist/Antagonist/Non-binder) of novel GPCR-ligand pairs using a trained model:

python scripts/inference.py \
    --data_dir data/splits \
    --model_path checkpoints/best_model.pt \
    --output_dir results/
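
The output format of `scripts/inference.py` is not described here, so the following is only an illustration of how three per-class scores could be mapped to the activity labels named above. The class order in `CLASS_NAMES` is an assumption and should be checked against the repository before use.

```python
# ASSUMPTION: this class order is illustrative, not taken from the repository.
CLASS_NAMES = ("Agonist", "Antagonist", "Non-binder")


def decode_prediction(scores, class_names=CLASS_NAMES):
    """Return the label whose score is highest (the first one wins on ties)."""
    if len(scores) != len(class_names):
        raise ValueError("expected one score per class")
    best = max(range(len(scores)), key=lambda i: scores[i])
    return class_names[best]
```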

Step 4: Benchmarking 📊

We provide the full source code and execution scripts for the baseline models compared in the manuscript (DeepREAL, AiGPro, 3D-GNN). All baselines were retrained on the identical GPCRact dataset.

DeepREAL: See benchmarks/DeepREAL/
AiGPro: See benchmarks/AiGPro/ (Docker support included)
3D-GNN Baseline: See benchmarks/3D-GNN/

Step 5: Analysis & Figure Generation 📉

To reproduce the statistical analyses, mechanistic interpretations, and main figures (Figs. 1, 3, 4, and 7), run the following notebooks in the analysis/ directory:

01_receptor_dynamics_analysis.ipynb: Structural ground truth analysis (Fig 1).
02_sequence_structure_correlation.ipynb: MSA vs. 3D dynamics (Fig 3).
03_activity_decision_tree.ipynb: Decision tree for activity rules (Fig 4).
04_mechanistic_interpretability.ipynb: Attention weight analysis (Fig 7).

Supplementary validations: PRS analysis, sensitivity analysis, and mutation studies are also included.
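
The four figure notebooks can also be executed without opening the Jupyter interface by using `jupyter nbconvert --execute`. The sketch below builds one such command per notebook; the helper names are hypothetical, and it assumes the notebooks run unattended from the repository root, which this README does not state.

```python
import subprocess
from pathlib import Path

# Notebook names as listed in this README.
NOTEBOOKS = [
    "01_receptor_dynamics_analysis.ipynb",
    "02_sequence_structure_correlation.ipynb",
    "03_activity_decision_tree.ipynb",
    "04_mechanistic_interpretability.ipynb",
]


def nbconvert_command(notebook, analysis_dir="analysis"):
    """Build a command that executes a notebook and writes the outputs back into it."""
    path = (Path(analysis_dir) / notebook).as_posix()
    return ["jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace", path]


def run_all(analysis_dir="analysis"):
    """Execute the notebooks in order, stopping at the first failure."""
    for notebook in NOTEBOOKS:
        subprocess.run(nbconvert_command(notebook, analysis_dir), check=True)
```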

🎓 Citation

Our manuscript is currently under review. If you use GPCRact in your research, we would appreciate it if you could cite our work upon its publication.

📬 Contact

For questions, bug reports, or feedback, please contact Hyojin Son at hyojin0912@kaist.ac.kr.
