This repository contains the ExPDRUG pipeline, designed for drug discovery using gene expression data. Below you will find instructions on how to set up and run the pipeline.
- Contains logging functions and result output features.
- Handles file input/output paths and hyperparameter adjustments.
- Manages data processing for model training, including:
  - Creating and managing masking matrices between layers.
  - Shuffling functionality for permutation tests.
- Defines the neural network model and relevance score computation logic.
- Includes the implementation of the custom loss function.
- Handles model training, k-fold validation, and relevance score computation.
- Implements permutation test functionality for model validation.
- The main script to run the entire pipeline.
- Orchestrates data loading, model training, and interpretation using LRP, IG, or GSEA methods.
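
The masking and shuffling ideas listed above can be illustrated with a small, standard-library-only sketch. It is not the repository's implementation: the function names, the toy gene/pathway annotation, and the weight values are all hypothetical, and the real pipeline may build its masks and permutations differently.

```python
import random


def build_mask(genes, pathways, membership):
    """Return a len(genes) x len(pathways) matrix of 0/1 entries.

    mask[i][j] is 1 only if genes[i] is annotated to pathways[j],
    so a layer using this mask keeps only annotated connections.
    """
    return [
        [1 if p in membership.get(g, ()) else 0 for p in pathways]
        for g in genes
    ]


def apply_mask(weights, mask):
    """Zero out every weight whose connection the mask does not allow."""
    return [
        [w * m for w, m in zip(w_row, m_row)]
        for w_row, m_row in zip(weights, mask)
    ]


def shuffled_labels(labels, seed):
    """Return a shuffled copy of the labels for one permutation round."""
    rng = random.Random(seed)
    permuted = list(labels)
    rng.shuffle(permuted)
    return permuted


# Hypothetical toy annotation: three genes, two pathways.
genes = ["G1", "G2", "G3"]
pathways = ["P1", "P2"]
membership = {"G1": {"P1"}, "G2": {"P1", "P2"}, "G3": {"P2"}}

mask = build_mask(genes, pathways, membership)
weights = [[0.5, -0.2], [0.1, 0.3], [0.7, 0.4]]
masked = apply_mask(weights, mask)
```

In a permutation test, the model would be retrained on `shuffled_labels(...)` several times and the resulting scores compared with the score obtained on the true labels.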
- Hetionet: Hetionet GitHub
- Reactome: Reactome Data Download
- COVID-19: COVID-19 Data
- COVID-19 Severity Score Information: Severity Score Information
- GBM: GBM Data
- Alzheimer's: Alzheimer's Data
Ensure that the raw gene expression files for experiments are placed in the data folder.
Run the scripts in the data_processor folder to filter the data for training ExPNet. Refer to the pipeline in that folder for specific instructions.
- Set the file paths and hyperparameters in the .util/config.py file.
- Execute the main.py script:

python main.py

Run the scripts in the RWR folder to perform drug discovery.
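
For orientation, here is a minimal sketch of a random walk with restart, assuming that is what RWR stands for in this repository. The function name, the restart probability, and the toy graph are hypothetical; the actual scripts in the RWR folder may use a different formulation and run on a much larger network.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-8, max_iter=1000):
    """Return a steady-state visiting probability for every node.

    adjacency maps each node to a list of its neighbours; every neighbour
    must also appear as a key. seeds is the set of nodes the walker
    jumps back to with probability `restart` at each step.
    """
    nodes = sorted(adjacency)
    # Restart distribution: uniform over the seed nodes.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for n in nodes:
            neighbours = adjacency[n]
            if not neighbours:
                # Dangling node: return its probability mass to the seeds.
                for s in nodes:
                    nxt[s] += (1 - restart) * p[n] * p0[s]
                continue
            share = (1 - restart) * p[n] / len(neighbours)
            for m in neighbours:
                nxt[m] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy graph: two genes, each linked to one drug.
graph = {
    "geneA": ["drug1", "geneB"],
    "geneB": ["geneA", "drug2"],
    "drug1": ["geneA"],
    "drug2": ["geneB"],
}
scores = random_walk_with_restart(graph, seeds={"geneA"})
ranked_drugs = sorted(["drug1", "drug2"], key=scores.get, reverse=True)
```

Nodes closer to the seed set receive higher scores, so ranking drug nodes by score gives a simple prioritization relative to the seed genes.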
Install the required packages using the requirements.txt file:
pip install -r requirements.txt