ERRalpha-Predictor

A Framework of Ensemble Models for Predicting ERRα Binders, Antagonists, and Agonists Using Artificial Intelligence. You can use it simply at https://err1p-web.streamlit.app

Overview:

datasets/ contains the three dataset files ("xxx_train" used for 8:1:1 data splitting, "xxx_external" used for external validation);
ensemble_models/ take the external validation set of three datasets as examples to use ERRα-Predictor;
graph_models/ contains the final selected GNN models for the three datasets;
ml_final_models/ contains the final selected ML models for three datasets;
mmpa_and_representative_substructure/ contains reference repository code
streamlit/ contains local web deployment of ERRα-Predictor;
supplementary_information/ contains supplementary materials;
ml_best_hyperparameters.pkl optimal hyperparameters for ML models;
best_hyperparameters.json optimal hyperparameters for GNN models;
GNNExplainer_viz.ipynb visualization of GNN models

Conda environments:

python = 3.7
deepchem = 2.7.1
imbalanced-learn = 0.7.0
lightgbm = 4.5.0
numpy = 1.21.6
optuna = 4.0.0
pytorch = 1.13.1
pytorch-cuda = 11.7
rdkit = 2023.3.2
scikit-learn = 1.0.2
torch-geometric = 2.3.1
xgboost = 1.6.2
mol2vec = 0.2.2
openpyxl = 3.1.3
streamlit = 1.23.1
streamlit-ketcher = 0.0.1

Experiments:

ML models

You need to download the mol2vec pkl file from here first, then for the three datasets, run the machine learning codes molecular_representation.py and model_ml_construct.py, get representation and model training results respectively.

GNN models

Given the json file of the best hyperparameters, you only need to run model_graph_merge_hyper.py; if you want to try this hyperparameter exploration process, run graph_hyper_tune.py.

Ensemble models

You can look at the examples in ensemble_models：for binders, antagonists, and agonists, ERRα-Predictor was used to predict them respectively.

Usage:

If you want to use ERRα-Predictor, after configuring the required Python environment and downloading the mol2vec pkl file, it is recommended that you enter the streamlit for local web deployment.

Use the command:

cd streamlit
streamlit run app.py

The web page incorporates functionalities for data retrieval and prediction. The former involves querying the collected dataset of ERRα ligands, while the latter utilizes ERRα-Predictor models to perform predictions.
Users can choose to submit either a single SMILES string or multiple SMILES strings for querying or prediction, and the results can be downloaded.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ERRalpha-Predictor

Overview:

Conda environments:

Experiments:

ML models

GNN models

Ensemble models

Usage:

About

Uh oh!

Releases

Packages

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 89 Commits
datasets		datasets
ensembl_models		ensembl_models
graph_models		graph_models
ml_final_models		ml_final_models
mmpa_and_representative_substructure		mmpa_and_representative_substructure
streamlit		streamlit
supplementary_information		supplementary_information
GNNExplainer_viz.ipynb		GNNExplainer_viz.ipynb
README.md		README.md
best_hyperparameters.json		best_hyperparameters.json
graph_hyper_tune.py		graph_hyper_tune.py
ml_best_hyperparameters.pkl		ml_best_hyperparameters.pkl
model.py		model.py
model_graph_merge_hyper.py		model_graph_merge_hyper.py
model_ml_construct.py		model_ml_construct.py
molecular_representation.py		molecular_representation.py
utils.py		utils.py
workflow.png		workflow.png

lxiongZ/ERRalpha-Predictor

Folders and files

Latest commit

History

Repository files navigation

ERRalpha-Predictor

Overview:

Conda environments:

Experiments:

ML models

GNN models

Ensemble models

Usage:

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Languages

Packages