This is the code for our paper titled "Weakly-Supervised Deep Learning Model for Prostate Cancer Diagnosis and Gleason Grading of Histopathology Images".
- Pandas == 1.3.5
- NumPy == 1.21.6
- skimage == 0.18.3
- tiatoolbox == 1.0.1
- opencv == 4.5.5
- Tensorflow == 2.3
- sklearn == 1.0.2
- tqdm == 4.64.0
- scipy == 1.7.3
- Pytorch == 1.12.1
Running the code has 6 steps which are as follows:
- Step 1: Cropping the WSIs to small patches.
- The following command can be used to run the code for this step:
--mode: for this step model should becrop--wsi_dir: the directory of the WSIs, default = 'images/'--labels: the directory of the clinical file containing the scores. It should be a csv file containing a column namedimage_idfor the name of images, a column namedprimary_gleason_grade, and another column namedsecondary_gleason_gradefor the primary and secondary Gleason grades. default = 'clinical.csv'--patch_size: size of small patches. default = 512- Note: this command saves the patches and labels in
Preprocessed_data//patchesandPreprocessed_data//labels, respectively.
- The following command can be used to run the code for this step:
- Step 2: Train the Autoencoder.
- The following commands can be used to train the autoencoder:
--mode: for this step model should beauto--n_bag: number of bags to train the autoencoder. default = 500--lr_a: learning rate for autoencoder. default = 1e-3--batch_a: batch size for autoencoder. default = 4--epoch_a: number of epochs to train autoencoder. default = 50--auto_dir: path to save autoencoder. default = autoencoder//- Note: this command reads the patches from
Preprocessed_data//patches.
- The following commands can be used to train the autoencoder:
- Step 3: Extract the feature vectors from small patches using the encoder.
- The following commands can be used to extract the features:
--mode: for this step model should befeature--auto_dir: path to the autoencoder. default = autoencoder//- Note: this command reads the patches from
Preprocessed_data//patchesand saves the features inPreprocessed_data//features.
- The following commands can be used to extract the features:
- Step 4: Train the first stage of the model and extract the scores.
- The following commands should be used for this step:
--mode: for this step model should bemodel1--state1: it can be eithertraintestto train and save the scores ortestto load the model and extract the scores. default = traintest- Note: the model will be saved/loaded from
Models//model1
- Note: the model will be saved/loaded from
--score_dir: the path to save the scores. default =Preprocessed_data//scores--epoch: number of epochs. default = 1000--lr: learning rate. default = 5e-5--wd: weight decay. default = 10e-6--gate: whether using gated attention or not. default = True--hid_dim: hidden dimension in the linear layer. default = 512--out_dim: the output dim in the linear layer. default = 256.--n_class: number of class in the data. default = 5.- Note: this command reads the features from
Preprocessed_data//featuresand labels fromPreprocessed_data//labels.
- The following commands should be used for this step:
- Step 5: Create the graphs from the discriminative patches.
- The following commands are for this step:
--mode: for this step model should begraph--scores_dir: path to the scores extracted by the first stage of the model. default = Preprocessed_data//scores//--n_neighbor: number of neighbors in KNN. default = 10--top: choose the top% of high score. default = 40- Note: the graphs will be saved in
Preprocessed_data//graphs.
- The following commands are for this step:
- Step 6: Train the graph convolutional neural network.
- The following commands are used for this step:
--mode: for this step model should bemodel2--epoch: number of epochs. default = 1000--lr: learning rate. default = 5e-5--wd: weight decay. default = 10e-6--GAT: whether using GAT or GCN. default = False--hid_dim: hidden dimension in the linear layer. default = 512--out_dim: the output dims in the linear layer. default = 256.--n_class: number of classes in the data. default = 5.--n_heads: number of attention layers. default = 6.--gat_heads: number of gated attention layers. default = 4.--fold: number of folds in the cross-validation. default = 10.--dense_dim: dim of the last linear layer. default = 16.- Note: this command reads the graph from
Preprocessed_data//graphsand labels fromPreprocessed_data//labels.
- The following commands are used for this step:
For step 1:
python main.py --mode crop --wsi_dir images// --labels clinical.csv --patch_size 512For step 2:
python main.py --mode auto --n_bag 200 --epoch_a 25For step 3:
python main.py --mode feature --auto_dir Autoencoder//For step 4:
python main.py --mode model1 --state1 traintest --epoch 2500For step 5:
python main.py --mode graph --n_neighbor 15 --top 50For step 6:
python main.py --mode model2 --lr 1e-5 --wd 1e-6 --GAT False --fold 5 --n_heads 4@misc{behzadi2022weaklysupervised,
title={Weakly-Supervised Deep Learning Model for Prostate Cancer Diagnosis and Gleason Grading of Histopathology Images},
author={Mohammad Mahdi Behzadi and Mohammad Madani and Hanzhang Wang and Jun Bai and Ankit Bhardwaj and Anna Tarakanova and Harold Yamase and Ga Hie Nam and Sheida Nabavi},
year={2022},
eprint={2212.12844},
archivePrefix={arXiv},
primaryClass={eess.IV}
}