
Self-Supervised Learning with Application for Infant Cerebellum Segmentation and Analysis


Little is known about early cerebellar development during the first two postnatal years, largely because tissue segmentation is challenging due to the tightly folded cortex, extremely low and dynamic tissue contrast, and large inter-site data heterogeneity. In this study, we propose a self-supervised learning (SSL) framework for infant cerebellum segmentation with multi-domain MR images.

Figure 1. The source domain consists of 24-month-old subjects with manual labels, and the target domain consists of unlabeled 18-month-old subjects. Our SSL framework comprises two steps. In Step 1, a segmentation model (i.e., ADU-Net) is trained in the source domain on a number of training subjects with manual labels and then applied to testing subjects to automatically generate segmentations; a confidence network is then trained to evaluate the reliability of those automated segmentations at the voxel level. In Step 2, a set of reliable training samples is automatically generated from the testing subjects and used to train a segmentation model for the target domain, guided by the proposed spatially-weighted cross-entropy loss.

Method

We take advantage of accurate manual labels from 24-month-old cerebellums and transfer them to other time points. To alleviate the domain-shift issue, we automatically generate a set of reliable training samples for the target domains. Specifically, we first borrow the manual labels from 24-month-old cerebellums with high contrast (i.e., the source domain) to train a segmentation model. This segmentation model is then directly applied to testing images from the target domain. We further use a confidence map to identify the reliability of the automated segmentations and to generate a set of reliable training samples for the target domain. Lastly, we train a target-domain-specific segmentation model on the generated reliable training samples with a spatially-weighted cross-entropy loss. Experiments on three cohorts and one challenge demonstrate the superior performance of the proposed framework.
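As a rough illustration of the loss used in the last step, below is a minimal NumPy sketch of how a voxel-wise confidence map can weight the cross-entropy between the network's predictions and the automatically generated pseudo-labels. The function and array names are illustrative only; in this repository the loss is implemented as the Caffe WeightedSoftmaxWithLoss layer referenced in infant_train.prototxt.

    import numpy as np

    def spatially_weighted_cross_entropy(prob, target, confidence, eps=1e-8):
        # prob:       (C, D, H, W) softmax probabilities from the segmentation network
        # target:     (D, H, W)    pseudo-label (automated segmentation) at each voxel
        # confidence: (D, H, W)    voxel-wise reliability in [0, 1] from the confidence network
        d, h, w = target.shape
        zz, yy, xx = np.indices((d, h, w))
        p_true = prob[target, zz, yy, xx]        # probability assigned to the pseudo-label class
        voxel_loss = -np.log(p_true + eps)       # standard per-voxel cross-entropy
        # Down-weight voxels with low confidence so unreliable pseudo-labels contribute less.
        return np.sum(confidence * voxel_loss) / (np.sum(confidence) + eps)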

Data and MRI preprocessing

Data

  1. Public cerebellum data were from three cohorts, i.e., the UNC/UMN Baby Connectome Project (BCP), Philips scans collected by Vanderbilt University, and the National Database for Autism Research (NDAR). The BCP and NDAR data were acquired on 3T Siemens scanners, and the Vanderbilt Philips scans on a 3T Philips scanner.

    BCP: https://nda.nih.gov/edit_collection.html?id=2848/

    Philips scans (Vanderbilt U): https://github.com/YueSun814/Philips_data.git/

    NDAR (autism): https://nda.nih.gov/edit_collection.html?id=19/

  2. Public cerebrum data were provided by the iSeg-2019 challenge, in which multi-site data were collected by 3T Siemens and 3T GE scanners using different imaging parameters.

    iSeg-2019: https://iseg2019.web.unc.edu/

Data analysis

The image preprocessing steps, including skull stripping and extraction of the cerebellum, were performed with a public infant cerebrum-dedicated pipeline (iBEAT V2.0 Cloud, http://www.ibeat.cloud). The networks were implemented in the Caffe deep learning framework (Caffe 1.0.0-rc3), and testing used custom Python code (Python 2.7.17).

Training and Testing

File descriptions

Training_subjects

18 T1w MRIs with corresponding manual labels.

subject-x-T1w.nii.gz: the T1w MRI.

subject-x-label.nii.gz: the manual label.
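A quick nibabel check of one training subject (the subject index placeholder x is kept from the naming convention above; nibabel itself is not part of the released pipeline):

    import nibabel as nib
    import numpy as np

    t1 = nib.load("Training_subjects/subject-x-T1w.nii.gz")
    label = nib.load("Training_subjects/subject-x-label.nii.gz")

    print(t1.shape)                              # image dimensions
    print(np.unique(np.asarray(label.dataobj)))  # tissue classes present in the manual label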

Testing_subjects

The subjects in the folder Testing_subjects are randomly selected examples for model testing only.

subject-x-T1.hdr: the T1w MRI.

subject-x-T2.hdr: the T2w MRI.

Template

Template_T1.hdr: a template for histogram matching of T1w images.

Template_T2.hdr: a template for histogram matching of T2w images.

Dataset

The datasets in the folder Dataset are randomly selected examples for model training only. Please download them from https://github.com/YueSun814/SSL-dataset first.

subject-x-dataset-24m.hdf5: a dataset for the 24-month-old segmentation model.

subject-x-dataset-18m.hdf5: a dataset for the 18-month-old segmentation model.

subject-x-dataset-cp.hdf5: a dataset for the confidence model.
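To see what an example dataset actually stores before training, a small h5py sketch that simply lists every dataset in the file with its shape and dtype (no assumptions are made about the key names):

    import h5py

    def show(name, obj):
        # Print each stored dataset's name, shape, and dtype.
        if hasattr(obj, "shape"):
            print("%s  shape=%s  dtype=%s" % (name, obj.shape, obj.dtype))

    with h5py.File("Dataset/subject-x-dataset-24m.hdf5", "r") as f:
        f.visititems(show)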

Segmentation_model-24_month

infant_train.prototxt: the ADU-Net structure with a cross-entropy loss.

deploy.prototxt: a duplicate of the train prototxt.

train_dataset.txt: paths to the training datasets.

test_dataset.txt: paths to the testing datasets.

segtissue.py: the testing script.

Segmentation_model-18_month

infant_train.prototxt: the ADU-Net structure with the proposed spatially-weighted cross-entropy loss.

deploy.prototxt: a duplicate of the train prototxt.

train_dataset.txt: paths to the training datasets.

test_dataset.txt: paths to the testing datasets.

segtissue.py: the testing script.

Confidence_model

infant_train.prototxt: the network structure with a binary cross-entropy loss.

deploy.prototxt: a duplicate of the train prototxt.

train_dataset.txt: paths to the training datasets.

test_dataset.txt: paths to the testing datasets.

segtissue.py: the testing script.

Steps:

  1. System requirements:

    Ubuntu 18.04.5

    Caffe version 1.0.0-rc3

    To ensure consistency with the version we used (e.g., it includes 3D convolution and the WeightedSoftmaxWithLoss layer), we strongly recommend installing Caffe from our released caffe_rc3. The installation steps are easy to perform and require no compilation:

    a. Download caffe_rc3 and caffe_lib.

    caffe_rc3: https://github.com/YueSun814/caffe_rc3

    caffe_lib: https://github.com/YueSun814/caffe_lib

    b. Add the paths of caffe_lib and caffe_rc3/python to your ~/.bashrc file. For example, if the folders are saved under your home directory, add the following lines to ~/.bashrc:

    export LD_LIBRARY_PATH=~/caffe_lib:$LD_LIBRARY_PATH

    export PYTHONPATH=~/caffe_rc3/python:$PYTHONPATH

    c. Test Caffe

    cd caffe_rc3/build/tools

    ./caffe

    Then the screen will print Caffe's usage message (its list of available commands), confirming that the binary runs.

    Typical install time: a few minutes.
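To also confirm that the Python interface is found through PYTHONPATH, a quick sanity check (save as check_caffe.py and run with python check_caffe.py; the file name is just an example):

    import caffe

    # Should print a path inside ~/caffe_rc3/python/caffe/ if the .bashrc settings above took effect.
    print(caffe.__file__)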

In folder: Segmentation_model-24_month

  2. Generate a training dataset (hdf5) for training the 24-month-old segmentation model: the intensity images and corresponding manual labels can be found in the folder Training_subjects.

    Some training samples are prepared in the folder Dataset.

  3. Train the 24-month-old segmentation model (SegM-24).

    nohup ~/caffe_rc3/build/tools/caffe train -solver solver.prototxt -gpu x >train.txt 2>&1 &  # submit a model training job on GPU x

    Expected output: _iter_xxx.caffemodel/_iter_xxx.solverstate

    Trained model: SegM_24.caffemodel/SegM_24.solverstate

  4. Use SegM-24 to derive automated segmentations and tissue probability maps for the 24-month-old subjects via 2-fold cross-validation.

    Perform histogram matching on the testing subjects first, using the templates in the folder Template (see the sketch after this list); the subjects provided in Testing_subjects have already been histogram-matched.

    python segtissue.py

    Output folders: subject-1/subject-2

    Expected run time: about 1~2 minutes per subject.
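Histogram matching can be done in several ways; the sketch below uses SimpleITK's HistogramMatchingImageFilter with the provided template. The parameter values are common defaults and the output file name is illustrative; they are not necessarily the settings used by the authors.

    import SimpleITK as sitk

    moving = sitk.ReadImage("subject-x-T1.hdr", sitk.sitkFloat32)             # image to be matched
    reference = sitk.ReadImage("Template/Template_T1.hdr", sitk.sitkFloat32)  # provided template

    matcher = sitk.HistogramMatchingImageFilter()
    matcher.SetNumberOfHistogramLevels(1024)   # assumed setting
    matcher.SetNumberOfMatchPoints(7)          # assumed setting
    matcher.ThresholdAtMeanIntensityOn()       # ignore background below the mean intensity
    matched = matcher.Execute(moving, reference)

    sitk.WriteImage(matched, "subject-x-T1-matched.hdr")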

In folder: Confidence_model

  5. Generate a training dataset (hdf5) for the confidence model: confidence maps, defined from the voxel-wise differences between the manual labels (in the folder Training_subjects) and the automated segmentations from step 4, are regarded as the ground truth for training the confidence network, while the automated segmentations and their corresponding tissue probability maps (step 4) are used as inputs. A sketch of this computation is given after this list.

    Some training samples are prepared in the folder Dataset.

  6. Train the confidence model (ConM).

    nohup ~/caffe_rc3/build/tools/caffe train -solver solver.prototxt -gpu x >train.txt 2>&1 &  # submit a model training job on GPU x

    Expected output: _iter_xxx.caffemodel/_iter_xxx.solverstate

    Trained model: ConM.caffemodel/ConM.solverstate
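A minimal NumPy/nibabel sketch of one way to build the voxel-wise ground truth for the confidence network, reading "difference" as disagreement between the manual label and the automated segmentation (the exact definition and file names used by the authors may differ; the automated-segmentation file name below is hypothetical):

    import nibabel as nib
    import numpy as np

    label_img = nib.load("Training_subjects/subject-x-label.nii.gz")
    manual = np.asarray(label_img.dataobj, dtype=np.int16)
    # Hypothetical name for the automated segmentation produced in step 4.
    auto = np.asarray(nib.load("subject-x/segmentation.nii.gz").dataobj, dtype=np.int16)

    # Ground-truth confidence: 1 where the automated segmentation agrees with the manual
    # label, 0 where it differs (i.e., the complement of the voxel-wise difference map).
    confidence_gt = (manual == auto).astype(np.float32)
    nib.save(nib.Nifti1Image(confidence_gt, label_img.affine), "subject-x-confidence-gt.nii.gz")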

In folder: Segmentation_model-24_month

  7. Use SegM-24 to derive automated segmentations and tissue probability maps for the 18-month-old subjects.

In folder: Confidence_model

  8. Use ConM to evaluate the reliability of the automated segmentations from step 7 at each voxel.

    python segtissue.py

    Output folders: subject-1/subject-2

    Expected run time: about 1~2 minutes per subject.

In folder: Segmentation_model-18_month

  9. Generate a training dataset (hdf5) for training the 18-month-old segmentation model (inputs: the intensity images and the confidence maps from step 8; target: the automated segmentations from step 7). A sketch of assembling one such sample is given after this list.

    Some training samples are prepared in the folder Dataset.

  10. Train the 18-month-old segmentation model (SegM-18).

    nohup ~/caffe_rc3/build/tools/caffe train -solver solver.prototxt -gpu x >train.txt 2>&1 &  # submit a model training job on GPU x

    Expected output: _iter_xxx.caffemodel/_iter_xxx.solverstate

    Trained model: SegM_18.caffemodel/SegM_18.solverstate

  11. Test the 18-month-old subjects.

    python segtissue.py

    Output folders: subject-1/subject-2

    Expected run time: about 1~2 minutes per subject.
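A minimal h5py sketch of assembling one 18-month training sample from the intensity image, the confidence map (step 8), and the automated segmentation (step 7). The dataset keys "data", "confidence", and "label", the file names, and the whole-volume (rather than patch-based) layout are illustrative assumptions; the keys actually expected by infant_train.prototxt are defined by its data layers.

    import h5py
    import nibabel as nib
    import numpy as np

    t1 = np.asarray(nib.load("subject-x-T1.nii.gz").dataobj, dtype=np.float32)
    confidence = np.asarray(nib.load("subject-x-confidence.nii.gz").dataobj, dtype=np.float32)
    pseudo_label = np.asarray(nib.load("subject-x-seg.nii.gz").dataobj, dtype=np.float32)

    with h5py.File("subject-x-dataset-18m.hdf5", "w") as f:
        # Caffe's HDF5 data layer expects float arrays with leading batch and channel axes.
        f.create_dataset("data", data=t1[np.newaxis, np.newaxis, ...])
        f.create_dataset("confidence", data=confidence[np.newaxis, np.newaxis, ...])
        f.create_dataset("label", data=pseudo_label[np.newaxis, np.newaxis, ...])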

License: LICENSE.txt
