AL_BioMed_img_seg

An Active Learning Pipeline Based on Data-Centric AI for Biomedical Image Instance Segmentation

This repository contains the code and instructions for the AL_BioMed_img_seg project, which was accepted as a poster at the BVM 2025 conference (Bildverarbeitung für die Medizin 2025). More details are available on the conference website.

Pipeline

(Figure: Pipeline)


Clone the Repository

git clone https://github.com/MMV-Lab/AL_BioMed_img_seg
cd AL_BioMed_img_seg

Create and Activate Environment

conda create -n AL_BioMed_img_seg python=3.10 -y
conda activate AL_BioMed_img_seg

Install PyTorch with CUDA

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

Test GPU Availability

python -c "import torch; print('CUDA is available:', torch.cuda.is_available())"

Install Additional Dependencies

pip install tqdm tensorboard monai timm==0.4.5 tifffile scikit-image opencv-python-headless matplotlib dask-image scikit-learn git+https://github.com/vanvalenlab/cellSAM.git aicsimageio

Alternatively, you can install via requirements.txt:

pip install -r requirements.txt

Data Preparation

3D MitoEM Challenge: Large-scale 3D Mitochondria Instance Segmentation

Dataset: 3D MitoEM Challenge

mkdir -p data/MitoEM_3D
cd data
chmod +x download_and_unzip_MitoEM_3D.sh
./download_and_unzip_MitoEM_3D.sh /Your/path/to/MitoEM_3D
cd ..

MAE Feature Extraction

2D Patches Training

cd MAE
bash pretrain_EM30_R_2D_512_0001.sh /path/to/your/dataset /path/to/output
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 python3 -m torch.distributed.launch \
--nproc_per_node=8 \
--master_port=25678 main_pretrain.py \
--data_path /path/to/input \
--batch_size 8 \
--model mae_vit_base_patch16 \
--norm_pix_loss \
--mask_ratio 0.75 \
--epochs 400 \
--warmup_epochs 40 \
--blr 1e-3 \
--weight_decay 0.05 \
--accum_iter 4 \
--input_size 512 \
--img_size 512 \
--output_dir /path/to/output > /path/to/log/pretrain_log.txt 2>&1
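
The flags above interact in ways that are easy to misread, so here is a small worked calculation. It assumes that main_pretrain.py follows the conventions of the reference MAE implementation (learning rate scaled linearly from `--blr` by the effective batch size, and `int(num_patches * (1 - mask_ratio))` visible patches per image). That assumption is not confirmed by this README, so treat the numbers as a sanity check rather than a specification.

```python
# Values copied from the command above. The formulas are assumptions
# borrowed from the reference MAE code, not confirmed for this repository.
batch_size_per_gpu = 8   # --batch_size
num_gpus = 8             # --nproc_per_node
accum_iter = 4           # --accum_iter
base_lr = 1e-3           # --blr
input_size = 512         # --input_size
patch_size = 16          # implied by the model name mae_vit_base_patch16
mask_ratio = 0.75        # --mask_ratio

# Number of samples that contribute to one optimizer step.
effective_batch_size = batch_size_per_gpu * num_gpus * accum_iter

# Linear scaling rule (assumed): lr = blr * effective_batch_size / 256.
learning_rate = base_lr * effective_batch_size / 256

# Patch tokens per image, and how many remain visible after masking.
patches_per_image = (input_size // patch_size) ** 2
visible_patches = int(patches_per_image * (1 - mask_ratio))

print("effective batch size:", effective_batch_size)
print("learning rate:", learning_rate)
print("patches per image:", patches_per_image)
print("visible patches:", visible_patches)
```

If you train on fewer GPUs, raising `--accum_iter` by the same factor keeps the effective batch size, and under the assumed rule the learning rate, unchanged.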

3D Patched Data Preparation

python ./data/crop_3D_tiff_32_512_512_for_trainset.py \
--raw_inputdir /path/to/raw/input \
--raw_outputdir /path/to/raw/output \
--label_inputdir_train /path/to/label/train \
--label_inputdir_val /path/to/label/val \
--label_outputdir /path/to/label/output
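
The script name suggests patches of 32 x 512 x 512 voxels (z, y, x). How the script handles volume edges is not documented here, so the following is only a sketch of one common tiling scheme: non-overlapping patches, with the last patch on each axis shifted back so that it stays inside the volume. The function names and the example volume shape are hypothetical.

```python
def patch_starts(length, patch, stride=None):
    """Start indices of patches that cover one axis of the given length."""
    stride = stride or patch
    if length <= patch:
        # The axis is no longer than one patch; the caller would need to pad.
        return [0]
    starts = list(range(0, length - patch + 1, stride))
    if starts[-1] + patch < length:
        # Shift one extra patch back so the end of the axis is covered.
        starts.append(length - patch)
    return starts


def patch_grid(volume_shape, patch_shape=(32, 512, 512)):
    """All (z, y, x) patch origins for a volume, in raster order."""
    z_starts, y_starts, x_starts = (
        patch_starts(n, p) for n, p in zip(volume_shape, patch_shape)
    )
    return [(z, y, x) for z in z_starts for y in y_starts for x in x_starts]


# Hypothetical volume shape, chosen only to illustrate the edge handling.
origins = patch_grid((100, 1024, 1024))
print(len(origins), origins[:3])
```

Shifting the last patch back introduces some overlap at the border. Dropping the remainder or padding the volume are equally plausible choices.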

3D Patched Feature Extraction

CUDA_VISIBLE_DEVICES=7 python image_feature_extraction.py \
--dataset_dir /path/to/dataset \
--output_dir /path/to/output \
--ckpt_dir /path/to/checkpoint

Coreset Selection

CUDA_VISIBLE_DEVICES=7 python coreset_select.py \
--dataset_dir /path/to/dataset \
--dataset_feature_dir /path/to/dataset_feature \
--core_set_select_ratio 0.5 \
--output_dir /path/to/output
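
This README does not say which selection rule coreset_select.py applies. As an illustration of what a `--core_set_select_ratio` of 0.5 can mean, here is a minimal sketch of k-center greedy selection, a widely used coreset heuristic: repeatedly pick the sample farthest from everything already selected, until the requested fraction is reached. The function name, the toy feature vectors, and the choice of Euclidean distance are all assumptions.

```python
import math


def k_center_greedy(features, ratio, first=0):
    """Return indices of a coreset covering `ratio` of the samples.

    features: sequence of equal-length numeric vectors
    ratio:    fraction of samples to keep (for example 0.5)
    first:    index of the initial center (an arbitrary choice)
    """
    n = len(features)
    k = max(1, int(n * ratio))
    selected = [first]
    # Distance from every sample to its nearest selected center.
    min_dist = [math.dist(f, features[first]) for f in features]
    while len(selected) < k:
        # The next center is the sample that is currently covered worst.
        nxt = max(range(n), key=min_dist.__getitem__)
        selected.append(nxt)
        for i, f in enumerate(features):
            d = math.dist(f, features[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected


# Toy 2D "features": three tight pairs of points.
demo_features = [(0, 0), (0.1, 0), (5, 5), (5.1, 5), (10, 0), (10, 0.1)]
print(k_center_greedy(demo_features, 0.5))
```

On the toy data the selection takes one point from each pair, which is the behaviour one wants from a coreset: redundant near-duplicates are skipped in favour of coverage.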

nnUNet

Refer to the nnUNet repository for automated data preparation, training, and prediction.


Post-processing from Semantic to Instance Segmentation

python postprocess_semanti_to_instance.py --semantic_pred /path/to/semantic_pred
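
The README does not describe how the script turns a semantic mask into instances. The simplest approach, shown below as an assumed sketch and not as the script's actual method, is connected-component labeling: every group of touching foreground voxels receives its own integer ID. This version uses only the standard library and 6-connectivity in 3D. In practice `skimage.measure.label` from scikit-image (already in the dependency list) does the same job far faster.

```python
from collections import deque


def label_components(mask):
    """Label 6-connected foreground regions of a 3D binary mask.

    mask: nested lists indexed as mask[z][y][x] with values 0 or 1.
    Returns (labels, count), where labels has the same shape as mask and
    holds 0 for background and 1..count for instances.
    """
    depth, height, width = len(mask), len(mask[0]), len(mask[0][0])
    labels = [[[0] * width for _ in range(height)] for _ in range(depth)]
    neighbours = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
                  (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    count = 0
    for z in range(depth):
        for y in range(height):
            for x in range(width):
                if not mask[z][y][x] or labels[z][y][x]:
                    continue
                # Unlabeled foreground voxel: start a new instance here.
                count += 1
                labels[z][y][x] = count
                queue = deque([(z, y, x)])
                while queue:
                    cz, cy, cx = queue.popleft()
                    for dz, dy, dx in neighbours:
                        nz, ny, nx = cz + dz, cy + dy, cx + dx
                        if (0 <= nz < depth and 0 <= ny < height
                                and 0 <= nx < width
                                and mask[nz][ny][nx]
                                and not labels[nz][ny][nx]):
                            labels[nz][ny][nx] = count
                            queue.append((nz, ny, nx))
    return labels, count


# Tiny 2 x 3 x 4 example containing two separate objects.
demo_mask = [
    [[1, 1, 0, 0], [0, 0, 0, 1], [0, 0, 0, 1]],
    [[0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 1]],
]
demo_labels, demo_count = label_components(demo_mask)
print(demo_count)
```

Plain connected components merge objects that touch. Densely packed mitochondria may therefore need an additional splitting step, such as a watershed, which this sketch does not cover.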

Generate Pseudo Labels based on CellSAM

CellSAM repository: https://github.com/vanvalenlab/cellSAM

CUDA_VISIBLE_DEVICES=6 python get_pseudo_label_CellSAM.py
conda activate zs_BIBM
screen -S ZS_BVM_2025
CUDA_VISIBLE_DEVICES=7 python ./CellSAM/get_pseudo_label_CellSAM.py \
--input_2D_raw_folder /path/to/raw \
--output_folder /path/to/output

Citation

If you use this code or data in your research, please cite our work presented at BVM 2025:

Zhao, S., Zhou, Y., Chen, J. (2025). Active Learning Pipeline for Biomedical Image Instance Segmentation with Minimal Human Intervention. In: Palm, C., et al. Bildverarbeitung für die Medizin 2025. BVM 2025. Informatik aktuell. Springer Vieweg, Wiesbaden. https://doi.org/10.1007/978-3-658-47422-5_48

BibTeX

@inproceedings{zhao2025active,
  author    = {Shuo Zhao and Ye Zhou and Jianxu Chen},
  title     = {Active Learning Pipeline for Biomedical Image Instance Segmentation with Minimal Human Intervention},
  booktitle = {Bildverarbeitung f{\"u}r die Medizin 2025 (BVM 2025)},
  editor    = {Christian Palm and others},
  series    = {Informatik aktuell},
  publisher = {Springer Vieweg},
  address   = {Wiesbaden},
  year      = {2025},
  doi       = {10.1007/978-3-658-47422-5_48}
}

License

This project is licensed under the MIT License. See the LICENSE file for details.
