
Official implementation of the paper "Curriculum Learning for Language-guided, Multi-modal Detection of Various Pathologies"


CCI-Bonn/CL4OD

 
 


Curriculum Learning for Language-guided, Multi-modal Detection of Various Pathologies

Laurenz Adrian Heidrich, Aditya Rastogi, Priyank Upadhya, Gianluca Brugnara, Martha Foltyn-Dumitru, Benedikt Wiestler, Philipp Vollmuth

OpenReview


📝 Overview

This repository contains the official implementation of the methods described in the paper. To work out of the box, it bundles a copy of the entire mmdetection framework and extends it with curriculum learning functionality. If you already use mmdetection, you can instead copy the config files and the code extensions (exact filenames below) into your own mmdetection installation.

✅ Project Checklist

🔧 Features and Extensions

  • Extensions to MMDetection:
    • Curriculum learning integration (check out mmdetection/mmdet/datasets/dataset_wrappers.py and mmdetection/mmdet/engine/runner/loops.py)
    • Support for multiple datasets in curriculum learning

📁 Configuration

  • MMDetection config files used for training

🚀 Examples and Notebooks

  • Minimal example scripts for:
    • Training
    • Testing
  • Jupyter notebook to generate difficulty scores based on minimal object size
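The notebook itself is not reproduced here. As a rough illustration of the idea, the sketch below derives a per-image difficulty score from the smallest bounding box in a COCO-style annotation dict. The function names and the normalization to [0, 1] are assumptions made for this example, not the notebook's actual code.

```python
def min_box_area_per_image(coco):
    """Map image_id -> area of the smallest bounding box in that image."""
    smallest = {}
    for ann in coco["annotations"]:
        _, _, width, height = ann["bbox"]  # COCO boxes are [x, y, width, height]
        area = width * height
        img_id = ann["image_id"]
        smallest[img_id] = min(area, smallest.get(img_id, area))
    return smallest


def difficulty_scores(coco):
    """Hypothetical score in [0, 1]: the smaller the smallest object, the harder."""
    smallest = min_box_area_per_image(coco)
    if not smallest:
        return {}
    largest_area = max(smallest.values())
    if largest_area == 0:
        # Degenerate case: every box has zero area, treat all images as hardest.
        return {img_id: 1.0 for img_id in smallest}
    return {img_id: 1.0 - area / largest_area for img_id, area in smallest.items()}
```

Images without annotations do not appear in the result; how to rank them is left open in this sketch.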

📦 Data

  • Full datasets
  • Dataset generation scripts (please refer to COCO and FiftyOne documentation)

Keywords: Medical Image Analysis, Deep Learning, Tumor Detection, Curriculum Learning


📚 Abstract

Pathology detection in medical imaging is crucial for radiologists, but current approaches that train specialized models for each region of interest often lack efficiency and robustness. This repository provides a novel language-guided object detection pipeline for medical imaging, leveraging curriculum learning strategies to progressively train models on increasingly complex samples. Our unified pipeline converts segmentation datasets into bounding box annotations and applies two curriculum learning approaches—teacher curriculum and bounding box size curriculum—to train a Grounding DINO model. Evaluations on diverse tumor types in MRI and CT scans demonstrate significant improvements in detection accuracy, with curriculum learning yielding up to 5.2% AP increase over baseline models.
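To make "progressively train on increasingly complex samples" concrete, here is a minimal sketch of a curriculum schedule. It assumes each sample already has a difficulty score (for example from a teacher model or from bounding box size) and uses a linear pacing function. The function name, the `start_fraction` parameter, and the linear pacing are illustrative assumptions; the schedule used in the paper may differ.

```python
def curriculum_subset(difficulties, epoch, total_epochs, start_fraction=0.3):
    """Return the indices of the samples available at `epoch`, easiest first."""
    # Sort sample indices from easiest (lowest score) to hardest.
    order = sorted(range(len(difficulties)), key=lambda i: difficulties[i])
    # Linear pacing: grow from start_fraction of the data to all of it.
    progress = min(1.0, epoch / max(1, total_epochs - 1))
    fraction = start_fraction + (1.0 - start_fraction) * progress
    keep = max(1, round(fraction * len(order)))
    return order[:keep]


if __name__ == "__main__":
    scores = [0.9, 0.1, 0.5, 0.3]
    for epoch in range(3):
        print(epoch, curriculum_subset(scores, epoch, total_epochs=3, start_fraction=0.5))
```

Early epochs see only the easiest samples, and the final epoch sees the whole dataset.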


✨ Features

  • Unified pipeline for converting segmentation datasets to bounding box annotations
  • Language-guided detection using Grounding DINO
  • Curriculum learning strategies:
    • Teacher curriculum
    • Bounding box size curriculum
  • Supports multi-modal medical imaging (MRI, CT)
  • Significant improvements in generalization and detection accuracy across pathologies
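The segmentation-to-bounding-box conversion can be illustrated with a small sketch. It takes a 2D binary mask (a nested list, or anything that iterates row by row) and returns a single COCO-style box `[x, y, width, height]` around all foreground pixels. The function name is hypothetical, and the repository's pipeline may handle multiple lesions per slice or 3D volumes differently.

```python
def mask_to_coco_bbox(mask):
    """Return [x, y, width, height] enclosing all nonzero pixels, or None if empty."""
    # Rows that contain at least one foreground pixel.
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    # Columns that contain at least one foreground pixel.
    n_cols = len(mask[0])
    cols = [c for c in range(n_cols) if any(row[c] for row in mask)]
    x_min, x_max = cols[0], cols[-1]
    y_min, y_max = rows[0], rows[-1]
    # COCO uses the top-left corner plus width and height, in pixels.
    return [x_min, y_min, x_max - x_min + 1, y_max - y_min + 1]
```

For example, a mask whose foreground covers rows 1 to 2 and columns 1 to 2 yields `[1, 1, 2, 2]`.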

🚀 Getting Started

Prerequisites

Installation follows the official MMDetection instructions, with the specific versions pinned in the steps below.

1. Create and activate the Conda environment

conda create --name openmmlab python=3.8 -y
conda activate openmmlab

2. Install PyTorch

conda install pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=12.1 -c pytorch -c nvidia

⚙️ Installation

3. Install core dependencies

pip install -U openmim
pip install fsspec
mim install mmengine
mim install "mmcv==2.1.0"

4. Clone this repository

git clone https://github.com/laurenzheidrich/LangBoxMed.git
cd LangBoxMed/mmdetection/

5. Install the bundled mmdetection in editable mode

pip install -v -e .

✅ Verification

6. Download a test model and config

mim download mmdet --config rtmdet_tiny_8xb32-300e_coco --dest .

7. Run an inference demo to verify the installation

python demo/image_demo.py demo/demo.jpg rtmdet_tiny_8xb32-300e_coco.py \
    --weights rtmdet_tiny_8xb32-300e_coco_20220902_112414-78e30dcc.pth \
    --device cpu
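Alongside the inference demo, it can help to confirm that the pinned versions from the steps above are the ones actually installed. The following is a minimal sketch using only the standard library. The `PINNED` table and the function name are assumptions made for this example and are not part of the repository.

```python
from importlib import metadata

# Versions pinned in the installation steps (distribution name -> version).
PINNED = {"torch": "2.1.0", "torchvision": "0.16.0", "mmcv": "2.1.0"}


def check_versions(pinned=PINNED):
    """Return {name: (installed_version_or_None, expected_version, matches)}."""
    report = {}
    for name, expected in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Some builds append a local tag such as "+cu121"; compare the public part only.
        ok = installed is not None and installed.split("+")[0] == expected
        report[name] = (installed, expected, ok)
    return report


if __name__ == "__main__":
    for name, (installed, expected, ok) in check_versions().items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: installed={installed} expected={expected} [{status}]")
```

A mismatch is not necessarily fatal, but it is the first thing to rule out if the demo fails.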

📦 Additional Libraries

8. Install additional required libraries

pip install fairscale
pip install transformers

🗃️ Use the repository

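9. Run inference with a trained model

First, download the weights from Google Drive.

Then run inference with the command below. The leading ! is Jupyter notebook syntax; drop it when running in a regular shell.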

!python mmdetection/demo/image_demo.py demo_image.png \
        mmdetection/configs/mm_grounding_dino/grounding_dino_swin-t_finetune_ts_multiple_curriculum.py \
        --weights model_teacher.pth \
        --texts 'glioma . brain metastasis . kidney tumor . liver tumor' -c
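The `--texts` argument in the command above is a single string in which class names are separated by " . ". When the class list lives in Python, a small helper can build both the prompt and the argument list for `subprocess.run`. This is a sketch; the helper names are hypothetical and only the flags shown in the command above are used.

```python
def build_text_prompt(class_names):
    """Join class names with ' . ' as in the --texts example."""
    cleaned = [name.strip() for name in class_names if name.strip()]
    return " . ".join(cleaned)


def build_demo_command(image, config, weights, class_names):
    """Assemble the image_demo.py call as an argument list (it is not executed here)."""
    return [
        "python", "mmdetection/demo/image_demo.py", image, config,
        "--weights", weights,
        "--texts", build_text_prompt(class_names),
        "-c",
    ]


if __name__ == "__main__":
    classes = ["glioma", "brain metastasis", "kidney tumor", "liver tumor"]
    print(build_text_prompt(classes))
```

Passing the result of `build_demo_command` to `subprocess.run` avoids shell quoting issues with the spaces in the prompt.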

10. Train a new model

The last argument to dist_train.sh is the number of GPUs (2 in this example).

! ./mmdetection/tools/dist_train.sh mmdetection/configs/mm_grounding_dino/sample_config_file.py 2

See the config files grounding_dino_swin-t_finetune_ts_multiple.py and grounding_dino_swin-t_finetune_ts_multiple_curriculum.py for how to combine multiple datasets, undersample them (including during concatenation), and enable curriculum learning.
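For intuition, undersampling during concatenation can be sketched in plain Python: each dataset is capped at a maximum number of samples before the datasets are merged, so that a large dataset does not dominate the smaller ones. This does not mirror the keys used in the config files; the function name and the `max_per_dataset` and `seed` parameters are assumptions made for this example.

```python
import random


def undersample_and_concat(datasets, max_per_dataset, seed=0):
    """datasets: dict of name -> list of samples. Returns a list of (name, sample)."""
    rng = random.Random(seed)  # fixed seed keeps the subset reproducible
    combined = []
    for name, samples in datasets.items():
        if len(samples) > max_per_dataset:
            # Randomly keep only max_per_dataset samples from oversized datasets.
            samples = rng.sample(samples, max_per_dataset)
        combined.extend((name, sample) for sample in samples)
    return combined


if __name__ == "__main__":
    data = {"glioma": list(range(100)), "kidney tumor": list(range(5))}
    merged = undersample_and_concat(data, max_per_dataset=10)
    print(len(merged))
```

Datasets smaller than the cap are kept whole.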
