Curriculum Learning for Language-guided, Multi-modal Detection of Various Pathologies

Laurenz Adrian Heidrich, Aditya Rastogi, Priyank Upadhya, Gianluca Brugnara, Martha Foltyn-Dumitru, Benedikt Wiestler, Philipp Vollmuth

📝 Overview

This repository contains the official implementation of the methods described in the paper. For convenience, it bundles a full copy of the mmdetection framework, extended with curriculum learning functionality. If you already use mmdetection, you can instead copy the config files and the code extensions (exact filenames are listed below) into your own mmdetection installation.

✅ Project Checklist

🔧 Features and Extensions

Extensions to MMDetection:
- Curriculum learning integration (check out mmdetection/mmdet/datasets/dataset_wrappers.py and mmdetection/mmdet/engine/runner/loops.py)
- Support for multiple datasets in curriculum learning

📁 Configuration

MMDetection config files used for training

🚀 Examples and Notebooks

Minimal example scripts for:
- Training
- Testing
Jupyter notebook to generate difficulty scores based on minimal object size
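As a rough illustration of what a size-based difficulty score could look like, the sketch below reads COCO-style annotations and rates an image as harder the smaller its smallest bounding box is. The function name and the linear rescaling to [0, 1] are assumptions made for this example; the notebook in this repository may compute the score differently.

```python
from collections import defaultdict


def difficulty_from_min_object_size(coco):
    """Map image_id -> difficulty in [0, 1]; a smaller minimal box means harder.

    `coco` is a dict with a COCO-style "annotations" list. Images without
    any annotation are not scored and do not appear in the result.
    """
    min_area = defaultdict(lambda: float("inf"))
    for ann in coco["annotations"]:
        _, _, w, h = ann["bbox"]  # COCO boxes are [x, y, width, height]
        image_id = ann["image_id"]
        min_area[image_id] = min(min_area[image_id], w * h)
    if not min_area:
        return {}
    lo, hi = min(min_area.values()), max(min_area.values())
    span = hi - lo
    # Hypothetical choice: rescale linearly so the image whose smallest
    # object is largest gets 0.0 (easiest) and the opposite extreme gets 1.0.
    return {
        image_id: (hi - area) / span if span else 0.0
        for image_id, area in min_area.items()
    }
```

The resulting dictionary can then be used to order samples from easy to hard before training.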

📦 Data

Full datasets
Dataset generation scripts (please refer to COCO and FiftyOne documentation)

Keywords: Medical Image Analysis, Deep Learning, Tumor Detection, Curriculum Learning

📚 Abstract

Pathology detection in medical imaging is crucial for radiologists, but current approaches that train a specialized model for each region of interest often lack efficiency and robustness. This repository provides a language-guided object detection pipeline for medical imaging that uses curriculum learning to train models progressively on increasingly complex samples. Our unified pipeline converts segmentation datasets into bounding box annotations and applies two curriculum learning approaches (a teacher curriculum and a bounding box size curriculum) to train a Grounding DINO model. Evaluations on diverse tumor types in MRI and CT scans demonstrate significant improvements in detection accuracy, with curriculum learning yielding an AP increase of up to 5.2% over baseline models.
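To make the segmentation-to-bounding-box step concrete, here is a minimal sketch that derives one COCO-style box from a 2D binary mask. It is not the repository's conversion code: the function name is hypothetical, and it draws a single box around all foreground pixels rather than one box per connected lesion.

```python
def mask_to_bbox(mask):
    """Return a COCO-style [x, y, width, height] box around nonzero pixels.

    `mask` is a 2D sequence of rows (for example a nested list) where
    nonzero values mark foreground. Returns None for an empty mask.
    """
    # Row indices that contain at least one foreground pixel.
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    # Column indices that contain at least one foreground pixel.
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    x_min, y_min = cols[0], rows[0]
    return [x_min, y_min, cols[-1] - x_min + 1, rows[-1] - y_min + 1]
```

For masks that contain several separate lesions, a connected-component pass before this step would be needed to obtain one box per lesion.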

✨ Features

Unified pipeline for converting segmentation datasets to bounding box annotations
Language-guided detection using Grounding DINO
Curriculum learning strategies:
- Teacher curriculum
- Bounding box size curriculum
Supports multi-modal medical imaging (MRI, CT)
Significant improvements in generalization and detection accuracy across pathologies
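The general idea behind both curriculum strategies, training on easy samples first and adding harder ones over time, can be sketched as follows. The linear pacing schedule, the `start_fraction` parameter, and the function name are assumptions for illustration only; the actual schedule lives in the extended mmdetection dataset wrapper and training loop.

```python
import math


def curriculum_subset(difficulty, epoch, total_epochs, start_fraction=0.3):
    """Return the sample ids available at `epoch` (0-based), easiest first.

    `difficulty` maps sample id -> score, where a lower score means easier.
    The scores could come from a teacher model or from bounding box size.
    """
    ordered = sorted(difficulty, key=difficulty.get)
    # Hypothetical linear pacing: start with `start_fraction` of the data
    # and reach the full dataset in the final epoch.
    progress = min(1.0, epoch / max(1, total_epochs - 1))
    fraction = start_fraction + (1.0 - start_fraction) * progress
    keep = max(1, math.ceil(fraction * len(ordered)))
    return ordered[:keep]
```

With four samples, three epochs, and `start_fraction=0.5`, this exposes the two easiest samples in the first epoch, three in the second, and all four in the last.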

🚀 Getting Started

Prerequisites

This project follows the MMDetection framework's official installation instructions with specific requirements.

1. Create and activate the Conda environment

conda create --name openmmlab python=3.8 -y
conda activate openmmlab

2. Install PyTorch

conda install pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=12.1 -c pytorch -c nvidia

⚙️ Installation

3. Install core dependencies

pip install -U openmim
pip install fsspec
mim install mmengine
mim install "mmcv==2.1.0"

4. Clone this repository

git clone https://github.com/laurenzheidrich/LangBoxMed.git
cd LangBoxMed/mmdetection/
pip install -v -e .

✅ Verification

6. Download a test model and config

mim download mmdet --config rtmdet_tiny_8xb32-300e_coco --dest .

7. Run an inference demo to verify the installation

python demo/image_demo.py demo/demo.jpg rtmdet_tiny_8xb32-300e_coco.py \
        --weights rtmdet_tiny_8xb32-300e_coco_20220902_112414-78e30dcc.pth \
        --device cpu

📦 Additional Libraries

8. Install additional required libraries

pip install fairscale
pip install transformers

🗃️ Use the repository

9. Run Inference with trained model

First download the weights from Google Drive

Then run inference using

!python mmdetection/demo/image_demo.py demo_image.png \
        mmdetection/configs/mm_grounding_dino/grounding_dino_swin-t_finetune_ts_multiple_curriculum.py \
        --weights model_teacher.pth \
        --texts 'glioma . brain metastasis . kidney tumor . liver tumor' -c
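The `--texts` argument in the command above is a single string with class names separated by ` . `. If the class list is kept in Python, a small helper such as the one below can build that string. The helper name is hypothetical and not part of this repository or of mmdetection.

```python
def build_text_prompt(class_names):
    """Join class names into the ' . '-separated prompt used with --texts."""
    # Drop empty entries and surrounding whitespace before joining.
    cleaned = [name.strip() for name in class_names if name.strip()]
    return " . ".join(cleaned)


prompt = build_text_prompt(
    ["glioma", "brain metastasis", "kidney tumor", "liver tumor"]
)
print(prompt)
```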

10. Train a new model

! ./mmdetection/tools/dist_train.sh mmdetection/configs/mm_grounding_dino/sample_config_file.py 2

The config files grounding_dino_swin-t_finetune_ts_multiple.py and grounding_dino_swin-t_finetune_ts_multiple_curriculum.py show how to train on multiple datasets, how to apply undersampling (also while concatenating datasets), and how to enable curriculum learning.
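Conceptually, undersampling during concatenation caps how many samples each dataset contributes, so that a large dataset does not dominate the merged training set. The sketch below shows that idea in plain Python. It is only an illustration under assumed names (`concat_with_undersampling`, `max_per_dataset`); in this repository the behaviour is configured through the mmdetection dataset wrappers in the config files named above.

```python
import random


def concat_with_undersampling(datasets, max_per_dataset, seed=0):
    """Merge datasets, keeping at most `max_per_dataset` samples from each.

    `datasets` maps a dataset name to a list of samples. The result is a
    list of (dataset_name, sample) pairs.
    """
    rng = random.Random(seed)  # fixed seed keeps the subset reproducible
    merged = []
    for name, samples in datasets.items():
        if len(samples) > max_per_dataset:
            # Randomly draw a subset without replacement.
            samples = rng.sample(samples, max_per_dataset)
        merged.extend((name, sample) for sample in samples)
    return merged
```

Datasets that are already smaller than the cap are kept in full.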
