Binary classification of lung nodules (Benign / Malignant) from CT scans. Built on the LUNA25 challenge framework — Group 12.
```
lung_nodule_pipeline/
├── lung_nodule/                 # Main Python package
│   ├── config.py                # Hyperparameters & settings
│   ├── classification/          # 2D / 3D malignancy classifier
│   ├── data/                    # Dataset, patch extraction, augmentation
│   ├── detection/               # MONAI RetinaNet nodule detector
│   ├── models/                  # Model architectures (ResNet152, UNet3D, ViT, ...)
│   ├── pipeline/                # End-to-end orchestration + DICOM→NIfTI
│   ├── reporting/               # Batch report generation
│   └── training/                # Trainer, loss functions, k-fold splits
│
├── docs/
│   ├── TRAINING.md              # Data format, training guide, parameters
│   └── INFERENCE.md             # Inference modes, checkpoint setup, MTN guide
│
├── data/                        # Training data (gitignored — download separately)
│   ├── image/                   # Nodule patches: <AnnotationID>.npy
│   ├── metadata/                # Spatial metadata: <AnnotationID>.npy
│   └── csv/                     # 5-fold split CSVs
│
├── weights/                     # Pre-trained checkpoints (gitignored — download separately)
│   ├── dt_model.ts              # RetinaNet detection model (TorchScript)
│   ├── ResNet152-confirmed/     # 2D classification ensemble (fold_1..5)
│   └── unet3D_encoder_scse/     # 3D classification ensemble (fold0..4)
│
├── train.py                     # Train 5-fold cross-validation
├── infer.py                     # Classify known nodule coordinates (single / batch CSV)
├── predict.py                   # End-to-end DICOM → detect → classify
├── run_report.py                # Batch report across a dataset directory
├── infer_mtn.sh                 # One-shot MTN dataset inference → CSV
│
├── setup.py
├── requirements.txt
└── README.md
```
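The `data/image/` and `data/metadata/` folders pair each nodule patch with a same-named metadata file keyed by `<AnnotationID>`. A minimal sketch of that convention (the array shape and metadata keys below are illustrative assumptions, not the repo's actual format — see docs/TRAINING.md):

```python
import tempfile
from pathlib import Path

import numpy as np

# Mimic the data/ layout in a temp dir: image/ and metadata/ share
# the same <AnnotationID>.npy stem (shape and keys are assumptions).
root = Path(tempfile.mkdtemp())
(root / "image").mkdir()
(root / "metadata").mkdir()

patch = np.random.rand(64, 64, 64).astype(np.float32)    # 3D CT patch
meta = {"origin": np.zeros(3), "spacing": np.ones(3)}    # spatial metadata
np.save(root / "image" / "A0001.npy", patch)
np.save(root / "metadata" / "A0001.npy", meta)           # dict -> object array

# Loading a pair: look up both folders by the same annotation ID.
patch2 = np.load(root / "image" / "A0001.npy")
meta2 = np.load(root / "metadata" / "A0001.npy", allow_pickle=True).item()
print(patch2.shape, sorted(meta2))  # -> (64, 64, 64) ['origin', 'spacing']
```

Saving a Python dict through `np.save` wraps it in a 0-d object array, which is why the metadata load needs `allow_pickle=True` and `.item()`.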
Pre-trained weights are hosted on Google Drive (~1.4 GB). Run the download script:

```bash
bash download_weights.sh
```

This installs gdown and downloads the weights folder directly from:
https://drive.google.com/drive/folders/1LyVA8gn6EF71iCeVbYkefPp5J1MYxpIR
```
weights/
├── dt_model.ts                  # RetinaNet detection model (80 MB)
├── ResNet152-confirmed/         # 2D classification ensemble (5 × 233 MB)
│   └── fold_1..5/best_metric_model.pth
└── unet3D_encoder_scse/         # 3D classification ensemble (5 × 55 MB)
    └── best_metric_model_fold0..4.pth
```
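Both classifiers ship as 5-fold ensembles. A common way to combine such checkpoints — and, as an assumption, roughly what ensemble inference does here — is to average the per-fold malignancy probabilities before thresholding:

```python
import numpy as np

# Hypothetical per-fold malignancy probabilities for one nodule,
# e.g. one value per checkpoint in ResNet152-confirmed/fold_1..5/.
fold_probs = np.array([0.81, 0.74, 0.88, 0.79, 0.83])

# Simple mean ensemble: average the five folds, then threshold at 0.5.
p_malignant = fold_probs.mean()
label = "Malignant" if p_malignant >= 0.5 else "Benign"
print(f"{p_malignant:.3f} -> {label}")  # -> 0.810 -> Malignant
```

Averaging probabilities (rather than hard votes) keeps a calibrated score that can be reported alongside the Benign/Malignant label.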
```bash
# Create and activate environment
conda create -n lung_nodule python=3.11 -y
conda activate lung_nodule

# Install PyTorch (adjust cu121 to match your CUDA version)
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121

# Install all remaining dependencies
pip install -r requirements.txt
```

Verify:

```bash
python -c "import torch, monai, SimpleITK, timm; print('OK')"
```

See docs/TRAINING.md for the full guide, including:
- Data directory layout and CSV format
- Generating 5-fold splits
- All `train.py` arguments and available model architectures
- Expected output structure and metrics
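Generating the 5-fold splits typically means a stratified split, so every fold keeps the same benign/malignant ratio. A minimal sketch with scikit-learn (the `AnnotationID`/`label` column names are assumptions — docs/TRAINING.md has the real CSV schema):

```python
import pandas as pd
from sklearn.model_selection import StratifiedKFold

# Toy annotation table (column names are illustrative assumptions).
df = pd.DataFrame({
    "AnnotationID": [f"A{i:04d}" for i in range(20)],
    "label": [0, 1] * 10,   # 0 = benign, 1 = malignant
})

# Stratified 5-fold: each validation fold preserves the class ratio.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, val_idx) in enumerate(skf.split(df, df["label"])):
    fold_df = df.iloc[val_idx]
    # The real pipeline would write these under data/csv/, e.g.:
    # fold_df.to_csv(f"data/csv/fold{fold}.csv", index=False)
    print(fold, len(fold_df), fold_df["label"].mean())
```

Fixing `random_state` makes the folds reproducible across runs, which matters when the five trained checkpoints are later ensembled.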
Quick start:

```bash
python train.py \
    --image_dir ./data \
    --csv_dir ./data/csv \
    --output_dir ./checkpoints \
    --model ResNet152 \
    --epochs 200
```

See docs/INFERENCE.md for the full guide, including:
- Checkpoint setup (pre-trained vs. custom)
- Mode A — single nodule from known coordinates
- Mode B — batch CSV inference
- Mode C — MTN dataset: extract ZIPs → detect → classify → CSV (`infer_mtn.sh`)
- Model types: 2D ResNet152 / 3D UNet3D+scSE / both
- Coordinate system reference and troubleshooting
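The `--coord_*` values are world coordinates in millimetres, which inference must map to voxel indices inside the CT volume. A minimal sketch of that mapping for an axis-aligned image (SimpleITK's `TransformPhysicalPointToIndex` handles the general case, including direction matrices; the origin and spacing below are illustrative):

```python
import numpy as np

def world_to_voxel(world, origin, spacing):
    """Map a physical (x, y, z) point in mm to voxel indices,
    assuming an identity direction matrix (axis-aligned image)."""
    return np.round((np.asarray(world) - origin) / spacing).astype(int)

origin = np.array([-200.0, -180.0, -320.0])   # mm, image origin (illustrative)
spacing = np.array([0.7, 0.7, 1.25])          # mm per voxel (illustrative)

idx = world_to_voxel([-34.3, 44.2, -49.3], origin, spacing)
print(idx)  # -> [237 320 217]
```

A quick sanity check: a point exactly at the origin must map to voxel `[0, 0, 0]`; getting this wrong (e.g. by mixing up world-mm and voxel inputs) is a classic source of patches extracted from the wrong location.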
Quick start (MTN dataset):

```bash
bash infer_mtn.sh /path/to/MTN/ --output_dir ./output/mtn
```

Single nodule:

```bash
python infer.py \
    --ct patient.nii.gz \
    --coord_x -34.3 \
    --coord_y 44.2 \
    --coord_z -49.3
```
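For Mode B, `infer.py` consumes a CSV listing one nodule per row. A minimal sketch of building such a file with the standard library (the column names are assumptions — check docs/INFERENCE.md for the exact schema and flag names):

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical batch file: one row per nodule to classify
# (column names mirror the single-nodule CLI flags above).
rows = [
    {"ct": "patient01.nii.gz", "coord_x": -34.3, "coord_y": 44.2, "coord_z": -49.3},
    {"ct": "patient02.nii.gz", "coord_x": 12.1, "coord_y": -8.7, "coord_z": 105.0},
]

csv_path = Path(tempfile.mkdtemp()) / "nodules.csv"
with open(csv_path, "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["ct", "coord_x", "coord_y", "coord_z"])
    writer.writeheader()
    writer.writerows(rows)

# infer.py's batch mode would then be pointed at this file
# (see docs/INFERENCE.md for the exact invocation).
print(csv_path.read_text().splitlines()[0])  # -> ct,coord_x,coord_y,coord_z
```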