Cell-Segmentation-Deep-Learning

Nuclei Cell Instance Segmentation Project for AI535 Final Project

Team Members:

  • Brandon Gill
  • Andy Bui

Description

Cell segmentation is a critical process for defining boundaries in microscopic images, enabling quantitative analysis of cell counts, shapes, and molecular content. By accurately identifying individual cells, this technology supports drug discovery, disease research, and spatial tissue analysis, ultimately helping to improve cancer diagnosis and treatment strategies.

  • Architecture: U-Net
  • Loss Function: Binary Cross-Entropy (BCE) + Dice Loss
  • Evaluation Metric: Intersection over Union (IoU)
  • Experiment Tracking: Weights & Biases (WandB)
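To make the loss and metric listed above concrete, here is a minimal, framework-free sketch of the arithmetic. It is written in plain Python over flat lists for readability; the function names, the `smooth` term, and the 0.5 threshold are illustrative assumptions, and the project's own src/loss.py and src/metrics.py may differ (for example by operating on tensors).

```python
import math


def bce_dice_loss(probs, targets, eps=1e-7, smooth=1.0):
    """BCE + Dice loss over flat lists of predicted probabilities and 0/1 targets."""
    n = len(probs)

    # Binary cross-entropy averaged over pixels; clamp to avoid log(0).
    bce = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)
        bce -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    bce /= n

    # Soft Dice coefficient: 2*|P.T| / (|P| + |T|), smoothed so that
    # two empty masks do not cause a division by zero.
    intersection = sum(p * t for p, t in zip(probs, targets))
    dice = (2.0 * intersection + smooth) / (sum(probs) + sum(targets) + smooth)

    return bce + (1.0 - dice)


def iou(probs, targets, threshold=0.5):
    """Intersection over Union after thresholding probabilities to a binary mask."""
    preds = [1 if p >= threshold else 0 for p in probs]
    intersection = sum(p & t for p, t in zip(preds, targets))
    union = sum(p | t for p, t in zip(preds, targets))
    # Two empty masks agree perfectly, so define IoU as 1.0 in that case.
    return intersection / union if union else 1.0
```

BCE penalizes each pixel independently, while the Dice term rewards overlap of the whole mask, which helps when foreground pixels are a small fraction of the image.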

Project Structure

Cell-Segmentation-Deep-Learning/
├── app/
│   └── app.py
├── checkpoints/
│   └── unet_best_model.pth
├── data/
│   ├── augmented/
│   └── data-science-bowl-2018/
│       ├── stage1_test/
│       ├── stage1_train/
│       ├── stage2_test_final/
│       ├── stage1_sample_submission.csv
│       ├── stage1_solution.csv
│       ├── stage1_train_labels.csv
│       └── stage2_sample_submission_final.csv
├── notebooks/
│   └── 01_data_exploration.ipynb
├── outputs/
├── src/
│   ├── __init__.py
│   ├── dataset.py
│   ├── evaluate.py
│   ├── loss.py
│   ├── metrics.py
│   ├── model.py
│   ├── train.py
│   └── utils.py
├── .gitignore
├── README.md
├── requirements.txt
└── train.slurm

Install Dependencies

pip install -r requirements.txt

Data

The data for this project comes from the 2018 Data Science Bowl on Kaggle (Data Science Bowl 2018 Data).
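As distributed by Kaggle, stage1_train contains one folder per image ID, each with an images/ subfolder holding the image and a masks/ subfolder holding one PNG per nucleus. The sketch below lists those pairs with the standard library only; `list_samples` is a hypothetical helper and is not taken from the project's src/dataset.py.

```python
from pathlib import Path


def list_samples(train_dir):
    """Return (image_path, [mask_paths]) for every sample folder in train_dir."""
    samples = []
    for sample_dir in sorted(Path(train_dir).iterdir()):
        if not sample_dir.is_dir():
            continue  # skip stray files such as CSVs
        images = sorted((sample_dir / "images").glob("*.png"))
        masks = sorted((sample_dir / "masks").glob("*.png"))
        if images:
            samples.append((images[0], masks))
    return samples
```

Keeping the per-nucleus masks separate at this stage preserves instance information; they can be merged into a single binary mask later if the model only needs foreground versus background.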

OSU HPC Cheat Sheet

Running on the OSU HPC Cluster

This project is configured to run on the Oregon State University high-performance computing (HPC) cluster using the SLURM workload manager. Since we are performing instance segmentation, utilizing the cluster's GPU nodes is highly recommended for training.

Note: Instance segmentation requires significantly more computation than semantic segmentation, especially when running watershed post-processing or models such as Mask R-CNN or StarDist, so train on the cluster's GPU nodes rather than on a laptop or the login node.
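To illustrate the extra step that separates instance from semantic segmentation, the sketch below labels the 4-connected regions of a binary mask. This is connected-component labeling, a simpler stand-in for the watershed step mentioned above and not necessarily what this project uses; `label_components` is a hypothetical name.

```python
from collections import deque


def label_components(mask):
    """Label 4-connected foreground regions of a binary mask (list of 0/1 rows).

    Returns (labels, count), where labels has the same shape as mask and
    each foreground pixel holds the 1-based index of its region.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]
    count = 0

    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or labels[r][c]:
                continue
            # Found an unlabeled foreground pixel: flood-fill its region.
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] and not labels[ny][nx]:
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count
```

Touching nuclei end up in a single component under this scheme, which is the limitation that watershed and dedicated instance models are meant to address.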


1. Connecting & Setup

First, SSH into the cluster (ensure you are on the OSU VPN if off-campus):

ssh your_onid@submit.hpc.engr.oregonstate.edu

Clone the repository into your hpc-share directory, not directly into your home directory (see Data Management below for the quota reason):

cd ~/hpc-share
git clone https://github.com/BGill8/Cell-Segmentation-Deep-Learning.git
cd Cell-Segmentation-Deep-Learning

2. Environment Configuration

Do not install packages directly to your base environment. Load the necessary CUDA and Python modules, then create a virtual environment.

# Load necessary modules (adjust versions based on current HPC availability)
module load python/3.10
module load cuda/11.8
module load slurm

# Create and activate virtual environment
python -m venv .venv
source .venv/bin/activate

# Install dependencies
pip install --upgrade pip
pip install -r requirements.txt

3. Data Management

Important: Do not store data or environments directly in your home directory (~). Its quota is small, and installs and jobs fail once it is exceeded. Keep everything under the high-capacity ~/hpc-share/ directory.

Since the Kaggle CLI requires API key configuration, the easiest way to get the data onto the cluster is to download the data-science-bowl-2018.zip to your local machine, and securely copy it over:

1. Run this on your LOCAL machine (Mac/Windows), not the cluster:

scp /path/to/your/downloads/data-science-bowl-2018.zip your_onid@submit.hpc.engr.oregonstate.edu:/nfs/stak/users/your_onid/hpc-share/Cell-Segmentation-Deep-Learning/

2. Extract and organize on the cluster:

cd ~/hpc-share/Cell-Segmentation-Deep-Learning/
mkdir -p data/data-science-bowl-2018
unzip data-science-bowl-2018.zip -d data/data-science-bowl-2018/

# Extract the inner training and testing folders
cd data/data-science-bowl-2018
for f in *.zip; do unzip -d "${f%.zip}" "$f"; done

# CRITICAL: Delete all zip files to free up quota
rm *.zip
cd ../../
rm data-science-bowl-2018.zip
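The extraction steps above can also be sketched in Python with the standard `zipfile` module, which may be convenient inside a notebook or setup script. `extract_nested` is a hypothetical helper; like the shell loop, it extracts each inner archive into a folder named after the archive and then deletes the archives to free quota.

```python
import zipfile
from pathlib import Path


def extract_nested(outer_zip, dest_dir, delete_archives=True):
    """Extract outer_zip into dest_dir, then extract every inner *.zip it held."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    with zipfile.ZipFile(outer_zip) as zf:
        zf.extractall(dest)

    # Each inner archive goes into a folder named after it, e.g.
    # stage1_train.zip -> stage1_train/
    for inner in sorted(dest.glob("*.zip")):
        with zipfile.ZipFile(inner) as zf:
            zf.extractall(dest / inner.stem)
        if delete_archives:
            inner.unlink()

    if delete_archives:
        Path(outer_zip).unlink()

    return sorted(p.name for p in dest.iterdir())
```

For example, `extract_nested("data-science-bowl-2018.zip", "data/data-science-bowl-2018")` run from the project root mirrors the commands above.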

4. Weights & Biases (WandB) Login

Before submitting a job, you must authenticate WandB on the cluster to track instance segmentation metrics (training loss, IoU, mask visualizations, etc.):

wandb login

Paste your API key when prompted.
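Because a batch job cannot answer an interactive login prompt, it can help to confirm that credentials are in place before submitting. The sketch below assumes that WandB reads the `WANDB_API_KEY` environment variable and that `wandb login` stores the key in ~/.netrc under the host `api.wandb.ai`; `wandb_key_available` is a hypothetical helper, and the check does not validate the key itself.

```python
import netrc
import os
from pathlib import Path


def wandb_key_available(netrc_path=None, env=None):
    """Return True if a WandB API key appears to be configured on this machine."""
    env = os.environ if env is None else env

    # An environment variable takes effect without any login step.
    if env.get("WANDB_API_KEY"):
        return True

    # Otherwise look for the entry that `wandb login` is assumed to write.
    path = Path(netrc_path) if netrc_path else Path.home() / ".netrc"
    try:
        auth = netrc.netrc(str(path)).authenticators("api.wandb.ai")
    except (OSError, netrc.NetrcParseError):
        return False
    return auth is not None and bool(auth[2])
```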


5. Submitting a Training Job

Never run python src/train.py directly on the login node. Always submit a batch job using SLURM.

Create train.slurm in the project root, or edit the copy already included in the repository so that it matches your account and paths:

#!/bin/bash
#SBATCH --job-name=nuclei_instance_seg
#SBATCH --partition=dgx2           # Specify the GPU partition (e.g., dgx2 or gpu)
#SBATCH --gres=gpu:1               # Request 1 GPU
#SBATCH --cpus-per-task=4          # Number of CPU cores for data loading
#SBATCH --mem=32G                  # Memory required
#SBATCH --time=12:00:00            # Maximum time limit (hrs:min:sec)
#SBATCH --output=outputs/slurm-%j.out

# Load the same modules and virtual environment set up in step 2
# (adjust versions based on current HPC availability)
module load python/3.10 cuda/11.8
source .venv/bin/activate

# Run the training script (relative paths resolve from the directory where sbatch is run)
python src/train.py --data_path data/data-science-bowl-2018/stage1_train

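Submit the job from the project root. SLURM does not create the directory named in --output, so make sure outputs/ exists first:

mkdir -p outputs
sbatch train.slurm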
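On success, sbatch prints a line of the form "Submitted batch job <JOB_ID>". If you script submissions, a small helper can capture that ID and derive the log file name from the --output pattern in the script above. The function names here are hypothetical, and `submit` needs sbatch on the PATH, so it only works on the cluster.

```python
import re
import subprocess


def parse_job_id(sbatch_output):
    """Extract the numeric job ID from sbatch's confirmation message."""
    match = re.search(r"Submitted batch job (\d+)", sbatch_output)
    if match is None:
        raise ValueError(f"unexpected sbatch output: {sbatch_output!r}")
    return int(match.group(1))


def log_path(job_id, pattern="outputs/slurm-%j.out"):
    """Expand SLURM's %j placeholder the same way --output does."""
    return pattern.replace("%j", str(job_id))


def submit(script="train.slurm"):
    """Submit a batch script and return (job_id, expected_log_path)."""
    result = subprocess.run(
        ["sbatch", script], capture_output=True, text=True, check=True
    )
    job_id = parse_job_id(result.stdout)
    return job_id, log_path(job_id)
```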

6. Monitoring Progress

Method             Command / Location
SLURM queue        squeue -u your_onid
Live logs          tail -f outputs/slurm-<JOB_ID>.out
Metrics dashboard  WandB project dashboard (loss, IoU, mask visualizations)

7. Recommended argparse Setup for src/train.py

To cleanly pass the scratch directory path (and other hyperparameters) from your SLURM script via the command line, add the following argument parser to src/train.py:

import argparse

def get_args():
    parser = argparse.ArgumentParser(description="Train nuclei instance segmentation model")

    # Data
    parser.add_argument("--data_path", type=str,
                        default="data/data-science-bowl-2018/stage1_train",
                        help="Path to training data (use scratch path on HPC)")

    # Training hyperparameters
    parser.add_argument("--epochs",       type=int,   default=50)
    parser.add_argument("--batch_size",   type=int,   default=8)
    parser.add_argument("--lr",           type=float, default=1e-4)
    parser.add_argument("--num_workers",  type=int,   default=4,
                        help="DataLoader workers — match --cpus-per-task in SLURM script")

    # Experiment tracking
    parser.add_argument("--wandb_project", type=str, default="cell-segmentation")
    parser.add_argument("--run_name",      type=str, default=None)

    return parser.parse_args()


if __name__ == "__main__":
    args = get_args()
    # e.g., dataset = CellDataset(root=args.data_path)
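One way to keep --num_workers in step with the SLURM allocation is to read it from the environment. SLURM exports `SLURM_CPUS_PER_TASK` inside a job when --cpus-per-task is set, so the sketch below uses it as the default and falls back to 4 elsewhere. `default_num_workers` and `build_parser` are hypothetical names, and only the relevant argument is shown.

```python
import argparse
import os


def default_num_workers(env=None, fallback=4):
    """Use the SLURM CPU allocation as the DataLoader worker count if present."""
    env = os.environ if env is None else env
    try:
        return max(int(env["SLURM_CPUS_PER_TASK"]), 0)
    except (KeyError, ValueError):
        # Not inside a SLURM job (or the value is malformed): use the fallback.
        return fallback


def build_parser(env=None):
    parser = argparse.ArgumentParser(description="Train nuclei segmentation model")
    parser.add_argument("--num_workers", type=int,
                        default=default_num_workers(env),
                        help="DataLoader workers (defaults to SLURM_CPUS_PER_TASK)")
    return parser
```

An explicit --num_workers on the command line still overrides the environment-derived default.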

Then in your SLURM script, you can override any default at submission time:

python src/train.py \
    --data_path /scratch/your_onid/data-science-bowl-2018/stage1_train \
    --epochs 100 \
    --batch_size 16 \
    --run_name "unet_run1"

Quick Reference Cheatsheet

# Check available modules
module spider python
module spider cuda

# Check job queue
squeue -u your_onid

# Cancel a job
scancel <JOB_ID>

# Check cluster node availability
sinfo -p gpu

# Check your scratch storage usage
du -sh /scratch/your_onid/
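If you want to log storage use from inside a Python script, for example before writing checkpoints, the following sketch sums file sizes under a directory. It reports apparent file sizes, so its numbers can differ somewhat from du, which counts disk blocks; both function names are hypothetical.

```python
from pathlib import Path


def dir_size_bytes(root):
    """Total size in bytes of all regular files under root (symlinks skipped)."""
    return sum(
        p.stat().st_size
        for p in Path(root).rglob("*")
        if p.is_file() and not p.is_symlink()
    )


def human(n):
    """Format a byte count with binary units, similar in spirit to du -h."""
    for unit in ("B", "K", "M", "G", "T"):
        if n < 1024 or unit == "T":
            return f"{n:.1f}{unit}"
        n /= 1024
```

For example, `print(human(dir_size_bytes("checkpoints")))` shows how much space saved models occupy.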
