A modular framework for training and evaluating deep learning models on image segmentation tasks. It does not offer fine-grained, out-of-the-box solutions; instead, it is well suited to learning, rapid prototyping, and validating research ideas. The codebase is well documented and kept deliberately simple to support both learning and easy extension.
This repository provides tools for image segmentation using deep learning. It features a clean, modular codebase with support for common segmentation architectures and standard training workflows.
- Multiple Architectures: U-Net and Attention U-Net implementations
- Modular Design: Organized code structure with separate modules for datasets, networks, and utilities
- Dynamic Configuration: Reflection-based mechanism that automatically loads the classes named in the configuration, making the framework easy to extend
- Configuration Management: YAML-based hyperparameter configuration
- Training Pipeline: Standard training loop with validation and checkpointing
- Experiment Tracking: Optional Weights & Biases integration
- Model Export: ONNX format support for deployment
- Clone the repository:

  ```bash
  git clone https://github.com/liu-bodong/CV-Lab.git
  cd CV-Lab
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```
For a consistent environment with GPU support:
- Clone the repository:

  ```bash
  git clone https://github.com/liu-bodong/CV-Lab.git
  cd CV-Lab
  ```

- Using Docker Compose (Recommended):

  ```bash
  docker-compose up -d dev
  docker-compose exec dev bash
  ```

- Using Docker directly:

  ```bash
  # Build the image
  docker build -t cv-lab .

  # Run with GPU support
  docker run --gpus all -v $(pwd):/app -it cv-lab bash
  ```
Use pre-built images without building locally:
- From GitHub Container Registry:

  ```bash
  # Pull the image
  docker pull ghcr.io/liu-bodong/cv-lab:latest

  # Clone repository for code and configs
  git clone https://github.com/liu-bodong/CV-Lab.git
  cd CV-Lab

  # Run with pre-built image
  docker run --gpus all -v $(pwd):/app -it ghcr.io/liu-bodong/cv-lab:latest bash
  ```

- From Docker Hub:

  ```bash
  # Pull the image
  docker pull liubodong/cv-lab:latest

  # Clone repository for code and configs
  git clone https://github.com/liu-bodong/CV-Lab.git
  cd CV-Lab

  # Run with pre-built image
  docker run --gpus all -v $(pwd):/app -it liubodong/cv-lab:latest bash
  ```

- Using Docker Compose with a pre-built image:

  ```yaml
  # Modify docker-compose.yml to use the pre-built image:
  services:
    dev:
      image: ghcr.io/liu-bodong/cv-lab:latest
      # Remove the 'build' section
  ```
Benefits of pre-built images:
- No build time required
- Consistent environment across different machines
- Faster setup for CI/CD pipelines
- Pre-tested configurations
- Start training:

  ```bash
  python train.py --config hyper.yaml
  ```
- Base Image: PyTorch 2.7.1 with CUDA 12.8 and cuDNN 9
- GPU Support: Automatic GPU detection and usage
- Volume Mounting: Code, data, and outputs are mounted for persistence
- Development: Interactive development with live code changes
- Pre-built Images: Available on GitHub Container Registry and Docker Hub
- Image Options: Build locally or use pre-built images for faster setup
- Python 3.8+
- PyTorch 1.9.0+
- torchvision 0.10.0+
- CUDA (optional, for GPU acceleration)
- Docker
- Docker Compose
- NVIDIA Docker runtime (for GPU support)
Note: Docker installation includes all Python dependencies and CUDA support automatically.
Each sub-directory functions as a module that exports specific components, which are looked up by reflection (a loading sketch follows the layout below).
CV-Lab/
├── datasets/ # Dataset classes and data loading utilities
├── networks/ # Model architectures (U-Net, Attention U-Net)
├── utils/ # Training utilities (logging, metrics, loss functions)
├── notebooks/ # Jupyter notebooks for analysis and visualization
├── runs/ # Training outputs and model checkpoints
├── train.py # Main training script
└── hyper.yaml # Configuration file
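The class names written in hyper.yaml are resolved against these modules at runtime. The snippet below is a minimal sketch of how such reflection-based loading can work; the helper name, package layout, and constructor arguments are illustrative assumptions, not the repository's exact implementation.

```python
# Illustrative sketch of reflection-based loading (not the repo's exact code).
# It assumes each package (e.g. networks, datasets) exposes its public classes
# at package level, so a name from hyper.yaml can be looked up directly.
import importlib

import yaml


def build_from_config(module_name: str, class_name: str, **kwargs):
    """Resolve `class_name` inside `module_name` and instantiate it."""
    module = importlib.import_module(module_name)  # e.g. "networks"
    cls = getattr(module, class_name)              # e.g. "UNet" -> class object
    return cls(**kwargs)


with open("hyper.yaml") as f:
    cfg = yaml.safe_load(f)

# The strings in the config must match the exported class names exactly.
# Constructor argument names below are assumptions for illustration.
model = build_from_config("networks", cfg["model_type"],
                          in_channels=cfg["input_channels"],
                          out_channels=cfg["output_channels"])
dataset = build_from_config("datasets", cfg["dataset"], data_dir=cfg["data_dir"])
```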
Implement your dataset class in datasets/ or use the provided datasets as a reference. The framework expects dataset classes to return image-mask pairs.
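As a rough starting point, a custom dataset returning image-mask pairs might look like the sketch below; the directory layout, file naming, and constructor arguments are illustrative assumptions and should be adapted to your data and transforms.

```python
# Hypothetical custom dataset returning (image, mask) pairs.
# Directory layout and file naming are assumptions for illustration.
import os

from PIL import Image
from torch.utils.data import Dataset


class MySegmentationDataset(Dataset):
    def __init__(self, data_dir, image_transform=None, mask_transform=None):
        self.image_dir = os.path.join(data_dir, "images")
        self.mask_dir = os.path.join(data_dir, "masks")
        self.files = sorted(os.listdir(self.image_dir))
        self.image_transform = image_transform
        self.mask_transform = mask_transform

    def __len__(self):
        return len(self.files)

    def __getitem__(self, idx):
        name = self.files[idx]
        image = Image.open(os.path.join(self.image_dir, name)).convert("RGB")
        mask = Image.open(os.path.join(self.mask_dir, name)).convert("L")
        if self.image_transform:
            image = self.image_transform(image)
        if self.mask_transform:
            mask = self.mask_transform(mask)
        return image, mask
```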
Edit hyper.yaml to set your parameters. The framework uses a dynamic configuration system implemented via reflection in Python: each sub-module exports the classes and methods that can be used, and the configuration refers to them by name, so pay close attention to the exact spelling of exported components. The following example gives a glimpse of typical hyperparameter settings:
```yaml
# Model configuration
model_type: ModelClass
image_size: [256, 256]
input_channels: 3
output_channels: 1
channels: [64, 128, 256, 512] # for architectures that require specific channel sizes

# Training configuration
batch_size: 16
epochs: 100
init_lr: 0.0003
dataset: DatasetClass
data_dir: ./data/your_dataset
```

Local Environment:
```bash
python train.py --config hyper.yaml
```

Docker Environment:

```bash
# Start container and enter interactive shell
docker-compose up -d dev
docker-compose exec dev bash

# Run training inside container
python train.py --config hyper.yaml
```

Training outputs are saved to the runs/ directory:
runs/[model_name]_[timestamp]/
├── metrics.csv # Training metrics
├── best.pth # Best model checkpoint
├── last.pth # Final checkpoint
├── summary.yaml # Run configuration
└── plot.png # Training curves
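To reuse a finished run, the saved checkpoint can be loaded back into a model instance. The sketch below assumes best.pth stores a plain state_dict and uses illustrative constructor arguments and an assumed export name; check the run's summary.yaml for its actual configuration.

```python
# Sketch: reload the best checkpoint from a run for evaluation/inference.
# Assumes best.pth holds a state_dict and that UNet takes these arguments;
# adjust both to the actual run (see its summary.yaml).
import torch

from networks import UNet  # assumed export name

model = UNet(in_channels=3, out_channels=1, channels=[64, 128, 256, 512])
state = torch.load("runs/<model_name>_<timestamp>/best.pth",  # replace with a real run directory
                   map_location="cpu")
model.load_state_dict(state)
model.eval()

with torch.no_grad():
    dummy = torch.randn(1, 3, 256, 256)          # one RGB image at the configured size
    prediction = torch.sigmoid(model(dummy))     # per-pixel probabilities for a single output class
```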
The Docker setup is optimized for development with live code changes:
```bash
# Start development container
docker-compose up -d dev

# Enter container shell
docker-compose exec dev bash

# Code changes on the host are immediately reflected in the container
# Data and outputs persist in mounted volumes
```

- Code: `.:/app` - Live code editing
- Data: `./data:/app/data` - Persistent dataset storage
- Outputs: `./runs:/app/runs` - Training results persist on host
- Wandb: `./wandb:/app/wandb` - Experiment tracking logs
The container automatically detects and uses available GPUs:
```yaml
deploy:
  resources:
    reservations:
      devices:
        - driver: nvidia
          capabilities: [gpu]
```

```bash
# Stop container
docker-compose down

# Rebuild after requirements.txt changes
docker-compose build dev

# View logs
docker-compose logs dev
```

Currently, the framework supports the following models:
- ResNet50/101
- MobileNetV1
- U-Net
- Attention U-Net
- More models are under development; feel free to add your own (see the sketch below)!
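Adding a model amounts to dropping a new nn.Module into networks/, exporting it from the package, and referencing its class name as model_type in hyper.yaml. The module name and constructor signature below are illustrative assumptions; align them with how the existing networks are defined.

```python
# networks/tiny_segnet.py -- hypothetical minimal model added to the framework.
# Export it from the networks package (e.g. in networks/__init__.py) so it can
# be found by name via the configuration.
import torch.nn as nn


class TinySegNet(nn.Module):
    """Toy encoder plus 1x1 head, used only to illustrate the plug-in pattern."""

    def __init__(self, in_channels=3, out_channels=1, channels=None):
        super().__init__()
        channels = channels or [16, 32]
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, channels[0], 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels[0], channels[1], 3, padding=1), nn.ReLU(),
        )
        self.head = nn.Conv2d(channels[1], out_channels, 1)

    def forward(self, x):
        return self.head(self.encoder(x))
```

With the class exported, setting `model_type: TinySegNet` in hyper.yaml would let the reflection mechanism pick it up.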
The framework includes a BrainMRIDataset class, corresponding to a dataset on Kaggle, as an example implementation. You can create custom dataset classes for your own needs.
- Loads image-mask pairs from organized directory structure
- Applies separate transforms for images and masks
- Supports common medical image formats
- `model_type`: Model architecture
- `image_size`: Input dimensions `[height, width]`
- `input_channels`: Number of input channels
- `output_channels`: Number of output classes
- `channels`: A list of channel sizes (channel progression for the encoder/decoder)

- `batch_size`: Training batch size
- `epochs`: Number of training epochs
- `init_lr`: Initial learning rate
- `dataset`: Dataset class name
- `data_dir`: Path to dataset
- `val_split`: Validation split ratio
- Weights & Biases: Set
wandb_project,wandb_entity, andwandb_modefor experiment tracking - Early Stopping: Configure
patiencefor convergence early stopping - Learning Rate Scheduling: Use
lr_rampdown_epochsfor learning rate decay
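These keys live alongside the other entries in hyper.yaml. The values in the snippet below are only illustrative; the key names are the ones listed above.

```yaml
# Optional keys in hyper.yaml (values are illustrative)
wandb_project: cv-lab-experiments
wandb_entity: your-username
wandb_mode: online         # commonly "online", "offline", or "disabled"
patience: 10               # epochs without improvement before early stopping
lr_rampdown_epochs: 80     # epochs over which the learning rate decays
```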
The notebooks/ directory contains tools for analysis and visualization:
- `main.ipynb`: Model validation and output visualization
- `plot_csv.ipynb`: Generate plots from training metrics
- `export.ipynb`: Convert models to ONNX format (see the sketch below)
- `model_sanity.ipynb`: Architecture testing and debugging
- `wandb.ipynb`: Weights & Biases integration testing
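Model export is handled by export.ipynb; for reference, a minimal ONNX export with PyTorch looks like the sketch below. The model class, checkpoint path, input size, and file names are assumptions for illustration rather than the notebook's exact code.

```python
# Sketch of exporting a trained model to ONNX (see export.ipynb for the repo's workflow).
# Model class, checkpoint path, and input size are illustrative assumptions.
import torch

from networks import UNet  # assumed export name

model = UNet(in_channels=3, out_channels=1, channels=[64, 128, 256, 512])
model.load_state_dict(torch.load("runs/<model_name>_<timestamp>/best.pth", map_location="cpu"))
model.eval()

dummy_input = torch.randn(1, 3, 256, 256)   # batch of one RGB image at the configured size
torch.onnx.export(
    model,
    dummy_input,
    "model.onnx",
    input_names=["input"],
    output_names=["mask"],
    dynamic_axes={"input": {0: "batch"}, "mask": {0: "batch"}},  # allow variable batch size
    opset_version=17,
)
```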
This project is licensed under the MIT License - see the LICENSE file for details.
