A deep learning framework for semantic segmentation of resistance spot welding (RSW) nuggets in industrial imaging applications. This project implements and compares multiple state-of-the-art segmentation architectures for automated quality control in spot welding processes.
This repository contains a complete machine learning pipeline for resistance spot welding segmentation, developed as part of research into automated quality control systems. The framework supports multiple deep learning architectures and provides tools for training, evaluation, and deployment in high-performance computing environments.
The framework implements five segmentation architectures:
- UNet: Classic encoder-decoder architecture with skip connections
- UNet++: Enhanced UNet with nested dense skip pathways
- DeepLabV3+: Atrous spatial pyramid pooling with encoder-decoder structure
- SegFormer: Vision transformer-based segmentation model
- MiniUNet: Lightweight variant optimized for resource-constrained environments
Each model supports configurable encoders (ResNet18, ResNet34, etc.) and can be trained with or without pretrained weights.
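As a rough illustration (not the project's actual model_utils.py code), such a model could be constructed through segmentation-models-pytorch roughly as below; the input-channel and class counts are assumptions:

```python
# Minimal sketch (not the project's actual construction code): building a
# UNet with a configurable, optionally pretrained ResNet encoder.
import segmentation_models_pytorch as smp

model = smp.Unet(
    encoder_name="resnet34",     # e.g. "resnet18" or "resnet34"
    encoder_weights="imagenet",  # None trains the encoder from scratch
    in_channels=1,               # assumption: single-channel RSW images
    classes=1,                   # assumption: binary nugget/background mask
)
```

The repository is laid out as follows: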
rsw/
├── src/ # Source code
│ ├── main.py # Main training script
│ ├── train.py # Training loop implementation
│ ├── test.py # Model evaluation
│ ├── PrepareData.py # Data loading and preprocessing
│ ├── DataProcessing.py # Data transformation pipeline
│ ├── BatchGenerator.py # Batch generation with augmentation
│ ├── model_utils.py # Model utilities and I/O
│ ├── metrics.py # Evaluation metrics
│ ├── DatasetCreation.py # Dataset preparation tools
│ ├── configs/ # Model configurations
│ │ ├── *.json # Standard training configs
│ │ └── prelim/ # Quick training configs
│ ├── scripts/ # HPC job submission scripts
│ │ ├── submit_*.sh # Model-specific submission scripts
│ └── run_job.sh # Unified SLURM job script
├── data/ # Data directory (excluded from git)
├── models/ # Trained model checkpoints (excluded from git)
├── cluster_printouts/ # HPC job logs (excluded from git)
├── pyproject.toml # Modern Python project configuration
├── requirements.txt # Legacy requirements (HPC-specific)
└── README.md # This file
The framework expects preprocessed data in numpy array format with the following structure:
data/
├── all_images_data_EUR.npy # European dataset
└── all_images_data_lab.npy # Laboratory dataset
Each data entry contains:
- [0] Original RSW image
- [1] Mask-overlayed image
- [2] Image filename
- [3] Mask filename
- [4] Dataset path
- [5] Polygon annotation points
- [6] Class ID
- [7] Annotation state
- [8] Image dimensions (height, width)
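For orientation, one entry could be inspected roughly as sketched below; this assumes the arrays are stored as pickled object arrays, which this README does not guarantee:

```python
# Hypothetical sketch: peeking at one entry of the preprocessed dataset.
import numpy as np

data = np.load("data/all_images_data_EUR.npy", allow_pickle=True)
entry = data[0]

image, overlay = entry[0], entry[1]          # raw RSW image and mask overlay
image_name, mask_name = entry[2], entry[3]   # file names
polygon, class_id = entry[5], entry[6]       # annotation points and class
height, width = entry[8]                     # original image dimensions
print(image_name, class_id, (height, width))
```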
The preprocessing pipeline includes:
- Image resizing to configurable dimensions (default: 512x512)
- Normalization and standardization
- Data augmentation (rotation, translation, noise injection)
- Train/validation/test splitting with stratification
- Cross-validation fold generation
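A simplified view of these steps might look like the sketch below; the real pipeline lives in PrepareData.py and DataProcessing.py, and the function here is purely illustrative:

```python
# Illustrative sketch only: resize, standardize, and split; the project's
# pipeline also handles augmentation and cross-validation fold generation.
import numpy as np
import cv2
from sklearn.model_selection import train_test_split

def preprocess(images, masks, labels, resize_dim=512, test_pct=0.2, seed=42):
    X = np.stack([cv2.resize(img, (resize_dim, resize_dim)) for img in images])
    y = np.stack([cv2.resize(m, (resize_dim, resize_dim),
                             interpolation=cv2.INTER_NEAREST) for m in masks])
    X = (X - X.mean()) / (X.std() + 1e-8)   # standardization
    # stratified split keeps class proportions comparable across splits
    return train_test_split(X, y, test_size=test_pct, stratify=labels,
                            random_state=seed)
```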
Model behavior is controlled through JSON configuration files in src/configs/. Each model has separate configurations for standard and preliminary (quick) training:
{
"model_type": "unet",
"dataset": "EUR",
"train_batch_size": 64,
"num_epochs": 100,
"lr": 1e-4,
"model_enc": "resnet34",
"resize_dim": 512,
"augment_factor": 3
}
Key configuration parameters:
- Model settings: Architecture, encoder, pretrained weights
- Training parameters: Batch size, epochs, learning rate, optimizer settings
- Data processing: Augmentation, resize dimensions, noise parameters
- Evaluation: Binary threshold, test percentage, cross-validation folds
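Loading such a configuration is straightforward; the file name below is an example only:

```python
# Sketch: reading a training configuration (file name is illustrative).
import json

with open("src/configs/unet.json") as f:
    cfg = json.load(f)

batch_size = cfg["train_batch_size"]   # 64 in the example above
num_epochs = cfg["num_epochs"]         # 100
learning_rate = cfg["lr"]              # 1e-4
encoder = cfg["model_enc"]             # "resnet34"
```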
# Clone the repository
git clone <repository-url>
cd rsw-segmentation
# Install with pip (recommended)
pip install -e .
# Or install dependencies only
pip install -r requirements.txt
# Install with development dependencies
pip install -e ".[dev]"
# Install additional HPC dependencies if needed
pip install -e ".[hpc]"Set the Comet ML API key for experiment tracking:
export COMET_API_KEY="your-comet-api-key"Basic model training:
cd src
python main.py <model_name> <preliminary> <use_pretrained>
Parameters:
- model_name: unet, unetplusplus, deeplabv3plus, segformer, miniunet
- preliminary: 0 (full training) or 1 (quick training)
- use_pretrained: 0 (from scratch) or 1 (pretrained encoder)
Examples:
# Train UNet with pretrained encoder
python main.py unet 0 1
# Quick training of SegFormer from scratch
python main.py segformer 1 0
The framework includes optimized scripts for SLURM-based HPC clusters:
cd src
# Submit single job
sbatch run_job.sh unet 0 1
# Use convenience scripts
./scripts/submit_unet.sh 0 1
./scripts/submit_segformer.sh 1 1
The unified job script (run_job.sh) provides:
- Automatic parameter validation
- Resource allocation (GPU, memory, time limits)
- Module loading for HPC environments
- Comprehensive logging and error handling
Training progress is tracked through:
- Comet ML: Experiment logging with metrics, hyperparameters, and artifacts
- Local logs: Console output and SLURM job files
- Model checkpoints: Automatic saving during training
- Metrics files: JSON-formatted evaluation results
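For reference, Comet ML logging inside a training loop typically looks like the sketch below; the project name is a placeholder and the metric value stands in for a real validation result:

```python
# Sketch of Comet ML experiment tracking (project name is a placeholder).
import os
from comet_ml import Experiment

experiment = Experiment(
    api_key=os.environ["COMET_API_KEY"],
    project_name="rsw-segmentation",   # placeholder
)
experiment.log_parameters({"model_type": "unet", "lr": 1e-4, "num_epochs": 100})

for epoch in range(100):
    val_iou = 0.0  # computed from the validation set in the real loop
    experiment.log_metric("val_iou", val_iou, epoch=epoch)
```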
Evaluation metrics include:
- Intersection over Union (IoU)
- Dice coefficient
- Pixel accuracy
- Binary classification metrics
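IoU and Dice for binary masks reduce to simple overlap counts, roughly as below; the project's own implementations live in metrics.py:

```python
# Sketch of binary IoU and Dice; metrics.py holds the project's versions.
import torch

def iou_dice(pred, target, threshold=0.5, eps=1e-7):
    pred_bin = (pred > threshold).float()
    intersection = (pred_bin * target).sum()
    union = pred_bin.sum() + target.sum() - intersection
    iou = (intersection + eps) / (union + eps)
    dice = (2 * intersection + eps) / (pred_bin.sum() + target.sum() + eps)
    return iou.item(), dice.item()
```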
Trained models are saved in models/ with the following structure:
- Model state dictionaries (.pt files)
- Training metrics (JSON format)
- Model visualizations (computational graphs)
- Experiment configurations
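Such checkpoints are typically written and restored with PyTorch's state-dict API; the path and model below are examples, not the project's actual file names:

```python
# Sketch: saving and restoring a checkpoint (path and model are examples).
import torch
import segmentation_models_pytorch as smp

model = smp.Unet(encoder_name="resnet34", encoder_weights=None, classes=1)
torch.save(model.state_dict(), "models/unet_resnet34.pt")

# Restoring requires rebuilding the same architecture first.
model.load_state_dict(torch.load("models/unet_resnet34.pt", map_location="cpu"))
model.eval()
```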
Test predictions are organized by model and timestamp:
preds_<ModelName>_<Timestamp>/
├── predictions/ # Segmentation masks
├── overlays/ # Prediction overlays
├── metrics.json # Evaluation metrics
└── config.json # Model configuration
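A predicted mask and its overlay could be written into this layout roughly as follows; the folder name, blending weights, and stand-in arrays are illustrative only:

```python
# Sketch: writing one predicted mask and overlay into the layout above.
import os
import cv2
import numpy as np

out_dir = "preds_UNet_20250101-120000"   # example timestamped folder
os.makedirs(os.path.join(out_dir, "predictions"), exist_ok=True)
os.makedirs(os.path.join(out_dir, "overlays"), exist_ok=True)

image = np.zeros((512, 512), dtype=np.uint8)   # stand-in for the RSW image
mask = np.zeros((512, 512), dtype=np.uint8)    # stand-in for the prediction
mask[200:300, 200:300] = 255

# blend the mask (green channel) over the grayscale image
overlay = cv2.addWeighted(
    cv2.cvtColor(image, cv2.COLOR_GRAY2BGR), 0.7,
    cv2.merge([np.zeros_like(mask), mask, np.zeros_like(mask)]), 0.3, 0)

cv2.imwrite(os.path.join(out_dir, "predictions", "sample.png"), mask)
cv2.imwrite(os.path.join(out_dir, "overlays", "sample.png"), overlay)
```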
All models are implemented using:
- segmentation-models-pytorch for UNet variants and DeepLabV3+
- transformers library for SegFormer
- timm for backbone encoders
- PyTorch as the core framework
Training features include:
- Mixed precision training support for faster training
- Learning rate scheduling with ReduceLROnPlateau
- Early stopping based on validation metrics
- Data augmentation with configurable parameters
- Cross-validation support for robust evaluation
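A condensed training step combining these pieces might look like the sketch below; it is not the project's train.py, and the loss function, data loaders, and patience value are placeholders:

```python
# Condensed sketch of mixed-precision training with ReduceLROnPlateau and
# early stopping; train.py implements the project's actual loop.
import torch
from torch.cuda.amp import autocast, GradScaler

def train(model, train_loader, val_loader, num_epochs=100, lr=1e-4, patience=10):
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = model.to(device)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="min")
    scaler = GradScaler(enabled=device == "cuda")
    criterion = torch.nn.BCEWithLogitsLoss()
    best_val, stale = float("inf"), 0

    for epoch in range(num_epochs):
        model.train()
        for images, masks in train_loader:
            images, masks = images.to(device), masks.to(device)
            optimizer.zero_grad()
            with autocast(enabled=device == "cuda"):
                loss = criterion(model(images), masks)
            scaler.scale(loss).backward()
            scaler.step(optimizer)
            scaler.update()

        model.eval()
        with torch.no_grad():
            val_loss = sum(criterion(model(x.to(device)), y.to(device)).item()
                           for x, y in val_loader) / max(len(val_loader), 1)
        scheduler.step(val_loss)             # reduce LR when validation stalls
        if val_loss < best_val:
            best_val, stale = val_loss, 0
        else:
            stale += 1
            if stale >= patience:            # early stopping
                break
```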
The codebase provides several customization options to accommodate different hardware configurations while keeping results reproducible on similar datasets. Note that the data used in this work is proprietary and cannot be disclosed.
Performance-related features include:
- Efficient data loading with PyTorch DataLoader
- GPU memory optimization for large batch sizes
- Checkpoint saving for training interruption recovery
- Configurable batch sizes adaptable to different hardware specifications
This project is licensed under the MIT License. See the LICENSE file for details.
If you use this code in your research, please cite:
@inproceedings{behnenComparisonDeepLearning2025,
title = {Comparison of Deep Learning Architectures in Ultrasonic Quality Control for Resistance Spot Welding Using Semantic Segmentation},
booktitle = {Production at the Leading Edge of Technology},
author = {Behnen, Lukas and Baacke, Hendrik and Keuper, Alexander and Riesener, Michael and Schuh, Günther and Scott, Ryan and Chertov, Andriy M. and Maev, Roman Gr.},
date = {2025},
pages = {301--308},
publisher = {Springer Nature Switzerland},
doi = {10.1007/978-3-031-86893-1_33},
url = {https://link.springer.com/10.1007/978-3-031-86893-1_33},
}
For questions, please contact hendrik.baacke@rwth-aachen.de.