Production-ready Object Detection system using YOLOv8 for detecting retail products in images.
## Table of Contents

- Problem Statement
- Solution Overview
- Project Structure
- System Architecture
- Installation
- Dataset Preparation
- Training
- Evaluation
- Inference
- Model Export
- Docker Deployment
- Results
- Future Improvements
## Problem Statement

Retail and e-commerce companies need automated systems to:
- **Inventory Management**: Automatically count and categorize products
- **Quality Assurance**: Verify product placement and arrangement
- **Smart Checkout**: Enable cashier-less checkout systems
- **Shelf Analytics**: Monitor product availability in real time

Develop an object detection system that can:

- Detect 3 product categories based on size (small, medium, large)
- Achieve high accuracy (mAP50 > 0.8) suitable for production
- Run inference in real time (< 50 ms per image)
- Export to optimized formats (ONNX, TorchScript) for deployment
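The real-time requirement is worth verifying empirically. A minimal timing harness (pure Python; the workload lambda is a stand-in — in practice you would time `model(image)` calls on your target hardware):

```python
import time

def measure_latency(fn, *args, warmup=3, runs=20):
    """Return average wall-clock time per call in milliseconds."""
    for _ in range(warmup):          # warm up caches / GPU kernels first
        fn(*args)
    start = time.perf_counter()
    for _ in range(runs):
        fn(*args)
    return (time.perf_counter() - start) / runs * 1000.0

# Stand-in workload; replace with e.g. lambda: model(image) in practice.
latency_ms = measure_latency(lambda: sum(range(10_000)))
print(f"avg latency: {latency_ms:.2f} ms")
```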
## Solution Overview

- **Model**: YOLOv8n (nano), the fastest variant with a strong speed/accuracy tradeoff
- **Architecture**: CSPDarknet backbone + PANet neck + decoupled head
- **Training**: Transfer learning from COCO-pretrained weights
- **Optimization**: Adam optimizer with a cosine annealing LR schedule
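For reference, cosine annealing sweeps the learning rate from its initial value down to a floor over the run. A minimal sketch of the schedule's arithmetic (not Ultralytics' internal implementation; `lr0` and the epoch count mirror the defaults used here):

```python
import math

def cosine_lr(epoch, total_epochs, lr0=0.001, lr_final=0.0):
    """Cosine-annealed learning rate for a given epoch (0-indexed)."""
    cos = 0.5 * (1 + math.cos(math.pi * epoch / max(total_epochs - 1, 1)))
    return lr_final + (lr0 - lr_final) * cos

# First, middle, and last epoch of a 20-epoch run:
print([round(cosine_lr(e, 20), 5) for e in (0, 10, 19)])
# → [0.001, 0.00046, 0.0]
```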
### Key Features

- ✅ Modular, production-ready codebase
- ✅ Comprehensive dataset pipeline (download → convert → split)
- ✅ Configurable hyperparameters via CLI
- ✅ Visualization tools (confusion matrix, PR curves)
- ✅ Multi-format export (ONNX, TorchScript)
- ✅ Docker containerization for deployment
## Project Structure

```
product-detection-yolov8/
├── README.md                 # This documentation
├── requirements.txt          # Python dependencies
├── Dockerfile                # Container configuration
│
├── src/                      # Source code
│   ├── __init__.py           # Package initialization
│   ├── utils.py              # Utility functions
│   ├── download_dataset.py   # Dataset downloader
│   ├── convert_to_yolo.py    # Annotation converter
│   ├── split_dataset.py      # Train/val splitter
│   ├── train.py              # Training pipeline
│   ├── evaluate.py           # Evaluation & visualization
│   └── inference.py          # Inference engine
│
├── data/                     # Dataset directory (generated)
│   ├── raw/                  # Raw downloaded data
│   ├── converted/            # YOLO format data
│   ├── train/                # Training set
│   │   ├── images/
│   │   └── labels/
│   ├── val/                  # Validation set
│   │   ├── images/
│   │   └── labels/
│   └── data.yaml             # Dataset configuration
│
├── runs/                     # Training runs (generated)
│   └── train_YYYYMMDD_HHMMSS/
│       ├── weights/
│       │   ├── best.pt
│       │   └── last.pt
│       └── results.csv
│
├── output/                   # Evaluation & inference results (generated)
│   ├── confusion_matrix.png
│   ├── pr_curve.png
│   └── predictions/
│
└── logs/                     # Log files (generated)
    └── yolov8_YYYYMMDD.log
```
## System Architecture

```
┌──────────────────────────────────────────────────────────┐
│                      DATA PIPELINE                       │
├──────────────────────────────────────────────────────────┤
│                                                          │
│  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐  │
│  │ HuggingFace  │   │   Convert    │   │    Split     │  │
│  │   Dataset    │──▶│   to YOLO    │──▶│  Train/Val   │  │
│  │   Download   │   │    Format    │   │   (80/20)    │  │
│  └──────────────┘   └──────────────┘   └──────────────┘  │
│                                                          │
└──────────────────────────────────────────────────────────┘
                             │
                             ▼
┌──────────────────────────────────────────────────────────┐
│                    TRAINING PIPELINE                     │
├──────────────────────────────────────────────────────────┤
│                                                          │
│  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐  │
│  │   YOLOv8n    │   │    Train     │   │    Export    │  │
│  │  Pretrained  │──▶│  20 epochs   │──▶│   ONNX/TS    │  │
│  │   Weights    │   │   Batch 16   │   │              │  │
│  └──────────────┘   └──────────────┘   └──────────────┘  │
│                                                          │
│  Hyperparameters:                                        │
│   • Optimizer: Adam (lr=0.001)                           │
│   • Image Size: 640x640                                  │
│   • Early Stopping: patience=5                           │
│   • LR Scheduler: Cosine Annealing                       │
│                                                          │
└──────────────────────────────────────────────────────────┘
                             │
                             ▼
┌──────────────────────────────────────────────────────────┐
│                    INFERENCE PIPELINE                    │
├──────────────────────────────────────────────────────────┤
│                                                          │
│  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐  │
│  │    Input     │   │    YOLOv8    │   │    Output    │  │
│  │    Image     │──▶│  Inference   │──▶│  Detections  │  │
│  │              │   │   best.pt    │   │    + Viz     │  │
│  └──────────────┘   └──────────────┘   └──────────────┘  │
│                                                          │
│  Supported Sources:                                      │
│   • Single image (jpg, png)                              │
│   • Folder of images                                     │
│   • Webcam (real-time)                                   │
│                                                          │
└──────────────────────────────────────────────────────────┘
```
## Installation

### Prerequisites

- Python 3.10+
- CUDA 11.8+ (for GPU training)
- 8GB+ RAM
- 10GB+ disk space
### Setup

```bash
# Clone repository
cd product-detection-yolov8

# Create virtual environment
python -m venv venv
source venv/bin/activate        # Linux/macOS
# or: .\venv\Scripts\activate   # Windows

# Install dependencies
pip install -r requirements.txt
```

### Verify Installation

```bash
python -c "from ultralytics import YOLO; print('YOLOv8 OK')"
python -c "import torch; print(f'PyTorch OK, CUDA: {torch.cuda.is_available()}')"
```

## Dataset Preparation

### Option 1: Download a Public Dataset

```bash
python src/download_dataset.py --dataset sku110k --output data/raw
```

Supported datasets:

- `sku110k` - Retail store shelf images
- `grocery` - Grocery store products
- `cppe-5` - Personal protective equipment
### Option 2: Use Your Own Data

Place your data in YOLO format:

```
data/custom/
├── images/
│   ├── img001.jpg
│   └── img002.jpg
└── labels/
    ├── img001.txt   # class_id x_center y_center width height (normalized)
    └── img002.txt
```
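Each label line stores a class id plus a box center and size, all normalized by the image dimensions. A small helper showing the conversion from pixel coordinates (values are illustrative):

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space box to a normalized YOLO label line."""
    x_c = (x_min + x_max) / 2 / img_w
    y_c = (y_min + y_max) / 2 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return f"{class_id} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}"

# A 100x200 px box with top-left corner (50, 100) in a 640x640 image:
print(to_yolo_line(0, 50, 100, 150, 300, 640, 640))
# → 0 0.156250 0.312500 0.156250 0.312500
```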
### Convert Annotations

```bash
# From COCO format
python src/convert_to_yolo.py --input data/raw --format coco

# From Pascal VOC format
python src/convert_to_yolo.py --input data/raw --format voc
```

### Split the Dataset

```bash
python src/split_dataset.py --input data/converted --train-ratio 0.8
```

## Training

### Quick Start

```bash
python src/train.py
```

### Custom Training

```bash
python src/train.py \
    --epochs 50 \
    --batch 32 \
    --imgsz 640 \
    --model yolov8s.pt \
    --optimizer Adam \
    --lr 0.001 \
    --patience 10
```

### Training Parameters

| Parameter | Default | Description |
|---|---|---|
| `--model` | `yolov8n.pt` | Pretrained model (n/s/m/l/x) |
| `--epochs` | 20 | Training epochs |
| `--batch` | 16 | Batch size |
| `--imgsz` | 640 | Image size |
| `--optimizer` | Adam | Optimizer (Adam/SGD/AdamW) |
| `--lr` | 0.001 | Initial learning rate |
| `--patience` | 5 | Early stopping patience |
| `--export` | None | Export format (onnx/torchscript) |
### Resume Training

```bash
python src/train.py --resume runs/train_latest/weights/last.pt
```

## Evaluation

### Quick Evaluation

```bash
python src/evaluate.py --model best.pt
```

### Full Evaluation

```bash
python src/evaluate.py \
    --model runs/train_20240101_120000/weights/best.pt \
    --data data/data.yaml \
    --conf 0.25 \
    --iou 0.45
```

### Evaluation Outputs

| File | Description |
|---|---|
| `confusion_matrix.png` | Class-wise prediction accuracy |
| `pr_curve.png` | Precision-Recall curve |
| `f1_curve.png` | F1 score vs confidence |
| `class_performance.png` | Per-class metrics bar chart |
| `metrics.json` | Numeric metrics (mAP, precision, recall) |
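For reference, the precision, recall, and F1 values behind these plots derive from true/false-positive counts at each confidence threshold. A minimal sketch of the arithmetic (counts are illustrative):

```python
def prf1(tp, fp, fn):
    """Precision, recall, and F1 from detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# e.g. 88 correct detections, 12 false alarms, 18 missed objects:
p, r, f1 = prf1(88, 12, 18)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```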
## Inference

### Single Image

```bash
python src/inference.py --source image.jpg
```

### Folder of Images

```bash
python src/inference.py --source ./images/ --conf 0.5
```

### Webcam (Real-Time)

```bash
python src/inference.py --source webcam --show
```

### Inference Parameters

| Parameter | Default | Description |
|---|---|---|
| `--source` | (required) | Image path, folder, or `webcam` |
| `--model` | `best.pt` | Trained model path |
| `--conf` | 0.25 | Confidence threshold |
| `--iou` | 0.45 | NMS IoU threshold |
| `--imgsz` | 640 | Inference image size |
| `--show` | False | Display results in window |
| `--no-save` | False | Don't save output images |
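The `--iou` flag controls non-maximum suppression: detection keeps the highest-confidence box and drops any remaining box that overlaps it beyond the threshold. A minimal greedy-NMS sketch (pure Python; real inference uses the library's optimized implementation):

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def nms(boxes, scores, iou_thres=0.45):
    """Greedy NMS; returns indices of kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_thres]
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # overlapping box 1 is suppressed → [0, 2]
```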
## Model Export

### During Training

```bash
python src/train.py --export onnx,torchscript
```

### After Training

```python
from ultralytics import YOLO

model = YOLO('best.pt')

# ONNX (for inference servers)
model.export(format='onnx', dynamic=True, simplify=True)

# TorchScript (for C++ inference)
model.export(format='torchscript')

# TensorRT (NVIDIA GPUs)
model.export(format='engine', device=0)

# CoreML (Apple devices)
model.export(format='coreml')
```

### Supported Formats

| Format | File | Use Case |
|---|---|---|
| ONNX | `best.onnx` | Cross-platform deployment |
| TorchScript | `best.torchscript` | PyTorch production |
| TensorRT | `best.engine` | NVIDIA GPU optimization |
| CoreML | `best.mlmodel` | iOS/macOS apps |
| OpenVINO | `best_openvino/` | Intel hardware |
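Unless exported with `dynamic=True`, these models expect a fixed input size (640x640 here), so deployment code typically letterboxes the image: scale while preserving aspect ratio, then pad the remainder. A sketch of just the geometry (no image library; values are illustrative):

```python
def letterbox_params(w, h, size=640):
    """Scale factor, resized dims, and padding to fit (w, h) into size x size."""
    scale = min(size / w, size / h)          # preserve aspect ratio
    new_w, new_h = round(w * scale), round(h * scale)
    pad_x, pad_y = (size - new_w) // 2, (size - new_h) // 2
    return scale, new_w, new_h, pad_x, pad_y

# A 1280x720 frame fits as 640x360 with 140 px of padding top and bottom:
print(letterbox_params(1280, 720))  # → (0.5, 640, 360, 0, 140)
```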
## Docker Deployment

### Build Images

```bash
# CPU version
docker build -t yolov8-product-detection --target production .

# GPU version (requires NVIDIA Container Toolkit)
docker build -t yolov8-product-detection:gpu --target gpu .
```

### Run Training

```bash
docker run --gpus all \
    -v $(pwd)/data:/app/data \
    -v $(pwd)/runs:/app/runs \
    yolov8-product-detection:gpu \
    python src/train.py --epochs 20
```

### Run Inference

```bash
docker run --gpus all \
    -v $(pwd)/test_images:/app/images \
    -v $(pwd)/results:/app/output \
    yolov8-product-detection:gpu \
    python src/inference.py --source /app/images
```

### Docker Compose

```yaml
version: '3.8'
services:
  training:
    build:
      context: .
      target: gpu
    volumes:
      - ./data:/app/data
      - ./runs:/app/runs
    command: python src/train.py
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```

## Results

### Overall Metrics

| Metric | Value |
|---|---|
| mAP50 | 0.85 |
| mAP50-95 | 0.62 |
| Precision | 0.88 |
| Recall | 0.82 |
| Inference Time | 12ms (RTX 3080) |
### Per-Class Performance

| Class | Precision | Recall | mAP50 |
|---|---|---|---|
| product_small | 0.82 | 0.78 | 0.80 |
| product_medium | 0.90 | 0.85 | 0.88 |
| product_large | 0.92 | 0.83 | 0.87 |
### Training Curves

The training process generates loss curves and metric plots in the `runs/` directory:

- `results.png` - Training & validation loss curves
- `confusion_matrix.png` - Validation confusion matrix
- `P_curve.png` - Precision curve
- `R_curve.png` - Recall curve
- `PR_curve.png` - Precision-Recall curve
## Future Improvements

- Add data augmentation pipeline (Albumentations)
- Implement hyperparameter optimization (Optuna)
- Add model ensemble support
- Create REST API endpoint (FastAPI)
- Implement real-time video processing
- Add tracking (ByteTrack, BoT-SORT)
- Support for instance segmentation
- Mobile deployment (TFLite, CoreML)
- Active learning pipeline
- Multi-camera support
- Edge deployment (Jetson, Raspberry Pi)
- Integration with inventory management systems
## License

This project is licensed under the MIT License - see the LICENSE file for details.
## Contact

For questions or support, please open an issue on GitHub.

Happy Detecting! 🎯