EdgeLPD

This repository contains a TensorFlow 2.x pipeline for training an ultra-light license plate detector on the CCPD2019 dataset. The project includes data preparation scripts, configurable training/evaluation modules, and dockerized tooling so you can reproduce our experiments or extend them to new deployment targets (e.g., MCUs).

---

✨ Highlights

  • Patch-based training tuned for edge deployments, following the workflow described in docs/data_and_aug.md.
  • Loss design: Binary Focal Loss (extreme FG/BG imbalance) + Masked CIoU (positives-only regression) for stable patch-based training. See docs/loss.md for details.
  • Multiple neck architectures (fpn, sharedneck, s2_neck) and optional ULSAM attention blocks.
  • End-to-end Docker support with GPU passthrough and a one-click script for building/running training containers.
  • Quantization-aware (QAT) and post-training (PTQ) flows bundled into the main training script.
  • MCU‑friendly ops, memory‑aware model sizes, and reference latency measurements on ESP32‑S3.
  • Evaluation tools for both desktop inference (lpd/test.py) and embedded benchmarks.
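The loss design above pairs a classification term with a positives-only regression term. As a minimal NumPy sketch of the focal-loss idea (the α/γ defaults below are illustrative, not the repository's tuned values; the actual formulation is in docs/loss.md):

```python
import numpy as np

def binary_focal_loss(y_true, p, alpha=0.25, gamma=2.0, eps=1e-7):
    """Per-element binary focal loss.

    Down-weights easy examples by (1 - p_t)^gamma so the rare
    foreground cells dominate the gradient under extreme FG/BG
    imbalance. alpha/gamma here are illustrative defaults.
    """
    p = np.clip(p, eps, 1.0 - eps)
    p_t = np.where(y_true == 1, p, 1.0 - p)            # prob. of the true class
    alpha_t = np.where(y_true == 1, alpha, 1.0 - alpha)
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)

def masked_mean(per_cell_loss, positive_mask):
    """Masked reduction: only positive cells contribute to the box loss."""
    pos = positive_mask.astype(bool)
    return per_cell_loss[pos].mean() if pos.any() else 0.0
```

With gamma=0 and alpha=0.5 the focal term reduces to half the ordinary binary cross-entropy, which is a handy sanity check.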

📦 Repository Layout

├── docker-compose.yml          # Docker stack for training with GPU passthrough
├── Dockerfile                  # TensorFlow 2.12 (GPU) base image + project deps
├── lpd/                        # Core Python package (configs, models, trainers, utils, …)
├── scripts/
│   ├── train_in_docker.sh      # One-click helper to build & launch container training
│   ├── prep_tfrecords.sh       # Resize CCPD images and create TFRecord splits
│   └── misc/tensorboard.md     # Optional TensorBoard setup instructions
├── dataset/                    # Expected location for CCPD2019 + generated TFRecords
├── models/                     # Saved `.h5` checkpoints (created at runtime)
├── logs/                       # TensorBoard logs (created at runtime)
├── results/                    # Visualizations & evaluation artifacts (created at runtime)
└── README.md

⚙️ Prerequisites

Containers (Recommended)

  • Docker 20.10+ with the NVIDIA Container Toolkit for GPU training.
  • Docker Compose v2 (docker compose) or v1 (docker-compose).

Native Execution (Optional)

  • Python ≥ 3.10 (3.11 tested).
  • pip install -r requirements.txt
  • NVIDIA drivers + CUDA-capable GPU for best performance (fallback to CPU works but is slow).

📚 Dataset Preparation

  1. Download the CCPD2019 dataset and unpack it under dataset/CCPD2019/.

  2. Generate resized crops and TFRecords:

    scripts/prep_tfrecords.sh

    This produces:

    • dataset/CCPD_resized/ – canonicalized images for patch sampling.
    • dataset/CCPD_2019_tfrecords/ – the TFRecords used by the training pipeline (dataset_stats.json is expected here).

For a deeper look at the sampling strategy and augmentations, read docs/data_and_aug.md.


🧠 Training

1. One-click Docker workflow

```bash
scripts/train_in_docker.sh --help
scripts/train_in_docker.sh --neck sharedneck --qat
```

What the script does:

  1. Validates Docker/Compose availability and the TFRecords directory.
  2. Optionally rebuilds the image defined in docker-compose.yml.
  3. Launches the edge-lpd container with host UID/GID mapping.
  4. Runs python lpd/train.py inside the container with the options you provide (--neck, --ulsam, --qat, --force-train, plus any extra arguments).

Training artifacts are written to the mounted models/, logs/, cache/, and results/ folders in your workspace.

2. Native (outside Docker)

```bash
python -m venv venv
source venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
python lpd/train.py --neck sharedneck --ulsam
```

Environment variables such as CUDA_VISIBLE_DEVICES can be set to control GPU visibility. The script automatically enables memory growth on detected GPUs.
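The memory-growth behavior mentioned above corresponds to TensorFlow's standard device-configuration API; an equivalent standalone snippet (a sketch of what the training script does, not its exact code):

```python
import tensorflow as tf

# Enable memory growth so TensorFlow allocates GPU memory on demand
# instead of reserving the whole device up front.
for gpu in tf.config.list_physical_devices("GPU"):
    tf.config.experimental.set_memory_growth(gpu, True)
```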


🧪 Evaluation & Visualization

  • Desktop evaluation of a saved Keras model:

    python lpd/test.py --model_path models/my_detector_best.h5 --conf_thresh 0.5 --iou_thresh 0.5
  • TensorBoard (optional):

    tensorboard --logdir=./logs --port=6006 --bind_all
  • MCU profiling / embedded evaluation: leverage the MCU-OD-Profiler for on-device benchmarks.

Generated visualizations and feature maps are saved under results/ during training runs.
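The TP/FP matching behind the `--iou_thresh` flag reduces to plain box IoU. A minimal sketch for axis-aligned boxes given as `(x1, y1, x2, y2)`:

```python
def box_iou(a, b):
    """Intersection-over-Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

A detection counts as a true positive when its IoU with a ground-truth box meets the threshold (0.5 above) and its confidence clears `--conf_thresh`.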


📊 Results (ESP32‑S3 setting)

Detection Results on ESP32‑S3 (conf=0.5, IoU=0.5; regression diagnostics on TPs only)

| Model | TP | FP | FN | TN | Precision | Recall | F1-Score | Accuracy | AP@0.50 | Mean IoU (TPs) | ROC-AUC (cls) | PR-AUC (cls) | Mean CIoU | Norm Center Err | Norm Width Err | Norm Height Err |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| sharedneck-standard | 32833 | 2290 | 67 | 310 | 0.9348 | 0.9980 | 0.9653 | 0.9336 | 0.9096 | 0.8270 | 0.3584 | 0.9099 | 0.8259 | 0.0134 | 0.0662 | 0.1070 |
| sharedneck-ulsam | 32897 | 2234 | 62 | 308 | 0.9364 | 0.9981 | 0.9663 | 0.9353 | 0.9002 | 0.8275 | 0.3279 | 0.9004 | 0.8264 | 0.0132 | 0.0644 | 0.1107 |
| fpn-standard | 32928 | 2162 | 99 | 312 | 0.9384 | 0.9970 | 0.9668 | 0.9363 | 0.9062 | 0.8267 | 0.3331 | 0.9064 | 0.8257 | 0.0133 | 0.0630 | 0.1074 |
| fpn-ulsam | 32868 | 2171 | 140 | 322 | 0.9380 | 0.9958 | 0.9660 | 0.9349 | 0.8995 | 0.8255 | 0.3185 | 0.8998 | 0.8244 | 0.0131 | 0.0575 | 0.1192 |

Model Complexity & MCU Latency

| Model | Params | FLOPs | TFLite Size | Inference Latency (ESP32-S3) |
|---|---|---|---|---|
| fpn | 137,557 | 0.0350 GFLOPs | 269.10 KB | 215 ms |
| fpn + ulsam | 140,009 | 0.0359 GFLOPs | 420.39 KB | 300 ms |
| sharedneck | 113,525 | 0.0217 GFLOPs | 240.52 KB | 190 ms |
| sharedneck + ulsam | 113,993 | 0.0224 GFLOPs | 263.57 KB | 195 ms |

Interpretation: sharedneck reduces params/FLOPs and improves MCU latency without sacrificing detection quality. Adding ULSAM slightly increases size and latency while keeping overall metrics competitive; use it if you need extra robustness under challenging lighting or cluttered backgrounds.
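The precision, recall, and F1 columns in the detection table follow directly from the raw TP/FP/FN/TN counts; as a quick sanity check, reproducing the sharedneck-standard row:

```python
def detection_metrics(tp, fp, fn, tn):
    """Precision, recall, F1, and accuracy from raw detection counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f1, accuracy
```

Plugging in the sharedneck-standard counts (TP=32833, FP=2290, FN=67, TN=310) recovers the tabulated 0.9348 / 0.9980 / 0.9653 / 0.9336.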


🔧 Configuration

Core training hyperparameters live in lpd/configs/default_config.py. Notable settings include:

  • IMAGE_HEIGHT, IMAGE_WIDTH – patch size (96×96) used for training.
  • AUGMENTATION_PROBS_CONFIG – scenario-aware augmentation schedules.
  • TFRECORD_PATH, MODEL_PATH, RESULTS_PATH, LOG_DIR – runtime directories (point to /app/... in-container).

Adjust these values or provide alternative config modules as needed for your experiments.


🤝 Contributing

PRs and issues are welcome! Please describe your environment, dataset protocol, and steps to reproduce.


🪪 License

This project is licensed under the Apache‑2.0 License (see LICENSE).
