
Enable Model Quantization for Performance Gains #21

@SurajSanap

Description


Files Affected:

  • fire_detection.py
  • gear_detection.py
  • r_zone.py

📌 Current Model Info

  • Models are running in FP32 precision (default).
  • Framework: ultralytics.YOLO.
  • No quantization or hardware-specific optimization applied.

🐞 Problem

  • High memory consumption during inference.
  • Slow performance on edge devices (Raspberry Pi, Jetson Nano, low-end CPU).
  • Wasted compute, since lower-precision inference can typically maintain comparable accuracy.

✅ Steps to Reproduce

  1. Run any detection script on CPU-only hardware.
  2. Observe memory usage and low FPS performance.
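For step 2, the FPS side of the measurement could be captured with a small helper like the sketch below. `measure_fps`, `infer`, and `frames` are hypothetical names, not existing project code; in this repo `infer` would wrap a YOLO predict call from one of the affected scripts and `frames` would be video frames.

```python
import time

def measure_fps(infer, frames, warmup=3):
    """Rough frames-per-second benchmark for any inference callable.

    A few warm-up calls are excluded from timing so that model loading
    and first-run overhead do not skew the result.
    """
    for frame in frames[:warmup]:
        infer(frame)  # warm-up, not timed
    start = time.perf_counter()
    for frame in frames:
        infer(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed
```

Running this once with the FP32 model and once with a quantized model on the same frame set would give a like-for-like comparison.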

💡 Suggested Improvement

  • Allow contributors to integrate quantized versions of YOLO:

    • FP16 (half precision)
    • INT8 (8-bit quantization)
  • Add model export scripts to TensorRT / ONNX Runtime / TFLite for optimized inference.

  • Provide a config flag like `--quantized true` to toggle quantized inference.
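The export suggestion could start from a small mapping of deployment targets to export arguments, sketched below. The target names (`"onnx-fp16"`, etc.) are made up for this sketch; the format strings (`"onnx"`, `"engine"`, `"tflite"`) and the `half`/`int8` switches are the ones ultralytics documents for `YOLO.export()`.

```python
def export_kwargs(target: str) -> dict:
    """Map a deployment target to ultralytics `YOLO.export()` arguments."""
    table = {
        "onnx-fp16": {"format": "onnx", "half": True},      # ONNX Runtime, FP16
        "tensorrt-fp16": {"format": "engine", "half": True},  # TensorRT engine, FP16
        "tflite-int8": {"format": "tflite", "int8": True},   # TFLite, INT8
    }
    try:
        return dict(table[target])
    except KeyError:
        raise ValueError(f"unknown export target: {target!r}") from None

# Example wiring (requires ultralytics and weights; the model name is
# illustrative, not a file in this repo):
# from ultralytics import YOLO
# YOLO("yolov8n.pt").export(**export_kwargs("onnx-fp16"))
```

Note that INT8 export generally needs a small calibration dataset to keep accuracy, which a real export script would have to provide.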
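The `--quantized true` flag could be parsed as below; this is a minimal sketch assuming the scripts keep using ultralytics, and `str2bool`, `inference_kwargs`, and `build_parser` are hypothetical helper names, not existing project code.

```python
import argparse

def str2bool(value: str) -> bool:
    """Parse 'true'/'false' style values, so `--quantized true` works."""
    if value.lower() in ("true", "1", "yes"):
        return True
    if value.lower() in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected true/false, got {value!r}")

def inference_kwargs(quantized: bool) -> dict:
    """Extra keyword arguments for the predict call.

    ultralytics accepts `half=True` to run FP16 inference on supported
    devices; with `half` unset it stays at the current FP32 default.
    """
    return {"half": True} if quantized else {}

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Detection with optional quantized inference")
    parser.add_argument("--quantized", type=str2bool, default=False,
                        help="toggle quantized inference (e.g. --quantized true)")
    return parser

# Example wiring (requires ultralytics; the weights name is illustrative):
# args = build_parser().parse_args()
# from ultralytics import YOLO
# model = YOLO("fire_detection.pt")
# model.predict(source=0, **inference_kwargs(args.quantized))
```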

📊 Expected Outcome

  • Up to 2–4× faster inference on supported devices.
  • Reduced memory footprint.
  • Better deployment support for edge hardware.

Metadata

Assignees

No one assigned

    Labels

    Level3 (GSSoC Level 3), enhancement (New feature or request), gssoc25 (GirlScript Summer of Code)
