Enable Model Quantization for Performance Gains #21
Open
Labels: Level3 (Gssoc Level 3), enhancement (New feature or request), gssoc25 (GirlScript Summer of Code)
Description
Files Affected:
- fire_detection.py
- gear_detection.py
- r_zone.py
📌 Current Model Info
- Models run in FP32 precision (the default).
- Framework: ultralytics.YOLO.
- No quantization or hardware-specific optimization is applied.
🐞 Problem
- High memory consumption during inference.
- Slow performance on edge devices (Raspberry Pi, Jetson Nano, low-end CPU).
- Wasted compute, since lower precision can maintain comparable accuracy.
✅ Steps to Reproduce
- Run any detection script on CPU-only hardware.
- Observe the high memory usage and low FPS.
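The observation step can be made repeatable with a small benchmark helper. This is a hedged sketch using only the standard library; `run_inference` is a hypothetical stand-in for one frame of any affected script (e.g. a `model.predict(frame)` call in fire_detection.py), and `tracemalloc` only tracks Python-heap allocations, not native tensor buffers, so treat the memory figure as a lower bound.

```python
import time
import tracemalloc

def benchmark(run_inference, n_frames=50):
    """Return (fps, peak_mb) for n_frames calls of run_inference()."""
    tracemalloc.start()
    start = time.perf_counter()
    for _ in range(n_frames):
        run_inference()
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return n_frames / elapsed, peak / (1024 * 1024)

# Dummy workload standing in for a real per-frame detection call:
fps, peak_mb = benchmark(lambda: sum(i * i for i in range(10_000)))
print(f"{fps:.1f} FPS, peak {peak_mb:.2f} MiB")
```

Running the same helper before and after quantization gives a like-for-like FPS comparison on the target hardware.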
💡 Suggested Improvement
- Allow contributors to integrate quantized versions of YOLO:
  - FP16 (half precision)
  - INT8 (8-bit quantization)
- Add model export scripts to TensorRT / ONNX Runtime / TFLite for optimized inference.
- Provide a config flag like `--quantized true` to toggle quantized inference.
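The suggestions above could be wired together roughly as follows. This is a proposal sketch, not existing repository code: the flag is modeled as a boolean switch (`--quantized`) rather than `--quantized true`, and the `half=` / `export(format=...)` keywords follow the Ultralytics API. The model-facing calls are left as comments so the sketch stays self-contained.

```python
import argparse

def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Detection with optional quantized inference")
    parser.add_argument("--quantized", action="store_true",
                        help="run inference in FP16 (half precision) where supported")
    return parser.parse_args(argv)

args = parse_args(["--quantized"])
print(args.quantized)

# Downstream usage (assumes the ultralytics package is installed):
#   from ultralytics import YOLO
#   model = YOLO("yolov8n.pt")
#   model.predict(source, half=args.quantized)        # FP16 inference on CUDA
#   model.export(format="onnx", half=args.quantized)  # export for ONNX Runtime
```

INT8 would need an extra calibration step (e.g. during TensorRT or TFLite export), so FP16 via a single flag is the lower-effort first increment.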
📊 Expected Outcome
- Up to 2–4× faster inference on supported devices.
- Reduced memory footprint.
- Better deployment support for edge hardware.