Real‑time defect detection using Python + OpenCV with an ML anomaly model (IsolationForest). Designed as an Application Engineering demo for roles like Zebra’s Machine Vision team.
- Defects covered: scratches (line defects), color anomalies (tint/stain), and misalignment (template shift).
- Classical CV + ML hybrid: OpenCV for fast defect cues, IsolationForest for anomaly scoring on features.
- Real‑time overlay: Boxes/masks with per‑defect confidence and overall OK/DEFECT status.
- Synthetic dataset generator so you can train/evaluate right away.
- TensorFlow Lite (optional template): starter code to train a tiny CNN and export to TFLite when TF is available.
# 1) Create venv (recommended)
python -m venv .venv && . .venv/bin/activate # (Windows) .\.venv\Scripts\activate
# 2) Install deps
pip install -r requirements.txt
# 3) Generate a synthetic dataset (OK vs DEFECT images)
python src/generate_synthetic_dataset.py --n_ok 200 --n_defect 200 --out data/train
# 4) Train IsolationForest on features
python src/train_isoforest.py --data data/train --model models/isoforest.pkl
# 5) Run real‑time detector (webcam 0) or pass a video path
python src/realtime_detect.py --model models/isoforest.pkl --source 0
# or
python src/realtime_detect.py --model models/isoforest.pkl --source path/to/video.mp4

Tip: You can drop real product images into data/train/ok and data/train/defect to fine‑tune the model. The generator is just to bootstrap.
MachineVision-DefectDetection/
├─ requirements.txt
├─ README.md
├─ src/
│ ├─ generate_synthetic_dataset.py # creates OK/DEFECT images
│ ├─ feature_utils.py # feature extraction from images/frames
│ ├─ classical_detectors.py # scratch/color/misalignment cues
│ ├─ train_isoforest.py # trains IsolationForest, saves model
│ ├─ realtime_detect.py # real-time overlay via OpenCV
│ └─ tflite_stub_train.py # OPTIONAL: tiny CNN -> TFLite (template)
├─ data/
│ ├─ train/
│ │ ├─ ok/
│ │ └─ defect/
│ └─ test/
└─ models/
- Classical cues are computed per frame: edge density, line count, hue variance, template‑match offset, etc.
- These cues form a feature vector. We fit an IsolationForest on OK samples to learn normal behavior.
- At runtime, we compute features and get an anomaly score. We also draw overlays from classical detectors for explainability.
- The final label is based on ML score + rule thresholds for high‑precision detection.
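The steps above can be sketched with scikit-learn as follows. This is a minimal example under stated assumptions, not the logic in train_isoforest.py or realtime_detect.py: the feature order, the stand-in training data, both thresholds, and the choice to combine the ML flag and the rule flag with a logical OR are all made up for illustration.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Stand-in feature vectors for OK samples, in the assumed order
# [edge_density, line_count, hue_variance, offset_px].
rng = np.random.default_rng(0)
ok_features = rng.normal(loc=[0.05, 2.0, 10.0, 0.5],
                         scale=[0.01, 1.0, 2.0, 0.3],
                         size=(200, 4))

# Fit on OK samples only so the forest learns what "normal" looks like.
model = IsolationForest(n_estimators=100, contamination="auto",
                        random_state=0)
model.fit(ok_features)


def classify(features, score_threshold=0.0, max_offset_px=5.0):
    """Combine the anomaly score with one rule threshold (assumed logic)."""
    # decision_function: negative values are more anomalous.
    score = float(model.decision_function(features.reshape(1, -1))[0])
    ml_flag = score < score_threshold
    rule_flag = features[3] > max_offset_px  # hard limit on template offset
    label = "DEFECT" if (ml_flag or rule_flag) else "OK"
    return label, score
```

Requiring both flags (AND instead of OR) would trade recall for precision; which combination the detector actually uses should be checked in the source.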
- Use --eval on train_isoforest.py to print precision/recall on a labeled hold‑out set.
- Confusion matrix is stored as models/metrics.json.
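For reference, precision and recall follow directly from the confusion-matrix counts. The sketch below shows one way to compute and store them using only the standard library; the function names and the JSON layout are assumptions, and the real metrics.json may be structured differently.

```python
import json
from pathlib import Path


def evaluate(y_true, y_pred, positive="defect"):
    """Confusion-matrix counts plus precision/recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "confusion_matrix": {"tp": tp, "fp": fp, "fn": fn, "tn": tn},
        "precision": precision,
        "recall": recall,
    }


def save_metrics(metrics, path):
    """Write metrics as JSON, creating the parent folder if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(metrics, indent=2))
```

Calling `save_metrics(evaluate(labels, predictions), "models/metrics.json")` would write the file to the location mentioned above.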
If TensorFlow is available for your platform, install it and run the optional template:
pip install tensorflow==2.13.0
python src/tflite_stub_train.py --data data/train --out models/tflite

This trains a small CNN and exports a TFLite model. Use it in realtime_detect.py (see comments) by enabling the TFLite path.
• Built a real‑time defect detection system using OpenCV and an ML anomaly model to identify manufacturing flaws (scratches, color anomalies, misalignments) from camera feeds with 90%+ accuracy on synthetic and sample datasets.