A real-time computer vision prototype that detects a user’s hand/fingertips using classical image-processing techniques — without MediaPipe, OpenPose, or pose-detection APIs — and triggers visual warnings when the hand approaches a virtual object on the screen.
This POC is built as part of the Arvyax Internship Assignment, demonstrating:
- Real-time hand/fingertip tracking
- Virtual on-screen boundary
- Distance-based SAFE / WARNING / DANGER states
- Clear visual feedback overlay
- CPU-only execution at ≥ 8 FPS
Watch the demo here:
Implemented using classical CV methods:
- HSV-based skin color segmentation
- Contours
- Convex Hull
- Convexity Defects
- Fingertip clustering
This pipeline detects multiple fingertips, so even a single finger approaching the boundary triggers a warning.
A rectangular region on the right-hand side of the screen acts as a danger zone.
Its color changes based on distance classification.
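One way to place such a box and pick its color is sketched below. The size fractions, the margin, the BGR color values, and the names `danger_zone_rect` and `STATE_COLORS` are all assumptions made for illustration; the project may use different values.

```python
# Assumed BGR colors for each state (OpenCV uses BGR channel order).
STATE_COLORS = {
    "SAFE": (0, 255, 0),       # green
    "WARNING": (0, 255, 255),  # yellow
    "DANGER": (0, 0, 255),     # red
}


def danger_zone_rect(frame_w, frame_h, width_frac=0.25, height_frac=0.5, margin=20):
    """Return (x1, y1, x2, y2) for a box near the right edge, vertically centered."""
    box_w = int(frame_w * width_frac)
    box_h = int(frame_h * height_frac)
    x1 = frame_w - box_w - margin
    y1 = (frame_h - box_h) // 2
    return (x1, y1, x1 + box_w, y1 + box_h)
```

Deriving the rectangle from the frame size keeps the zone on the right-hand side regardless of camera resolution.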
For every frame:
- Detect fingertips
- Compute distance from each fingertip to the virtual box
- Choose the minimum distance
- Classify into:
| State | Description |
|---|---|
| SAFE | Hand is comfortably far |
| WARNING | Hand approaching box |
| DANGER | Fingertip touching/very close to box |
Threshold values are configurable.
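The per-frame logic above can be illustrated with a short sketch. The threshold values (`danger_px`, `warning_px`) and the function names are hypothetical, and the sketch assumes that a frame with no detected fingertips counts as SAFE.

```python
import math


def point_to_rect_distance(point, rect):
    """Euclidean distance from a point to an axis-aligned box (0 if inside)."""
    px, py = point
    x1, y1, x2, y2 = rect
    dx = max(x1 - px, 0, px - x2)
    dy = max(y1 - py, 0, py - y2)
    return math.hypot(dx, dy)


def classify(distance, danger_px=20, warning_px=100):
    """Map a pixel distance to SAFE / WARNING / DANGER using assumed thresholds."""
    if distance <= danger_px:
        return "DANGER"
    if distance <= warning_px:
        return "WARNING"
    return "SAFE"


def frame_state(fingertips, rect, danger_px=20, warning_px=100):
    """Classify a frame by its nearest fingertip; no fingertips means SAFE."""
    if not fingertips:
        return "SAFE"
    nearest = min(point_to_rect_distance(p, rect) for p in fingertips)
    return classify(nearest, danger_px, warning_px)
```

Taking the minimum over all fingertips means the state is driven by whichever finger is closest to the box.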
- Tracking dots on detected fingertip locations
- Bounding box around the virtual object
- Top-left state indicator
- Center-screen flashing "DANGER DANGER" message
- FPS counter
The prototype achieves:
- 8–20 FPS on CPU only
- No GPU or heavy ML models used
- Lightweight OpenCV + NumPy pipeline
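To check a frame-rate figure like this on your own machine, a small counter such as the one below could be called once per processed frame. The class name `FpsCounter` and the smoothing factor are hypothetical; this is a sketch of one common approach (exponential smoothing of instantaneous FPS), not necessarily how the project measures it.

```python
import time


class FpsCounter:
    """Exponentially smoothed frames-per-second estimate."""

    def __init__(self, smoothing=0.9):
        self.smoothing = smoothing
        self.last = None
        self.fps = 0.0

    def tick(self, now=None):
        """Record one frame and return the current FPS estimate."""
        now = time.perf_counter() if now is None else now
        if self.last is not None and now > self.last:
            instant = 1.0 / (now - self.last)
            if self.fps == 0.0:
                # First measurable interval: use it directly.
                self.fps = instant
            else:
                self.fps = self.smoothing * self.fps + (1 - self.smoothing) * instant
        self.last = now
        return self.fps
```

The optional `now` argument exists only so timestamps can be injected for testing; in a capture loop, call `tick()` with no arguments after each frame.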
- Python 3.x
- OpenCV
- NumPy
- Classical CV algorithms (no deep learning APIs)
git clone https://github.com/rohankharche34/proxi-track
cd proxi-track
pip install -r requirements.txt
python main.py