A real-time computer vision project that lets simple runner-style games be controlled with face and hand gestures through a webcam, using lightweight vision techniques as an alternative to traditional controllers.
Most games assume physical input devices such as keyboards, mice, or touch screens. This project experiments with camera-based interaction as an alternative input modality, focusing on simplicity and real-time responsiveness rather than heavy machine learning models.
- Face-based steering using head tilt
- Gesture-based actions (jump, roll, accelerate)
- Works with existing PC games that use keyboard input
- Runs in real time (~30 FPS) on a standard webcam
- No custom model training required
- Python
- OpenCV
- MediaPipe (Face Mesh & Hands)
- pynput (virtual keyboard input)
Each video frame from the webcam is processed in real time to extract facial and hand landmarks. These landmarks are mapped to predefined gestures, which are then converted into keyboard inputs compatible with runner-style games such as Subway Surfers (emulator) or similar PC games.
The system is designed to be modular so gesture logic and input mappings can be extended or replaced easily.
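As an illustration of the per-frame gesture logic, the head-tilt check can be reduced to the angle of the line between two eye landmarks. This is a minimal sketch, not the project's actual code: the landmark points are assumed to be normalized `(x, y)` coordinates like those MediaPipe Face Mesh produces, and the 15° threshold is an illustrative value that would need per-setup tuning.

```python
import math

TILT_THRESHOLD_DEG = 15.0  # illustrative; real values depend on camera setup

def head_tilt(left_eye, right_eye, threshold_deg=TILT_THRESHOLD_DEG):
    """Classify head tilt as 'left', 'right', or 'center'.

    Points are (x, y) in normalized image coordinates, with y growing
    downward as in image space, so a positive eye-line angle means the
    right side of the face is lower (a tilt to the right).
    """
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    angle = math.degrees(math.atan2(dy, dx))
    if angle > threshold_deg:
        return "right"
    if angle < -threshold_deg:
        return "left"
    return "center"
```

A discrete result like this is what gets translated into a left/right key press, which keeps the detection step independent of the input-injection step.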
| Gesture | Action |
|---|---|
| Head tilt left/right | Move left/right |
| Mouth open | Jump |
| Open palm | Accelerate |
| Closed fist | Roll / Brake |
| Two fingers up | Jump |
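The table above amounts to a lookup from gesture label to key press, which is also what makes the mappings easy to swap out. A minimal sketch of that dispatch step, where the gesture labels and key names are illustrative and the `press` callback stands in for something like pynput's `keyboard.Controller().press`:

```python
# Illustrative gesture-to-key mapping; labels and keys are assumptions,
# not the project's actual identifiers.
GESTURE_KEYS = {
    "tilt_left": "left",
    "tilt_right": "right",
    "mouth_open": "up",     # jump
    "open_palm": "w",       # accelerate
    "closed_fist": "down",  # roll / brake
    "two_fingers": "up",    # jump (alternate gesture)
}

def dispatch(gesture, press):
    """Look up the gesture and call press(key); ignore unknown gestures."""
    key = GESTURE_KEYS.get(gesture)
    if key is not None:
        press(key)
    return key
```

Injecting `press` as a parameter keeps the mapping testable without a real keyboard controller, and custom mappings reduce to editing the dictionary.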
- Gesture thresholds are sensitive to camera position and lighting
- Designed for simple, discrete actions (not precision control)
- No adaptive learning in the current version
- Calibration step for individual users
- Gesture smoothing and noise reduction
- Custom gesture mapping
- Mobile implementation using on-device vision APIs
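The gesture smoothing listed above could be as simple as a majority vote over the last few frames, which suppresses single-frame misclassifications. A sketch under that assumption; the window size of 5 is an illustrative choice, not a tuned value:

```python
from collections import Counter, deque

class GestureSmoother:
    """Suppress single-frame flicker by majority vote over a sliding window."""

    def __init__(self, window=5):
        self.history = deque(maxlen=window)

    def update(self, gesture):
        """Record this frame's label; return it only once it dominates."""
        self.history.append(gesture)
        label, count = Counter(self.history).most_common(1)[0]
        if count > len(self.history) // 2:
            return label
        return None  # no stable majority yet
```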
```bash
pip install -r requirements.txt
python main.py
```