🖐️ HandFree - Hybrid Air-Writing Digit Recognition System

A webcam-based air-writing system that recognizes handwritten digits (0-9) using MediaPipe hand tracking, Kalman filtering, and a hybrid recognizer that combines structural rules with a CNN.

✨ Features

🎯 Core Capabilities

Real-time hand tracking using MediaPipe Hands
Strict Index-finger air-writing - Write digits with only your index finger raised
Kalman filtering - Smooth, jitter-free cursor tracking
5-Layer Intent Filter - Eliminates noise, accidental dots, and wild swipes
Hybrid recognition system - Combines structural validation with CNN
Continuous stroke rendering - No gaps or broken lines

🔧 Advanced Features

Strict finger detection - Pen ON only when index finger is raised (all others closed)
Fist-to-stop - Make a fist to immediately stop writing
Temporal gating - 50ms buffer to prevent micro-jitter
Spatial filtering - 3px minimum movement for smooth curves
Velocity guards - Rejects movements faster than 1500 px/s
Directional filtering - Allows up to 150° angle changes for natural writing
Structural validation - Loop count, aspect ratio, stroke direction analysis
Confidence-based rejection - Only accepts predictions above 60% confidence
Visual debugging - Auto-saves processed images for inspection
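
The strict index-only rule can be illustrated with a small sketch. It assumes landmarks arrive as 21 (x, y) pairs in the MediaPipe Hands ordering, with the hand held upright so that a raised fingertip has a smaller y than its PIP joint. The function names and the thumb heuristic are illustrative assumptions, not the actual code in hand_tracker.py.

```python
import math

# Landmark indices follow the MediaPipe Hands 21-point hand model.
THUMB_IP, THUMB_TIP = 3, 4
INDEX_PIP, INDEX_TIP = 6, 8
PINKY_MCP = 17
OTHER_FINGERS = [(12, 10), (16, 14), (20, 18)]  # (tip, pip): middle, ring, pinky


def finger_raised(lm, tip, pip):
    """Image y grows downward, so a raised tip has a smaller y than its PIP joint."""
    return lm[tip][1] < lm[pip][1]


def thumb_extended(lm):
    """Heuristic (assumption): a folded thumb tip lies closer to the pinky base
    than the thumb's IP joint does."""
    tip_dist = math.dist(lm[THUMB_TIP], lm[PINKY_MCP])
    ip_dist = math.dist(lm[THUMB_IP], lm[PINKY_MCP])
    return tip_dist > ip_dist


def pen_is_on(lm):
    """Pen ON only when the index finger is raised and every other finger is closed."""
    if not finger_raised(lm, INDEX_TIP, INDEX_PIP):
        return False  # covers the fist case as well
    if any(finger_raised(lm, tip, pip) for tip, pip in OTHER_FINGERS):
        return False
    return not thumb_extended(lm)
```

A fist fails the first check, so it turns the pen OFF without needing a separate rule in this sketch.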

🚀 Quick Start

Prerequisites

Python 3.8+
Webcam (30 FPS recommended)
Windows/Linux/macOS

Installation

# Clone the repository, then enter the project directory
cd HandFree

# Create virtual environment
python -m venv venv

# Activate virtual environment
# Windows:
.\venv\Scripts\activate
# Linux/Mac:
source venv/bin/activate

# Install dependencies
pip install opencv-python mediapipe tensorflow numpy scikit-learn joblib

Running the Application

python main.py

🎮 How to Use

Gesture Controls

Raise ONLY index finger → Pen ON (start writing)
- Thumb, middle, ring, pinky MUST be closed
Raise any other finger → Pen OFF (stop writing)
Make a fist → Force Pen OFF (explicit stop)
Wait 0.8 seconds with pen OFF → Auto-recognize digit
Press 'c' → Clear canvas
Press 'q' → Quit
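
The 0.8-second auto-recognize rule amounts to a small timer that restarts whenever the pen turns ON. The sketch below is one possible way to express it; the class name and the has_ink flag (which prevents recognition of an empty canvas) are assumptions, not taken from main.py.

```python
class AutoRecognizeTimer:
    """Signals recognition once the pen has stayed OFF for idle_seconds."""

    def __init__(self, idle_seconds=0.8):
        self.idle_seconds = idle_seconds
        self.pen_off_since = None  # time at which the pen last went OFF
        self.has_ink = False       # True once something was written

    def update(self, pen_on, now):
        """Call once per frame; returns True exactly once per finished digit."""
        if pen_on:
            self.has_ink = True
            self.pen_off_since = None
            return False
        if not self.has_ink:
            return False  # nothing written yet, nothing to recognize
        if self.pen_off_since is None:
            self.pen_off_since = now
            return False
        if now - self.pen_off_since >= self.idle_seconds:
            self.has_ink = False
            self.pen_off_since = None
            return True
        return False
```

In a frame loop, `now` would typically come from `time.monotonic()` so that the timer is unaffected by system clock changes.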

Writing Tips

Keep only index finger raised - Any other finger will stop writing
Write larger digits (use most of the screen)
Write at natural speed - System handles smoothing automatically
The Kalman filter smooths jitter - No need to hold perfectly still
Follow canonical digit forms (see guide below)

📐 Canonical Digit Forms

The system recognizes standard digit shapes:

Digit	Key Features	Loop Count
0	Circular/oval loop	1
1	Tall vertical line	0
2	Curved top, diagonal, flat base	0
3	S-curve	0
4	Vertical + diagonal + optional bar	0
5	Top horizontal bar + bottom curve	0
6	Loop at bottom	1
7	Top horizontal + diagonal	0
8	Two stacked loops	2
9	Loop at top + tail	1

See Canonical_Digit_Recognition_Guide_0_to_9.md for detailed specifications.
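
As a minimal illustration of how the loop counts in the table narrow down candidates, the lookup below maps an observed loop count to the digits that remain possible. The dictionary copies the table above; the function name is hypothetical, and the real recognizer also uses other structural features.

```python
# Loop counts copied from the canonical digit table.
LOOP_COUNT = {0: 1, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 1, 7: 0, 8: 2, 9: 1}


def candidates_for_loops(observed_loops):
    """Return the digits whose canonical form has the observed number of loops."""
    return [d for d, loops in sorted(LOOP_COUNT.items()) if loops == observed_loops]


print(candidates_for_loops(1))  # [0, 6, 9]
print(candidates_for_loops(2))  # [8]
```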

🏗️ Architecture

System Components

┌─────────────────────────────────────────────────────┐
│                   Main Application                   │
│                     (main.py)                        │
└─────────────────────────────────────────────────────┘
                          │
        ┌─────────────────┼─────────────────┬──────────┐
        ▼                 ▼                 ▼          ▼
┌──────────────┐  ┌──────────────┐  ┌──────────────┐ ┌──────────────┐
│ Hand Tracker │  │  Stabilizer  │  │Curve Renderer│ │   Hybrid     │
│(MediaPipe)   │  │(Kalman +     │  │(Adaptive     │ │ Recognizer   │
│              │  │ Intent)      │  │Interpolation)│ │(Rules + CNN) │
└──────────────┘  └──────────────┘  └──────────────┘ └──────────────┘
                          │                 │                 │
                          ▼                 ▼                 ▼
                  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐
                  │   Feature    │  │    Stroke    │  │   Gesture    │
                  │  Extractor   │  │   Analysis   │  │  Detection   │
                  └──────────────┘  └──────────────┘  └──────────────┘

Key Modules

main.py - Main application loop, UI, and recognition pipeline
hand_tracker.py - MediaPipe hand tracking and strict finger state detection
stabilizer.py - Kalman filtering and 5-layer intent noise filtering
curve_renderer.py - Adaptive interpolation for curve-aware rendering
hybrid_recognizer.py - Hybrid recognition (structural rules + CNN)
feature_extractor.py - Structural feature extraction (loops, aspect ratio, etc.)
stroke_analysis.py - Stroke statistics and validation
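
To give a feel for what gap-free rendering involves, here is a sketch that fills the space between two tracked points with evenly spaced intermediate points. It uses plain linear interpolation as a stand-in; the actual curve_renderer.py is described as curve-aware and may work differently. The function name and the max_gap_px parameter are assumptions.

```python
import math


def interpolate_segment(p0, p1, max_gap_px=1.0):
    """Return points from p0 to p1 (inclusive), spaced at most max_gap_px apart."""
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]
    dist = math.hypot(dx, dy)
    # Longer segments get more intermediate points, so fast strokes stay continuous.
    steps = max(1, math.ceil(dist / max_gap_px))
    return [(p0[0] + dx * i / steps, p0[1] + dy * i / steps) for i in range(steps + 1)]
```

Drawing a short line between each consecutive pair of returned points leaves no visible gap, even when the hand moves far between two camera frames.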

🧠 Recognition Pipeline

Hybrid Air-Writing Pipeline

Hand Tracking - MediaPipe detects hand and finger states
Gesture Recognition - Strict index-only detection (Pen ON/OFF)
Kalman Filtering - Smooth, jitter-free position tracking
Intent Filtering - 5-layer noise suppression:
- Temporal: 50ms minimum stroke duration
- Spatial: 3px minimum movement
- Velocity: 20-1500 px/s range
- Directional: ≤150° angle changes
- Structural: Minimum stroke length
Stroke Rendering - Continuous line drawing with adaptive interpolation
Feature Extraction - Analyze loops, aspect ratio, stroke direction
Candidate Filtering - Eliminate structurally impossible digits
CNN Inference - Restricted to valid candidates only
Confidence Check - Reject predictions below 60% confidence
Rule Verification - Final validation against digit definitions
Accept or Reject - Conservative decision making
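
The candidate filtering, restricted inference, and confidence check steps can be sketched together. This version assumes the classifier returns ten class probabilities and that confidence is the raw probability of the best allowed digit, without renormalizing over the candidates. Both points are assumptions about hybrid_recognizer.py, and the function name is hypothetical.

```python
def restricted_decision(probs, candidates, threshold=0.60):
    """Pick the most probable digit among structurally valid candidates.

    probs: sequence of 10 class probabilities (index = digit).
    candidates: digits that survived structural filtering.
    Returns (digit or None, confidence); None means the input is rejected.
    """
    if not candidates:
        return None, 0.0
    best = max(candidates, key=lambda d: probs[d])
    confidence = probs[best]
    if confidence < threshold:
        return None, confidence  # prefer rejection over a low-confidence guess
    return best, confidence
```

Note that a digit with the highest overall probability is still ignored if the structural rules excluded it, which is what makes the decision conservative.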

Stabilization Features

Kalman Filter - Process noise: 0.03, Measurement noise: 8.0
Intent Filters - Multi-layer noise rejection
Continuous Strokes - No gaps between points
Velocity Guards - Automatic spike detection and rejection
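
The effect of the two noise settings is easiest to see in a stripped-down filter. The sketch below runs one scalar Kalman filter per axis with a random-walk (constant-position) model. This is a simplification chosen for brevity; stabilizer.py may use a different state model, such as constant velocity. The class names are hypothetical.

```python
class ScalarKalman:
    """1D Kalman filter with a random-walk (constant-position) model."""

    def __init__(self, process_noise=0.03, measurement_noise=8.0):
        self.q = process_noise      # higher = more responsive
        self.r = measurement_noise  # higher = stronger jitter reduction
        self.x = None               # state estimate
        self.p = 1.0                # estimate variance

    def update(self, z):
        if self.x is None:
            self.x = float(z)       # initialize from the first measurement
            return self.x
        self.p += self.q                    # predict: uncertainty grows
        k = self.p / (self.p + self.r)      # Kalman gain
        self.x += k * (z - self.x)          # correct toward the measurement
        self.p *= 1.0 - k                   # update: uncertainty shrinks
        return self.x


class CursorSmoother:
    """Smooths (x, y) cursor positions with one filter per axis."""

    def __init__(self, process_noise=0.03, measurement_noise=8.0):
        self.fx = ScalarKalman(process_noise, measurement_noise)
        self.fy = ScalarKalman(process_noise, measurement_noise)

    def update(self, x, y):
        return self.fx.update(x), self.fy.update(y)
```

With a measurement noise much larger than the process noise, the gain stays small, so each new sample moves the estimate only slightly. That is the trade-off between smoothness and lag that the two parameters control.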

📊 Configuration

Main Parameters (in `main.py`)

CANVAS_SIZE = 28           # 28x28 to match sklearn model
STROKE_THICKNESS = 6       # Stroke width in pixels (increased for visibility)
CONFIDENCE_THRESHOLD = 0.60 # Minimum confidence to accept

Stabilizer Parameters (in `stabilizer.py`)

# Kalman Filter
PROCESS_NOISE = 0.03       # Responsiveness
MEASUREMENT_NOISE = 8.0    # Jitter reduction

# Intent Filter
MIN_DURATION_MS = 50       # Temporal gate (reduced from 100ms)
MIN_DIST_PX = 3            # Spatial threshold (reduced from 5px)
MAX_VELOCITY = 1500        # px/s (increased from 1000)
MAX_ANGLE_CHANGE = 150     # degrees (increased from 110)
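
The per-point layers of the intent filter (spatial, velocity, directional) can be sketched as a single check. The 20 px/s lower bound comes from the pipeline description above. The function names and return format are assumptions, and the temporal and structural layers are left out because they apply to whole strokes rather than single points.

```python
import math

MIN_DIST_PX = 3
MIN_VELOCITY = 20        # px/s
MAX_VELOCITY = 1500      # px/s
MAX_ANGLE_CHANGE = 150   # degrees


def angle_between(v1, v2):
    """Unsigned angle in degrees between two 2D vectors."""
    n1, n2 = math.hypot(*v1), math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        return 0.0
    cos = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))


def accept_point(prev, curr, dt, prev_dir=None):
    """Return (accepted, reason) for a new point arriving dt seconds after prev."""
    dx, dy = curr[0] - prev[0], curr[1] - prev[1]
    dist = math.hypot(dx, dy)
    if dist < MIN_DIST_PX:
        return False, "spatial"
    if dt <= 0:
        return False, "velocity"
    speed = dist / dt
    if speed < MIN_VELOCITY or speed > MAX_VELOCITY:
        return False, "velocity"
    if prev_dir is not None and angle_between(prev_dir, (dx, dy)) > MAX_ANGLE_CHANGE:
        return False, "directional"
    return True, "ok"
```

Returning the reason alongside the decision keeps rejections interpretable, which fits the console-style debugging shown later in this README.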

Hand Tracking (in `hand_tracker.py`)

DETECTION_CONFIDENCE = 0.8  # Hand detection threshold
TRACKING_CONFIDENCE = 0.8   # Hand tracking threshold

🔍 Debugging

Visual Debugging

The system auto-saves processed images:

Location: the project folder (on the author's machine, d:\Programming\Projects\HandFree\debug_digit_XXX.png)
Size: 280x280 (scaled from 28x28 for visibility)
Format: Binary (white on black)
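
Scaling 28x28 to 280x280 is a 10x nearest-neighbor enlargement, which keeps the image strictly binary. The pure-Python sketch below shows the idea on a nested list; with OpenCV the same result is usually obtained through cv2.resize with interpolation=cv2.INTER_NEAREST. The function name is hypothetical.

```python
def upscale_nearest(grid, factor=10):
    """Repeat every pixel factor times horizontally and vertically."""
    return [
        [value for value in row for _ in range(factor)]
        for row in grid
        for _ in range(factor)
    ]
```

Nearest-neighbor scaling avoids the gray edge pixels that smoother interpolation modes would introduce, so the saved image shows exactly what the recognizer received.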

Console Output

🔄 (640,360)→(680,400) [d:56.6px, θ:35°, pts:57]  ← Curve detected
→ (680,400)→(685,405) [d:7.1px, θ:5°, pts:5]     ← Line detected

============================================================
Recognition #1
============================================================
Structural Features:
  Loop count: 1
  Aspect ratio: 1.85
  Vertical ratio: 0.68
  Diagonal: False
Structural candidates: [0, 6, 9]
Method: hybrid
Recognized: 0 (confidence: 0.882)
Result: ✓ 0 (conf: 0.88, candidates: [0, 6, 9])
============================================================

🐛 Troubleshooting

Common Issues

Issue: Digits not recognized

Solution: Write larger and more slowly, and follow the canonical forms

Issue: Curves appear broken

Solution: Adaptive interpolation in curve_renderer.py should close gaps between points - restart the app if curves still appear broken

Issue: Too many rejections

Solution: The confidence threshold defaults to 60% (CONFIDENCE_THRESHOLD in main.py) - lower it if valid digits are still being rejected

Issue: Wrong digit recognized

Solution: Check saved images - if they look correct, model may need retraining

See recognition_diagnostic.md for detailed troubleshooting.

📚 Documentation

Canonical_Digit_Recognition_Guide_0_to_9.md - Digit specifications
walkthrough.md - Complete system walkthrough
hybrid_test_guide.md - Testing the hybrid system
canonical_updates.md - Recent rule updates
recognition_diagnostic.md - Troubleshooting guide

🎓 Educational Use

This system is designed for teacher-safe operation:

✅ Conservative rejection over incorrect guessing
✅ Interpretable decisions (shows why digits were accepted/rejected)
✅ Follows canonical educational digit forms
✅ Visual feedback for learning

🔬 Technical Details

Structural Features Extracted

Loop count (0, 1, or 2)
Stroke count
Aspect ratio (height/width)
Vertical motion ratio
Horizontal motion ratio
Diagonal presence
Horizontal segment positions (top/middle/bottom)
Loop position (top/middle/bottom)
Curvature patterns (S-curve, C-curve)
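
A few of these features can be computed directly from the ordered stroke points. The sketch below covers aspect ratio and the two motion ratios. It assumes the motion ratios are each axis's share of the total axis-aligned path length, which may differ from the definition used in feature_extractor.py; the function name is hypothetical.

```python
def motion_features(points):
    """Compute simple structural features from an ordered list of (x, y) points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)
    # Aspect ratio is height/width; a perfectly vertical stroke has zero width.
    aspect_ratio = height / width if width > 0 else float("inf")

    total_dx = sum(abs(b[0] - a[0]) for a, b in zip(points, points[1:]))
    total_dy = sum(abs(b[1] - a[1]) for a, b in zip(points, points[1:]))
    total = total_dx + total_dy
    return {
        "aspect_ratio": aspect_ratio,
        "vertical_ratio": total_dy / total if total > 0 else 0.0,
        "horizontal_ratio": total_dx / total if total > 0 else 0.0,
    }
```

A tall vertical line such as the digit 1 yields a vertical ratio close to 1.0, while a 7 splits its motion between the horizontal bar and the diagonal.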

Validation Criteria

Minimum pixel density: 2%
Minimum stroke area: 1% of canvas
Minimum bounding box: 2% of canvas
Minimum perimeter: 30% of canvas dimension
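
Two of these checks, pixel density and bounding-box size, are easy to express on the binary canvas. The sketch below assumes the canvas is a 2D list in which any non-zero value is ink. It omits the stroke-area and perimeter checks because their exact definitions are not given here, and the function name is hypothetical.

```python
def passes_validation(grid, min_density=0.02, min_bbox=0.02):
    """Reject canvases with too little ink or too small a drawn region."""
    rows, cols = len(grid), len(grid[0])
    ink = [(r, c) for r in range(rows) for c in range(cols) if grid[r][c]]
    if not ink:
        return False
    canvas_area = rows * cols
    density = len(ink) / canvas_area

    # Bounding box of all ink pixels, as a fraction of the canvas area.
    r_min, r_max = min(r for r, _ in ink), max(r for r, _ in ink)
    c_min, c_max = min(c for _, c in ink), max(c for _, c in ink)
    bbox_fraction = (r_max - r_min + 1) * (c_max - c_min + 1) / canvas_area

    return density >= min_density and bbox_fraction >= min_bbox
```

On a 28x28 canvas the 2% density threshold corresponds to roughly 16 ink pixels, so a single accidental dot is rejected before it reaches the classifier.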

🚧 Known Limitations

CNN Model: Trained on EMNIST (digits handwritten on paper), so it may not match air-writing style well
Lighting: Requires good lighting for hand tracking
Camera: 30 FPS minimum recommended for smooth curves
Hand Size: Works best with hand filling ~50% of frame

🔮 Future Improvements

Train CNN on air-written digits specifically
Add letter recognition (A-Z)
Multi-digit number recognition
Gesture-based commands (undo, redo)
Save/load written text
Export to text file
