A hand detection CNN model trained on the FreiHAND_pub_v2 dataset with labels

devkala05/Hand_Detection_Model

🖐️ Hand Pose Estimation with PyTorch

This project predicts 21 hand keypoints, each an (x, y) coordinate, from RGB images using a Convolutional Neural Network (CNN) trained on the FreiHAND_pub_v2 dataset. It lays the foundation for real-time gesture-based control systems, starting with an upcoming project, HOLOCONTROL.


Dataset Description

  • Source: FreiHAND dataset
  • Type: RGB images with annotated 2D keypoints
  • Usage: We use the public subset containing ~32,000 images with corresponding joint labels.

  • Images:

    • Total: 32,560
    • Size: Resized to 128x128
    • Format: .jpg
  • Keypoints:

    • 21 keypoints per hand, flattened to 42 values (x1, y1, x2, y2, ..., x21, y21)
    • Normalized in the range [0, 1] using image width and height
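The label format described above can be sketched in a few lines. This is a minimal illustration, not the project's actual preprocessing code: the function names `normalize_keypoints` and `denormalize_keypoints` are hypothetical, and it assumes the keypoints are already available as (x, y) pixel coordinates for an image of known width and height.

```python
def normalize_keypoints(points_px, width, height):
    """Scale (x, y) pixel coordinates to [0, 1] and flatten them.

    points_px: list of 21 (x, y) tuples in pixel units (assumed input format).
    Returns a flat list [x1, y1, x2, y2, ..., x21, y21].
    """
    flat = []
    for x, y in points_px:
        flat.append(x / width)   # x is divided by the image width
        flat.append(y / height)  # y is divided by the image height
    return flat


def denormalize_keypoints(flat, width, height):
    """Invert normalize_keypoints: flat [0, 1] values back to pixel tuples."""
    return [
        (flat[i] * width, flat[i + 1] * height)
        for i in range(0, len(flat), 2)
    ]


# Example: 21 identical points on a 128x128 image.
points = [(64, 32)] * 21
labels = normalize_keypoints(points, width=128, height=128)
```

Here `labels` has 42 entries. The inverse function is useful at inference time, when the model's 42 outputs need to be drawn back onto an image of arbitrary size.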

Model Architecture

Built using PyTorch:

Input: (3, 128, 128)

→ Conv2d(3 → 32) + ReLU + MaxPool2d
→ Conv2d(32 → 64) + ReLU + MaxPool2d
→ Conv2d(64 → 128) + ReLU + MaxPool2d
→ Conv2d(128 → 256) + ReLU + MaxPool2d
→ Flatten
→ Linear(256*8*8 → 512) + ReLU
→ Linear(512 → 42)
  • Output: 42 values representing (x, y) coordinates of 21 keypoints
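The layer listing above maps onto a PyTorch module roughly as follows. This is a sketch under stated assumptions rather than the repository's exact code: the class name `HandKeypointCNN` is hypothetical, and the listing does not give kernel sizes, so the sketch assumes 3x3 convolutions with `padding=1` and 2x2 max pooling. Those choices are consistent with the `256*8*8` flatten size, since four halvings take 128 down to 8.

```python
import torch
from torch import nn


class HandKeypointCNN(nn.Module):
    """Hypothetical class name; channel sizes follow the listing above."""

    def __init__(self, num_keypoints: int = 21):
        super().__init__()

        def block(in_ch: int, out_ch: int) -> nn.Sequential:
            # Assumed: a 3x3 kernel with padding=1 keeps H and W unchanged,
            # so only the 2x2 max pool halves the resolution.
            return nn.Sequential(
                nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
                nn.MaxPool2d(kernel_size=2),
            )

        self.features = nn.Sequential(
            block(3, 32),     # 128x128 -> 64x64
            block(32, 64),    # 64x64  -> 32x32
            block(64, 128),   # 32x32  -> 16x16
            block(128, 256),  # 16x16  -> 8x8
        )
        self.regressor = nn.Sequential(
            nn.Flatten(),
            nn.Linear(256 * 8 * 8, 512),
            nn.ReLU(inplace=True),
            nn.Linear(512, num_keypoints * 2),  # 42 outputs: (x, y) per keypoint
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.regressor(self.features(x))


model = HandKeypointCNN()
dummy_batch = torch.zeros(2, 3, 128, 128)  # two fake RGB images
predictions = model(dummy_batch)           # shape: (2, 42)
```

The final layer has no activation, so outputs are unconstrained real values. Because the targets are normalized to [0, 1], a sigmoid on the output would be one possible variation, but the listing does not include it.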

Training Setup

  • Loss Function: Mean Squared Error (MSE)
  • Optimizer: Adam (learning rate = 1e-4)
  • Batch Size: 32
  • Early Stopping: Patience of 5 epochs
  • Best Model Saved as: best_hand_model.pth
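The settings above could be wired together as in the sketch below. It is an illustration, not the author's training script: the function name `train`, the `max_epochs` cap, and the choice to monitor validation loss for early stopping are assumptions, since the settings list names only the patience value. The data loaders are expected to yield `(images, targets)` batches and would be built with `batch_size=32`.

```python
import torch
from torch import nn


def train(model, train_loader, val_loader, max_epochs=100, patience=5,
          lr=1e-4, ckpt_path="best_hand_model.pth"):
    """Train with MSE + Adam; stop after `patience` epochs without improvement.

    max_epochs and the validation-loss criterion are assumptions.
    Returns the best validation loss seen.
    """
    criterion = nn.MSELoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    best_val = float("inf")
    epochs_without_improvement = 0

    for epoch in range(max_epochs):
        # One pass over the training data.
        model.train()
        for images, targets in train_loader:
            optimizer.zero_grad()
            loss = criterion(model(images), targets)
            loss.backward()
            optimizer.step()

        # Average validation loss, weighted by batch size.
        model.eval()
        total, count = 0.0, 0
        with torch.no_grad():
            for images, targets in val_loader:
                total += criterion(model(images), targets).item() * len(images)
                count += len(images)
        val_loss = total / count

        if val_loss < best_val:
            # New best: reset the counter and checkpoint the weights.
            best_val = val_loss
            epochs_without_improvement = 0
            torch.save(model.state_dict(), ckpt_path)
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # early stopping

    return best_val
```

Saving only `state_dict()` keeps the checkpoint independent of the class definition's module path. To reuse the weights, construct the model again and call `load_state_dict(torch.load("best_hand_model.pth"))`.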
