Skip to content

Kartikdangi1/hand_tracking_simulation

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 

Repository files navigation

Redesigned ROS2 Hand Tracking Node (MediaPipe Hands + DexHand)

Thanks for the clarification. I’ve redesigned and optimized the ROS 2 hand tracking node using MediaPipe and DexHand to ensure:

  • Smooth, real-time hand pose tracking from /dev/video2
  • Elimination of jitter and spikes in the landmark stream
  • Accurate hand position tracking (orientation excluded as requested)
  • Reliable data feeding into joint states for RViz and potential robot actuation

Note: While these changes greatly reduce noise, there are still jitters in the trajectory. A proper mapping (e.g., depth calibration or hand-to-camera mapping) is required to eliminate the remaining jitter entirely.

I’ll get back to you shortly with the updated code structure and implementation details.


Overview of Changes

The original ROS 2 Humble node combined MediaPipe hand tracking with a DexHand URDF for visualization and control. We have redesigned the node for smoother and more robust hand tracking, emphasizing hand position rather than full orientation. Key improvements include:

  • Reduced Jitter and Spikes: Enhanced filtering of landmark data to eliminate sudden jumps, using tuned low-pass filtering (and suggesting Kalman filtering) to smooth the output. The node no longer publishes zeros immediately when tracking is lost, preventing abrupt spikes when the hand momentarily leaves the frame.
  • Position-Only Orientation: We focus on hand and finger positions instead of computing full 3D orientation for each joint. By deriving joint angles directly from landmark positions (e.g. angles between finger bone segments), we avoid unstable quaternion calculations that could mislead the robot model. Each finger’s yaw (lateral spread) and pitch (flexion) angles are computed from landmark geometry, ensuring the /joint_states reflect true finger bending without noisy orientation flips.
  • GPU Acceleration Enabled: If a GPU is available, the MediaPipe hand tracker will utilize it for faster processing. Running the tracking on a GPU can significantly improve frame rate and responsiveness.
  • ROS 2 Integration: The node continues to use cv_bridge to convert images from the v4l2_camera feed (/dev/video2/image_raw) and publishes results compatible with RViz and the DexHand URDF. We publish the hand’s joint angles to /joint_states and provide an optional PoseArray topic (disabled by default) to visualize the hand’s skeletal landmark positions in RViz.

Screenshot from 2025-06-18 02-46-39

Improved Smoothing and Stability

Tracking noise and inference gaps can cause jittery landmark data. The original node applied a One‑Euro filter to each landmark stream. We build on this by tuning the filter and adding additional stability measures:

  • Tuned One‑Euro Filtering: Lowered the cutoff frequency and adjusted beta to filter out high-frequency noise more aggressively, at the cost of a small amount of lag for smoother visualization.
  • Optional Kalman Filter: Suggested for further stability to predict and correct landmark motion, especially during rapid movements or brief occlusions.
  • No Immediate Zeroing on Loss: When the hand is lost briefly, we hold the last valid pose instead of zeroing, preventing one-frame spikes. After a configurable timeout, the node resets gracefully.
  • Joint Angle Smoothing: After computing joint angles, we apply a deadband threshold and low-pass smoothing to ignore tiny flickers (e.g. ±2°) and ensure only significant movements update the joint states.

These measures address MediaPipe’s occasional flicker, eliminating visual chatter and providing downstream controllers a smoother trajectory.


Position-Based Joint Angle Computation (Excluding Orientation)

Rather than estimating full 3D orientations, we derive joint angles directly from landmark positions. This aligns with DexHand’s demo and avoids ambiguous quaternion calculations.

  1. Finger Flexion (Pitch) Angles: Computed via the angle between consecutive bone segments (e.g. MCP→PIP vs. PIP→DIP) using dot‑products. Angles are clipped to safe ranges (±90°) and published in radians.
  2. Finger Abduction/Adduction (Yaw) Angles: Calculated by projecting finger direction vectors onto the palm plane and measuring angles relative to a forward reference (wrist→middle MCP), signed by lateral position.
  3. Wrist and Pose Orientation: Wrist orientation is fixed at 0; all PoseArray orientations use identity quaternions. We trust only relative joint angles, avoiding global orientation noise.

This method ensures robust, interpretable joint states, perfectly suited for visualization and actuation.


Node Implementation

The updated Python ROS 2 node consists of two classes:

  1. HandDetectionNode: Captures frames from /dev/video2, runs MediaPipe Hands, applies One‑Euro filtering, and publishes filtered landmarks.
  2. HandPosePublisher: Subscribes to landmarks, computes joint angles from positions, applies angle smoothing, and publishes JointState (and optionally PoseArray).

Refer to hand_detection_node.py for detailed, well‑commented code including all filter parameters, angle computations, and ROS 2 setup.


Launching the Nodes

From your workspace root directory (tendon_ws):

# Display the DexHand model in RViz
ros2 launch dexhand_description display.launch.py

# Run the hand detection and tracking node
ros2 launch hand_detection hand_detection.launch.py

You can interrupt with Ctrl+C when you need to shut down.


Hardware & Tuning Suggestions

  • High-Quality Camera: Higher resolution/FPS improves tracking; balance with CPU load.
  • Enable GPU Acceleration: Configure MediaPipe or use the Tasks API with GPU support for real-time performance.
  • Adjust Confidence Thresholds: Tweak min_detection_confidence and min_tracking_confidence for stability vs. sensitivity.
  • Filter Parameters: Fine‑tune One‑Euro cutoff and beta for your use case; optionally implement Kalman filtering.
  • Servo Calibration: Map our radian outputs to your hardware servo ranges and apply a small deadband to prevent motor buzzing.

With these changes, the node delivers smooth, robust hand tracking suitable for driving DexHand in RViz or real hardware.


Sources:

  • MediaPipe Hands API and filtering techniques
  • DexHand documentation and demo notes
  • Adapted original node code

About

Real-time hand tracking and gesture recognition simulation using computer vision

Topics

Resources

Stars

Watchers

Forks

Contributors

Languages