IRVLUTD/HO-Cap-Annotation

HO-Cap Annotation Pipeline

This repository contains the code for the HO-Cap annotation pipeline.

Installation

This code is tested with Python 3.10 and CUDA 11.8 on Ubuntu 20.04. Make sure CUDA 11.8 is installed on your system before running the code.
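Before installing, it can help to confirm that the local toolchain matches the tested configuration. The sketch below is not part of the repository. It assumes `nvcc` is on the `PATH` and that `nvcc --version` prints a `release X.Y` token, as CUDA toolkits normally do. The helper names `parse_cuda_release` and `installed_cuda_release` are hypothetical.

```python
import re
import shutil
import subprocess
import sys

TESTED_PYTHON = (3, 10)   # from the README
TESTED_CUDA = "11.8"      # from the README


def parse_cuda_release(nvcc_output):
    """Return the 'X.Y' release string found in `nvcc --version` output, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def installed_cuda_release():
    """Query nvcc if it is on PATH; return None when it is missing or fails."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
    if result.returncode != 0:
        return None
    return parse_cuda_release(result.stdout)


if __name__ == "__main__":
    py = sys.version_info[:2]
    tested_py = ".".join(str(part) for part in TESTED_PYTHON)
    cuda = installed_cuda_release()
    print(f"Python {py[0]}.{py[1]} (tested: {tested_py})")
    print(f"CUDA toolkit {cuda or 'not found'} (tested: {TESTED_CUDA})")
```

A mismatch does not necessarily mean the pipeline will fail; it only indicates that the setup differs from the one the code was tested with.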

1. Clone the repository

git clone https://github.com/JWRoboticsVision/HO-Cap-Annotation.git

2. Change current directory to the repository

cd HO-Cap-Annotation

3. Create conda environment

conda create -n hocap-annotation python=3.10

conda activate hocap-annotation

4. Install PyTorch

python -m pip install torch==2.5.1 torchvision==0.20.1 --index-url https://download.pytorch.org/whl/cu118 --no-cache-dir

5. Install the HO-Cap Annotation Package

bash ./scripts/install_hocap-annotation.sh

6. Download MANO models

Download the MANO models and code (mano_v1_2.zip) from the MANO website and place the extracted .pkl files under the config/mano_models directory. The directory should look like this:

./config/mano_models
├── MANO_LEFT.pkl
└── MANO_RIGHT.pkl
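A quick way to confirm the layout above is to check for the two expected files. This is a minimal sketch, not a script shipped with the repository. The function name `missing_mano_files` is hypothetical, and the default path assumes the script is run from the repository root.

```python
from pathlib import Path

# File names taken from the directory listing above.
EXPECTED_MANO_FILES = ("MANO_LEFT.pkl", "MANO_RIGHT.pkl")


def missing_mano_files(mano_dir="./config/mano_models"):
    """Return the expected MANO file names that are absent from mano_dir."""
    root = Path(mano_dir)
    return [name for name in EXPECTED_MANO_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_mano_files()
    if missing:
        print("Missing MANO files:", ", ".join(missing))
    else:
        print("MANO models found.")
```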

7. Install Third-Party Tools (Optional)

7.1 Install FoundationPose
  • Initialize and build FoundationPose:
    bash ./scripts/install_foundationpose.sh
  • Download checkpoints:
    bash ./scripts/download_models.sh --foundationpose
    
7.2 Install SAM2
  • Initialize and build SAM2:
    bash ./scripts/install_sam2.sh
  • Download checkpoints:
    bash ./scripts/download_models.sh --sam2

Usage

1. Segment the Sequence

python tools/01_video_segmentation.py --sequence_folder <path_to_sequence_folder>
Figure: input mask and SAM2 video segmentation.

2. 2D Hand Detection by MediaPipe

python tools/02_mp_hand_detection.py --sequence_folder <path_to_sequence_folder>

3. 3D Hand Joints Estimation

python tools/03_mp_3d_joints_generation.py --sequence_folder <path_to_sequence_folder>

4. Object Pose Estimation by FoundationPose

  • Run FoundationPose on each camera view:
    python tools/04-1_fd_pose_solver.py --sequence_folder <path_to_sequence_folder> --object_idx <object_idx>
  • Merge the results from all views:
    python tools/04-2_fd_pose_merger.py --sequence_folder <path_to_sequence_folder>
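The solver command takes a single `--object_idx`, so a sequence with several objects presumably needs one solver run per object before the merge. The sketch below builds those commands in order. It is an illustration, not part of the repository. The helper names are hypothetical, the valid object indices for a sequence are not stated here and must be supplied by the caller, and the commands assume the repository root as the working directory.

```python
import subprocess
import sys


def build_fd_pose_commands(sequence_folder, object_indices):
    """Build one solver command per object, followed by a single merger command."""
    commands = [
        [sys.executable, "tools/04-1_fd_pose_solver.py",
         "--sequence_folder", str(sequence_folder),
         "--object_idx", str(idx)]
        for idx in object_indices
    ]
    commands.append(
        [sys.executable, "tools/04-2_fd_pose_merger.py",
         "--sequence_folder", str(sequence_folder)]
    )
    return commands


def run_fd_pose(sequence_folder, object_indices):
    """Run the commands sequentially; check=True stops at the first failure."""
    for cmd in build_fd_pose_commands(sequence_folder, object_indices):
        subprocess.run(cmd, check=True)
```

Calling `run_fd_pose("<path_to_sequence_folder>", [1, 2])` would run the solver twice and then the merger once; the indices shown are placeholders.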

5. Hand Pose Optimization

python tools/05_mano_pose_solver.py --sequence_folder <path_to_sequence_folder>

6. Object Pose Optimization

python tools/06_object_pose_solver.py --sequence_folder <path_to_sequence_folder>

7. Hand-Object Joint Optimization

python tools/07_joint_pose_solver.py --sequence_folder <path_to_sequence_folder>

8. HoloLens Pose Refinement

python tools/08_holo_pose_solver.py --sequence_folder <path_to_sequence_folder>
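Most steps in this section take only `--sequence_folder`, so they can be chained from a small driver. The sketch below is one possible way to do that and is not part of the repository. It assumes the steps are meant to run in their numbered order and that the working directory is the repository root. Step 4 is left out because its solver also needs `--object_idx`. The names `STEP_SCRIPTS`, `build_commands`, and `run_steps` are hypothetical.

```python
import subprocess
import sys

# Step number -> script, for the steps that take only --sequence_folder.
STEP_SCRIPTS = {
    1: "tools/01_video_segmentation.py",
    2: "tools/02_mp_hand_detection.py",
    3: "tools/03_mp_3d_joints_generation.py",
    5: "tools/05_mano_pose_solver.py",
    6: "tools/06_object_pose_solver.py",
    7: "tools/07_joint_pose_solver.py",
    8: "tools/08_holo_pose_solver.py",
}


def build_commands(sequence_folder, steps):
    """Return one command per requested step, sorted by step number."""
    unknown = sorted(set(steps) - set(STEP_SCRIPTS))
    if unknown:
        raise ValueError(f"No sequence-only script for step(s): {unknown}")
    return [
        [sys.executable, STEP_SCRIPTS[step], "--sequence_folder", str(sequence_folder)]
        for step in sorted(set(steps))
    ]


def run_steps(sequence_folder, steps):
    """Run the selected steps in order, stopping at the first failing one."""
    for cmd in build_commands(sequence_folder, steps):
        subprocess.run(cmd, check=True)
```

For example, `run_steps("<path_to_sequence_folder>", [5, 6, 7, 8])` would run the four optimization steps after step 4 has been completed separately.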
