*Demo video: MKA-Gradio-Demo.mp4*
Create and configure the conda environment:
```bash
conda create -n mka python=3.8 -y
conda activate mka
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117
conda install https://anaconda.org/pytorch3d/pytorch3d/0.7.5/download/linux-64/pytorch3d-0.7.5-py38_cu117_pyt1131.tar.bz2
pip install -r requirements.txt
```
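Before moving on, it is worth confirming that the CUDA-enabled PyTorch build and the matching pytorch3d wheel import cleanly; a minimal sanity check (expected versions taken from the commands above):

```bash
# Sanity check: torch should report 1.13.1+cu117 with CUDA available,
# and pytorch3d 0.7.5 should import against the same build.
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
python -c "import pytorch3d; print(pytorch3d.__version__)"
```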
Install ViTPose as an editable dependency and build the C++ module:
```bash
cd dependencies
git clone https://github.com/ViTAE-Transformer/ViTPose.git
cd ..
pip install -v -e dependencies/ViTPose
cd dependencies/cpp_module
sh install.sh
cd ../..
```
If building `cpp_module` fails, you can try a conda-based GCC toolchain:
```bash
conda install -c conda-forge gcc=9 gxx=9
conda install -c conda-forge libxcrypt
```
You also need to set up the SAM2 environment by following [facebookresearch/sam2](https://github.com/facebookresearch/sam2).
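A minimal sketch of that setup, following the upstream repository's source install (cloning into `dependencies/` is an assumption to mirror the layout above):

```bash
# Sketch: install SAM2 from source per the facebookresearch/sam2 README.
# The dependencies/ clone location is an assumption, not a requirement.
git clone https://github.com/facebookresearch/sam2.git dependencies/sam2
pip install -e dependencies/sam2
```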
Human body model files are required for body, hand, and face parameterization.
```text
human_models/
├── human_models.py
└── human_model_files/
    ├── J_regressor_extra.npy
    ├── J_regressor_h36m.npy
    ├── mano_mean_params.npz
    ├── smpl_mean_params.npz
    ├── smpl/
    │   ├── SMPL_FEMALE.pkl
    │   ├── SMPL_MALE.pkl
    │   └── SMPL_NEUTRAL.pkl
    ├── smplx/
    │   ├── SMPLX_FEMALE.pkl
    │   ├── SMPLX_MALE.pkl
    │   ├── SMPLX_NEUTRAL.pkl
    │   ├── SMPLX_to_J14.pkl
    │   ├── SMPL-X__FLAME_vertex_ids.npy
    │   └── MANO_SMPLX_vertex_ids.pkl
    └── mano/
        └── MANO_RIGHT.pkl
```
Here we provide some download links for the files:
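Once downloaded, you can confirm that the files sit where the code expects them; a minimal check (the `HM` variable and the three files probed are illustrative, taken from the tree above):

```bash
# Sketch: spot-check a few required model files against the tree above.
HM=human_models/human_model_files
for f in smpl/SMPL_NEUTRAL.pkl smplx/SMPLX_NEUTRAL.pkl mano/MANO_RIGHT.pkl; do
  [ -f "$HM/$f" ] && echo "ok:      $f" || echo "missing: $f"
done
```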
Pretrained models are required for pose detection and human mesh recovery.
```text
pretrained_models/
├── yolov8x.pt
├── sam2.1_hiera_large.pt
├── hamer_ckpts/
│   ├── dataset_config.yaml
│   ├── model_config.yaml
│   └── checkpoints/
│       └── hamer.ckpt
├── smplest_x_h/
│   ├── config_base.py
│   └── smplest_x_h.pth.tar
└── vitpose_ckpts/
    └── vitpose+_huge/
        └── wholebody.pth
```
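Of the checkpoints above, the SAM 2.1 weights are publicly hosted, so that download can be scripted; a sketch (the URL is taken from the official sam2 checkpoint download script, so verify it before relying on it; the HaMeR, SMPLest-X, ViTPose, and YOLOv8 weights come from their respective project releases):

```bash
# Sketch: fetch the SAM 2.1 checkpoint into pretrained_models/.
# URL taken from the official sam2 download script; verify before use.
mkdir -p pretrained_models
wget -P pretrained_models \
  https://dl.fbaipublicfiles.com/segment_anything_2/092824/sam2.1_hiera_large.pt
```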
Some instructions:

To execute the full processing pipeline, run:
```bash
bash run_pipeline.sh
```

## Explore More Motrix Projects
- [SMPL-X] [TPAMI'25] SMPLest-X: An extended version of SMPLer-X with stronger foundation models.
- [SMPL-X] [NeurIPS'23] SMPLer-X: Scaling up EHPS towards a family of generalist foundation models.
- [SMPL-X] [ECCV'24] WHAC: World-grounded human pose and camera estimation from monocular videos.
- [SMPL-X] [CVPR'24] AiOS: An all-in-one-stage pipeline combining detection and 3D human reconstruction.
- [SMPL-X] [NeurIPS'23] RoboSMPLX: A framework to enhance the robustness of whole-body pose and shape estimation.
- [SMPL-X] [ICML'25] ADHMR: A framework to align diffusion-based human mesh recovery methods via direct preference optimization.
- [SMPL-X] MKA: Full-body 3D mesh reconstruction from single- or multi-view RGB videos.
- [SMPL] [ICCV'23] Zolly: 3D human mesh reconstruction from perspective-distorted images.
- [SMPL] [IJCV'26] PointHPS: 3D HPS from point clouds captured in real-world settings.
- [SMPL] [NeurIPS'22] HMR-Benchmarks: A comprehensive benchmark of HPS datasets, backbones, and training strategies.
- [SMPL-X] [ICLR'26] ViMoGen: A comprehensive framework that transfers knowledge from ViGen to MoGen across data, modeling, and evaluation.
- [SMPL-X] [ECCV'24] LMM: Large Motion Model for Unified Multi-Modal Motion Generation.
- [SMPL-X] [NeurIPS'23] FineMoGen: Fine-Grained Spatio-Temporal Motion Generation and Editing.
- [SMPL] InfiniteDance: A large-scale 3D dance dataset and an MLLM-based music-to-dance model designed for robust in-the-wild generalization.
- [SMPL] [NeurIPS'23] InsActor: Generating physics-based human motions from language and waypoint conditions via diffusion policies.
- [SMPL] [ICCV'23] ReMoDiffuse: Retrieval-Augmented Motion Diffusion Model.
- [SMPL] [TPAMI'24] MotionDiffuse: Text-Driven Human Motion Generation with Diffusion Model.