Expressive and Versatile Avatar Tracker

This repository serves as a comprehensive and flexible toolbox for human mesh estimation. It supports multiple input modalities and workflows, including:

  • Estimating proxy meshes from human-centric images and videos.
  • Estimating SMPL-X parameters from colored meshes (experimental).
  • Recovering SMPL-X parameters from a given SMPL-X mesh.
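
As a rough illustration of the last workflow, the sketch below fits SMPL-X parameters to a target SMPL-X mesh by minimizing a vertex loss with the smplx package. This is a generic, minimal example under assumed paths and shapes, not this repository's actual pipeline or API:

import torch
import smplx  # pip install smplx

# The model path below follows the weights layout described later in this README;
# the target vertex file is hypothetical and only for illustration.
model = smplx.create("./modules/weights/human_template", model_type="smplx",
                     gender="neutral", use_pca=False)

# Target SMPL-X vertices taken from the given mesh, shape (10475, 3).
target_vertices = torch.load("target_smplx_vertices.pt")

betas = torch.zeros(1, 10, requires_grad=True)
body_pose = torch.zeros(1, 63, requires_grad=True)
global_orient = torch.zeros(1, 3, requires_grad=True)

optimizer = torch.optim.Adam([betas, body_pose, global_orient], lr=1e-2)
for step in range(500):
    optimizer.zero_grad()
    out = model(betas=betas, body_pose=body_pose, global_orient=global_orient)
    loss = ((out.vertices[0] - target_vertices) ** 2).mean()
    loss.backward()
    optimizer.step()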

Installation

Setting up the environment from scratch can be somewhat involved due to the dependencies required by this project. For convenience, we provide a pre-built Docker image on Docker Hub:

docker pull zjwfufu/eva-tracker:latest

After launching a container with this image, simply run:

source activate
conda activate eva_track

to enter the pre-configured environment.
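
A typical way to launch a container from this image with GPU access is shown below; the mount path and entrypoint are only examples, so adjust them to your setup:

docker run --gpus all -it \
    -v /path/to/EVA-Tracker:/workspace \
    zjwfufu/eva-tracker:latest /bin/bash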

Note

A few components may still need to be re-installed:

  • Packages previously installed via pip install -e . should be re-installed inside the activated environment.

    # sam2
    pip install -e ./modules/models/image_segmenter/sam2 --no-deps
    
    # osx modified mmpose
    cd ./modules/models/mesh_estimator/osx/transformer_utils
    pip install -e .
    cd ../../../../../
    pip install --upgrade setuptools
    
  • PyTorch3D may need to be rebuilt depending on your GPU architecture and CUDA configuration.
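
    A typical rebuild looks like the command below; the architecture value (here 8.6, e.g. for RTX 30-series GPUs) is only an example and should match your hardware:

    TORCH_CUDA_ARCH_LIST="8.6" pip install --force-reinstall --no-cache-dir \
        "git+https://github.com/facebookresearch/pytorch3d.git"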

If you prefer to install the environment manually, you can follow the step-by-step instructions below:

  • Clone this repository, create a conda environment, and activate it

    git clone https://github.com/zjwfufu/eva_tracker
    cd eva_tracker
    
    conda create -n eva_track python=3.10
    conda activate eva_track
    
  • Install PyTorch and PyTorch3D

    pip install torch==2.3.0 torchvision==0.18.0 --index-url https://download.pytorch.org/whl/cu118
    
    pip install -U xformers==0.0.26.post1 --index-url https://download.pytorch.org/whl/cu118
    
    pip install "git+https://github.com/facebookresearch/pytorch3d.git"
    
  • Install other requirements

    pip install -r requirements.txt
    
    pip install chumpy==0.70 --no-build-isolation
    
  • Install SAM2

    pip install -e ./modules/models/image_segmenter/sam2 --no-deps
    
  • Install OSX dependencies

    • Install mmcv-full==1.7.0

      Install mmcv-full 1.7.0 compiled with C++17. You can download the source code from this link, which has been patched to compile with C++17 during installation.

      After downloading and extracting the archive, navigate to the source directory and run:

      MMCV_WITH_OPS=1 pip install . -v
      
    • Install modified mmpose

      cd ./modules/models/mesh_estimator/osx/transformer_utils
      pip install -e .
      cd ../../../../../
      pip install --upgrade setuptools
      
    • Install mmengine

      pip install mmengine==0.10.7
      
  • Install TrustNCG optimizer

    pip install "git+https://github.com/vchoutas/torch-trust-ncg.git"
    
  • Fix numpy version

    pip install numpy==1.23.0
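
After the manual installation, a quick sanity check (a minimal sketch, not part of this repository) can confirm that the main dependencies import and that CUDA is visible:

import numpy, torch, pytorch3d, mmcv, mmengine

print("numpy", numpy.__version__)        # expected 1.23.0
print("torch", torch.__version__, "CUDA available:", torch.cuda.is_available())
print("pytorch3d", pytorch3d.__version__)
print("mmcv", mmcv.__version__)          # expected 1.7.0
print("mmengine", mmengine.__version__)  # expected 0.10.7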
    

Weights

We provide a full archive of all the pretrained weights required by this project, available at this link. Please download and extract the archive into ./modules/weights. After extraction, the directory structure should look as follows:

./modules/weights/
├── face_tracker
│   ├── emica
│   │   ├── EMICA-CVT_flame2020_notexture.pt
│   │   └── ins_scrfd_10g_bnkps.onnx
│   ├── flame
│   │   ├── canonical.obj
│   │   └── FLAME_with_eye.pt
│   ├── mask2former
│   │   ├── config.json
│   │   ├── gitattributes
│   │   ├── model.safetensors
│   │   ├── preprocessor_config.json
│   │   ├── pytorch_model.bin
│   │   └── README.md
│   ├── matting
│   │   └── stylematte_synth.pt
│   └── vgghead
│       ├── lmks_2d.pt
│       └── vgg_heads_l.trcd
├── hand_tracker
│   └── hamer.ckpt
├── human_template
│   ├── flame
│   │   ├── 2019
│   │   ├── flame_dynamic_embedding.npy
│   │   ├── FLAME_FEMALEl.pkl
│   │   ├── FLAME_MALE.pkl
│   │   ├── FLAME_NEUTRAL.pkl
│   │   ├── flame_static_embedding.pkl
│   │   ├── FLAME_texture.npz
│   │   └── Readme.pdf
│   ├── flame_assets
│   │   ├── flame
│   │   ├── flame_arkit_bs.npy
│   │   ├── pred_expression.json
│   │   ├── shoulder_mesh.obj
│   │   └── teeth_jawopen_offset.npy
│   ├── mano
│   │   ├── MANO_LEFT.pkl
│   │   └── MANO_RIGHT.pkl
│   ├── mano_mean_params.npz
│   ├── pose_estimate
│   │   └── multiHMR_896_L.pt
│   ├── smpl
│   │   ├── SMPL_FEMALE.pkl
│   │   ├── SMPL_MALE.pkl
│   │   └── SMPL_NEUTRAL.pkl
│   ├── smpl_mean_params.npz
│   ├── smplx
│   │   ├── MANO_SMPLX_vertex_ids.pkl
│   │   ├── SMPLX_FEMALE.npz
│   │   ├── SMPL-X__FLAME_vertex_ids.npy
│   │   ├── smplx_flip_correspondences.npz
│   │   ├── SMPLX_MALE.npz
│   │   ├── SMPLX_NEUTRAL.npz
│   │   ├── smplx_npz.zip
│   │   ├── SMPLX_to_J14.pkl
│   │   ├── smplx_uv
│   │   └── version.txt
│   └── smplx_points
├── image_matting
│   └── BiRefNet-general-epoch_244.pth
├── image_segmenter
│   └── sam2.1_hiera_large.pt
├── keypoint_detector
│   └── sapiens_1b_coco_wholebody_best_coco_wholebody_AP_727_torchscript.pt2
└── mesh_estimator
    ├── multiHMR_896_L.pt
    └── osx_l.pth.tar
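
To confirm the extraction succeeded, an optional helper like the sketch below (not part of this repository) can verify that a few representative files are in place:

from pathlib import Path

# A few representative checkpoints from the archive above; extend as needed.
expected = [
    "face_tracker/emica/EMICA-CVT_flame2020_notexture.pt",
    "hand_tracker/hamer.ckpt",
    "human_template/smplx/SMPLX_NEUTRAL.npz",
    "image_segmenter/sam2.1_hiera_large.pt",
    "mesh_estimator/osx_l.pth.tar",
]
root = Path("./modules/weights")
missing = [p for p in expected if not (root / p).exists()]
print("All expected weights found." if not missing else f"Missing: {missing}")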

Usage

Discussions

Estimating human proxy meshes (FLAME, MANO, SMPL-X) from monocular observations is inherently ill-posed and highly sensitive to initialization and priors.

Estimator backbone. Most pipelines rely on a pretrained HMR model to provide an initial SMPL-X estimate. However, many existing HMR models are primarily supervised with 2D signals during training, which introduces depth ambiguities. As a result, the predicted bodies may lean forward or backward, and the relative distances between hands, arms, and torso can become inconsistent, causing misalignment. More advanced models such as SAM-3D-Body show promise in alleviating these issues.

Temporal smoothness. When tracking trimmed videos, temporal smoothness must be applied carefully. For tasks that only require per-frame accuracy, a simple second-order smoothness term is often sufficient (see the sketch below). For applications like animation or cross-reenactment, temporal regularization requires more careful tuning to avoid over-smoothing important dynamics.
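
As a concrete but generic example, a second-order smoothness term over a sequence of pose or expression parameters can be written as below; this is one common formulation, not necessarily the exact term used in this project:

import torch

def second_order_smoothness(params: torch.Tensor) -> torch.Tensor:
    """Penalize frame-to-frame acceleration of a (T, D) parameter sequence."""
    accel = params[2:] - 2.0 * params[1:-1] + params[:-2]
    return (accel ** 2).mean()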

Blinking. This project does not explicitly model eyelid poses. Some blink motions represented in FLAME’s expression blendshapes may be suppressed by strong smoothness terms and thus become less noticeable during optimization.

Acknowledgement

This project is built on many amazing open-source projects and research works. Thanks to all the authors for their great work.

Cite

If you find this project useful in your research, please cite it with the following BibTeX entries:

@misc{zhang2025evatracker,
  title={Expressive and Versatile Avatar Tracker},
  author={Zhang, Jiawei},
  year={2025},
  month={dec},
  url={https://github.com/zjwfufu/EVA_tracker}
}
@article{zhang2025bringingportrait3dpresence,
  title={Bringing Your Portrait to 3D Presence},
  author={Zhang, Jiawei and Chu, Lei and Li, Jiahao and Zang, Zhenyu and Li, Chong and Li, Xiao and Cao, Xun and Zhu, Hao and Lu, Yan},
  journal={arXiv preprint arXiv:2511.22553},
  year={2025}
}
