bastianwandt/DiffPose

DiffPose: Multi-hypothesis Human Pose Estimation using Diffusion Models (ICCV2023)

Authors:

Karl Holmquist

Bastian Wandt

Paper

Overview:

This repository contains the code and some pre-trained models for our diffusion-based multi-hypothesis 3D human pose estimation method.

Abstract:

Traditionally, monocular 3D human pose estimation employs a machine learning model to predict the most likely 3D pose for a given input image. However, a single image can be highly ambiguous and induces multiple plausible solutions for the 2D-3D lifting step, which results in overly confident 3D pose predictors. To this end, we propose DiffPose, a conditional diffusion model that predicts multiple hypotheses for a given input image. Compared to similar approaches, our diffusion model is straightforward and avoids intensive hyperparameter tuning, complex network structures, mode collapse, and unstable training.
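The multi-hypothesis idea above can be sketched in a few lines: run the reverse diffusion chain several times from independent Gaussian noise, all conditioned on the same image context, so each chain yields one plausible 3D pose. This is an illustrative sketch, not the repository's implementation; `denoise_step` stands in for the learned conditional denoiser.

```python
import numpy as np

def sample_hypotheses(denoise_step, cond, n_hyp=5, n_joints=17, n_steps=50, seed=0):
    """Draw several 3D-pose hypotheses by running the reverse diffusion
    chain from independent Gaussian noise, all conditioned on the same
    image context `cond` (toy sketch, not the actual DiffPose code)."""
    rng = np.random.default_rng(seed)
    # x_T ~ N(0, I): one independent noise sample per hypothesis
    poses = rng.standard_normal((n_hyp, n_joints, 3))
    for t in reversed(range(n_steps)):
        poses = denoise_step(poses, t, cond)  # one reverse-diffusion step
    return poses
```

Because the hypotheses differ only in their initial noise, ambiguous inputs naturally spread the samples over multiple plausible poses.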

Moreover, we tackle the problem of over-simplification of the intermediate representation of the common two-step approaches which first estimate a distribution of 2D joint locations via joint-wise heatmaps and consecutively use their maximum argument for the 3D pose estimation step. Since such a simplification of the heatmaps removes valid information about possibly correct, though labeled unlikely, joint locations, we propose to represent the heatmaps as a set of 2D joint candidate samples. To extract information about the original distribution from these samples, we introduce our embedding transformer which conditions the diffusion model.

Experimentally, we show that DiffPose improves upon the state of the art for multi-hypothesis pose estimation by 3-5% for simple poses and outperforms it by a large margin for highly ambiguous poses.
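The heatmap-to-candidate-samples idea from the abstract can be illustrated as follows: instead of collapsing a joint heatmap to its argmax, treat it as a categorical distribution over pixel positions and draw a set of candidate locations from it. This is a minimal sketch of the general technique, not the repository's own sampling code.

```python
import numpy as np

def sample_joint_candidates(heatmap, n_samples=20, seed=0):
    """Draw 2D joint-location candidates from a heatmap by treating it
    as a categorical distribution over pixels, instead of keeping only
    its maximum argument (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    h, w = heatmap.shape
    probs = heatmap.ravel().astype(np.float64)
    probs /= probs.sum()                      # normalize to a distribution
    idx = rng.choice(h * w, size=n_samples, p=probs)
    ys, xs = np.unravel_index(idx, (h, w))
    return np.stack([xs, ys], axis=1)         # (n_samples, 2) in (x, y) order
```

Such a candidate set keeps information about secondary modes of the heatmap, which the embedding transformer can then summarize for conditioning.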

Paper:

The paper was accepted for oral presentation at ICCV 2023 in Paris and can be found here: DiffPose

Affiliation:

Computer Vision Laboratories (CVL) at Linköping University, Sweden

Installation

We recommend creating a clean conda environment. You can do this as follows:

conda env create -f environment.yml

After the installation is complete, you can activate the conda environment by running:

conda activate DiffPose

Usage

Note that some plotting functionality is limited without a wandb account; in that case, please use the '--do_not_use_wandb' flag.

Training

Our main experiments can be trained using:

python train.py --config diffpose.yaml --seed 42

Config files for the other experiments can be found in experiments/iccv2023, and the random seeds used are listed in experiments/random_seeds.txt.

Evaluation

To evaluate the code separately from training:

python eval.py --config diffpose.yaml

Demo

We provide demo functionality in the demo folder for running inference with a trained model on a given image. Note that images are scaled to 255x255; to improve performance, make sure that the person in question fills most of the image rather than background. The 2D detector will also struggle if multiple people are in the frame, leading to sub-optimal performance of our method.
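One way to satisfy the demo's expectations is to center-crop the input to a square before scaling, so the subject stays in frame. The sketch below does this with plain NumPy and a nearest-neighbor resize; it is illustrative only, and the demo's own preprocessing may differ in detail.

```python
import numpy as np

def prepare_demo_image(img, size=255):
    """Center-crop an (H, W, 3) image array to a square and
    nearest-neighbor resize it to size x size, mirroring the
    255x255 scaling the demo applies (illustrative sketch)."""
    h, w = img.shape[:2]
    side = min(h, w)
    top, left = (h - side) // 2, (w - side) // 2
    crop = img[top:top + side, left:left + side]
    idx = np.arange(size) * side // size  # nearest-neighbor index map
    return crop[idx][:, idx]
```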

Pre-trained 2D detector

This repository contains both the fine-tuned network weights used by Wehrbein et al. and the original, non-finetuned HRNet weights they are based on.

The '--use_orig_hrnet' flag selects the non-finetuned weights when preprocessing the datasets.

Pre-trained model weights

The pre-trained 2D detector weights and the five models trained on H36M can be found on Google Drive.

Trained Model Weights for DiffPose

These are the model weights for the five different seeds used for evaluating our method.

| 2D Detector used | Random Seed for DiffPose | PA-MPJPE on H36M | PA-MPJPE on H36MA | Link to Model Weights |
| --- | --- | --- | --- | --- |
| Fine-tuned H36M | 42 | 30.526 | 46.116 | Seed 42 |
| Fine-tuned H36M | 2967 | 30.618 | 46.661 | Seed 2967 |
| Fine-tuned H36M | 6173 | 30.745 | 46.808 | Seed 6173 |
| Fine-tuned H36M | 5478 | 30.964 | 46.813 | Seed 5478 |
| Fine-tuned H36M | 989 | 31.028 | 47.134 | Seed 989 |
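Averaging the five seeds gives a quick summary of the run-to-run spread; this is plain arithmetic over the values reported in the table above:

```python
# PA-MPJPE values from the table above, one entry per random seed
h36m  = [30.526, 30.618, 30.745, 30.964, 31.028]
h36ma = [46.116, 46.661, 46.808, 46.813, 47.134]

mean_h36m = sum(h36m) / len(h36m)    # ≈ 30.78
mean_h36ma = sum(h36ma) / len(h36ma)  # ≈ 46.71
```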

Model Weights for 2D joint detector

These are the model weights for the original model as well as the ones that have been fine-tuned on the 2D data from H36M.

| Training Data | Link to Model Weights |
| --- | --- |
| Original weights (MPII w/o fine-tuning) | Original |
| MPII w/ fine-tuning on H36M (as previous methods) | Fine-Tuned |

For generating the dataset, please download the weights for the 2D joint detector and place them in data/preprocessing/hrnet.

Datasets

Human3.6m

We provide tools for preprocessing the Human3.6M dataset in data/preprocessing/H36M.py, creating both the full split and the harder set of ambiguous samples proposed by Wehrbein et al.

Please note that due to the licensing of the original dataset, we cannot provide the data, nor can we help with getting access to it, except to direct you to the official website: Human 3.6M

MPI-INF-3DHP

Similarly, we provide preprocessing tools for 3DHP in data/preprocessing/3DHP.py.

Acknowledgements:

Thanks to this great repo, which served as a starting point for the implementation of the diffusion model used in this work.
