Generating Multiple Hypotheses for 3D Human Pose Estimation with Mixture Density Network

CVPR 2019 · Chen Li, Gim Hee Lee

3D human pose estimation from a monocular image or 2D joints is an ill-posed problem because of depth ambiguity and occluded joints. We argue that 3D human pose estimation from a monocular input is an inverse problem where multiple feasible solutions can exist. In this paper, we propose a novel approach to generate multiple feasible hypotheses of the 3D pose from 2D joints. In contrast to existing deep learning approaches, which minimize a mean square error based on a unimodal Gaussian distribution, our method generates multiple feasible hypotheses of the 3D pose based on a multimodal mixture density network. Our experiments show that the 3D poses estimated by our approach from an input of 2D joints are consistent in their 2D reprojections, which supports our argument that multiple solutions exist for the 2D-to-3D inverse problem. Furthermore, we show state-of-the-art performance on the Human3.6M dataset in both the best-hypothesis and multi-view settings, and we demonstrate the generalization capacity of our model by testing on the MPII and MPI-INF-3DHP datasets. Our code is available at the project website.
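
The contrast drawn in the abstract between a mean square error and a mixture density objective can be illustrated with a small sketch. The code below is not the authors' implementation. It assumes a mixture of isotropic Gaussians over a flattened 3D pose, treats each kernel mean as one pose hypothesis, and uses hypothetical names (`mdn_negative_log_likelihood`, `best_hypothesis_index`) and toy values chosen only for illustration.

```python
import math


def mdn_negative_log_likelihood(pose, weights, means, sigmas):
    """Negative log-likelihood of a flattened pose under a Gaussian mixture.

    Assumptions (not taken from the paper's code): each kernel is an
    isotropic Gaussian with standard deviation sigma, and every mixing
    weight is strictly positive and the weights sum to 1.
    """
    dim = len(pose)
    log_terms = []
    for weight, mean, sigma in zip(weights, means, sigmas):
        sq_dist = sum((p - m) ** 2 for p, m in zip(pose, mean))
        log_gauss = (
            -0.5 * sq_dist / sigma ** 2
            - dim * math.log(sigma)
            - 0.5 * dim * math.log(2.0 * math.pi)
        )
        log_terms.append(math.log(weight) + log_gauss)
    # Log-sum-exp over kernels for numerical stability.
    peak = max(log_terms)
    log_likelihood = peak + math.log(sum(math.exp(t - peak) for t in log_terms))
    return -log_likelihood


def best_hypothesis_index(pose, means):
    """Index of the kernel mean closest to a reference pose (oracle choice)."""
    distances = [
        math.sqrt(sum((p - m) ** 2 for p, m in zip(pose, mean)))
        for mean in means
    ]
    return distances.index(min(distances))


if __name__ == "__main__":
    # Toy single-joint "pose": two hypotheses that differ only in depth (z).
    reference = [0.0, 0.0, 1.0]
    hypotheses = [[0.0, 0.0, 1.0], [0.0, 0.0, -1.0]]
    nll = mdn_negative_log_likelihood(reference, [0.5, 0.5], hypotheses, [0.1, 0.1])
    print("NLL:", nll)
    print("closest hypothesis:", best_hypothesis_index(reference, hypotheses))
```

With a single kernel and a fixed sigma, this loss reduces to a scaled squared error plus a constant, which is the unimodal case the abstract contrasts against. With several kernels, two poses that differ only in depth can both receive high likelihood.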

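Task | Dataset | Model | Metric Name | Metric Value | Global Rank
--- | --- | --- | --- | --- | ---
Multi-Hypotheses 3D Human Pose Estimation | AH36M | SMPL-MDN (by 3D Multi-bodies) | Best-Hypothesis PMPJPE (n = 25) | 69.5 | # 5
Multi-Hypotheses 3D Human Pose Estimation | AH36M | SMPL-MDN (by 3D Multi-bodies) | H36M PMPJPE (n = 25) | 42.7 | # 4
Multi-Hypotheses 3D Human Pose Estimation | AH36M | SMPL-MDN (by 3D Multi-bodies) | Most-Likely Hypothesis PMPJPE (n = 1) | 74.7 | # 3
Multi-Hypotheses 3D Human Pose Estimation | AH36M | SMPL-MDN (by 3D Multi-bodies) | H36M PMPJPE (n = 1) | 44.8 | # 3
Multi-Hypotheses 3D Human Pose Estimation | AH36M | SMPL-MDN (by 3D Multi-bodies) | Best-Hypothesis MPJPE (n = 25) | 91.5 | # 2
Multi-Hypotheses 3D Human Pose Estimation | Human3.6M | MDN | Average MPJPE (mm) | 52.7 | # 7
Multi-Hypotheses 3D Human Pose Estimation | Human3.6M | MDN | Average PMPJPE (mm) | 42.6 | # 5
3D Human Pose Estimation | Human3.6M | MDN (Multi-View) | Average MPJPE (mm) | 49.6 | # 172
3D Human Pose Estimation | Human3.6M | MDN (Multi-View) | Using 2D ground-truth joints | No | # 2
3D Human Pose Estimation | Human3.6M | MDN (Multi-View) | Multi-View or Monocular | Multi-View | # 1
3D Human Pose Estimation | Human3.6M | MDN | Average MPJPE (mm) | 52.7 | # 212
3D Human Pose Estimation | Human3.6M | MDN | Using 2D ground-truth joints | No | # 2
3D Human Pose Estimation | Human3.6M | MDN | Multi-View or Monocular | Monocular | # 1
3D Human Pose Estimation | Human3.6M | MDN | PA-MPJPE | 42.6 | # 90
Monocular 3D Human Pose Estimation | Human3.6M | Multimodal Mixture Density Networks | Average MPJPE (mm) | 52.7 | # 25
Monocular 3D Human Pose Estimation | Human3.6M | Multimodal Mixture Density Networks | Use Video Sequence | No | # 1
Monocular 3D Human Pose Estimation | Human3.6M | Multimodal Mixture Density Networks | Frames Needed | 1 | # 1
Monocular 3D Human Pose Estimation | Human3.6M | Multimodal Mixture Density Networks | Need Ground Truth 2D Pose | No | # 1
3D Human Pose Estimation | MPI-INF-3DHP | MDM | PCK | 67.9 | # 84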
