Weakly Supervised Generative Network for Multiple 3D Human Pose Hypotheses

13 Aug 2020 · Chen Li, Gim Hee Lee

3D human pose estimation from a single image is an inverse problem due to the inherent ambiguity of the missing depth. Several previous works addressed the inverse problem by generating multiple hypotheses. However, these works are strongly supervised and require ground-truth 2D-to-3D correspondences, which can be difficult to obtain. In this paper, we propose a weakly supervised deep generative network to address the inverse problem and circumvent the need for ground-truth 2D-to-3D correspondences. To this end, we design our network to model a proposal distribution, which we use to approximate the unknown multi-modal target posterior distribution. We achieve the approximation by minimizing the KL divergence between the proposal and target distributions, which leads to a 2D reprojection error and a prior loss term that can be weakly supervised. Furthermore, we determine the most probable solution as the conditional mode of the samples using the mean-shift algorithm. We evaluate our method on three benchmark datasets: Human3.6M, MPII and MPI-INF-3DHP. Experimental results show that our approach is capable of generating multiple feasible hypotheses and achieves state-of-the-art results compared to existing weakly supervised approaches. Our source code is available at the project website.
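The mode-selection step mentioned in the abstract can be illustrated with a minimal sketch of mean-shift over a set of sampled hypotheses. This is not the authors' implementation: the Gaussian kernel, the `bandwidth` value, the function name `mean_shift_mode`, and the toy 2D samples (standing in for flattened 3D pose vectors) are all assumptions made for illustration.

```python
import math


def mean_shift_mode(samples, bandwidth=1.0, n_iter=100, tol=1e-6):
    """Return the highest-density mode found by mean-shift (hypothetical helper).

    `samples` is a list of equal-length vectors, e.g. flattened pose hypotheses.
    A Gaussian kernel is assumed; the abstract does not specify one.
    """

    def kernel_weights(point):
        # Gaussian kernel weight of every sample relative to `point`.
        return [
            math.exp(-0.5 * sum((p - s) ** 2 for p, s in zip(point, sample))
                     / bandwidth ** 2)
            for sample in samples
        ]

    def shift(point):
        # Move `point` to the kernel-weighted mean of the samples.
        weights = kernel_weights(point)
        total = sum(weights)
        return [
            sum(w * sample[d] for w, sample in zip(weights, samples)) / total
            for d in range(len(point))
        ]

    best_point, best_density = None, -1.0
    for start in samples:
        point = list(start)
        for _ in range(n_iter):
            new_point = shift(point)
            moved = math.dist(new_point, point)
            point = new_point
            if moved < tol:
                break
        # Score the converged point by its (unnormalized) kernel density.
        density = sum(kernel_weights(point))
        if density > best_density:
            best_point, best_density = point, density
    return best_point


# Toy usage: five samples cluster near the origin, two near (5, 5).
samples = [
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [-0.1, 0.0], [0.0, -0.1],
    [5.0, 5.0], [5.1, 5.0],
]
mode = mean_shift_mode(samples, bandwidth=0.5)
```

Each sample is shifted repeatedly toward the kernel-weighted mean of all samples until it stops moving, and the converged point with the highest kernel density is returned. In this toy example the returned mode lies in the larger cluster near the origin, so a minority hypothesis does not pull the final estimate the way a plain average would.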

Task                                       Dataset       Model      Metric Name          Metric Value  Global Rank
Multi-Hypotheses 3D Human Pose Estimation  Human3.6M     Li et al.  Average MPJPE (mm)   73.9          # 9
Multi-Hypotheses 3D Human Pose Estimation  Human3.6M     Li et al.  Average PMPJPE (mm)  44.3          # 5
3D Human Pose Estimation                   Human3.6M     WSGAN      Average MPJPE (mm)   81.1          # 293
3D Human Pose Estimation                   MPI-INF-3DHP  WSGAN      PCK                  79.3          # 65
