Single-Shot Multi-Person 3D Pose Estimation From Monocular RGB

We propose a new single-shot method for multi-person 3D pose estimation in general scenes from a monocular RGB camera. Our approach uses novel occlusion-robust pose-maps (ORPM), which enable full-body pose inference even under strong partial occlusions by other people and objects in the scene. ORPM outputs a fixed number of maps which encode the 3D joint locations of all people in the scene. Body part associations allow us to infer 3D pose for an arbitrary number of people without explicit bounding box prediction. To train our approach, we introduce MuCo-3DHP, the first large-scale training data set showing real images of sophisticated multi-person interactions and occlusions. We synthesize a large corpus of multi-person images by compositing images of individual people (with ground truth from multi-view performance capture). We evaluate our method on our new challenging 3D annotated multi-person test set MuPoTS-3D, where we achieve state-of-the-art performance. To further stimulate research in multi-person 3D pose estimation, we will make our new datasets and associated code publicly available for research purposes.


Task | Dataset | Model | Metric Name | Metric Value | Global Rank
3D Human Pose Estimation | Human3.6M | Single-Shot Multi-Person | Average MPJPE (mm) | 69.9 | # 294
3D Human Pose Estimation | MPI-INF-3DHP | Single-Shot Multi-Person | AUC | 37.8 | # 72
3D Human Pose Estimation | MPI-INF-3DHP | Single-Shot Multi-Person | MPJPE | 122.2 | # 84
3D Human Pose Estimation | MPI-INF-3DHP | Single-Shot Multi-Person | PCK | 75.2 | # 78
3D Multi-Person Pose Estimation (root-relative) | MuPoTS-3D | Single-Shot Multi-Person | 3DPCK | 65 | # 17
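
The MPJPE and PCK-style metrics reported above can be illustrated with a minimal sketch. It assumes poses are given as lists of (x, y, z) joint positions in millimetres, that joint 0 is the root, and that the 3DPCK threshold is 150 mm; these choices, the function names, and the toy poses are illustrative assumptions and are not taken from the paper's evaluation code.

```python
import math

def root_relative(pose, root_index=0):
    """Express every joint relative to the root joint (assumed to be index 0)."""
    rx, ry, rz = pose[root_index]
    return [(x - rx, y - ry, z - rz) for x, y, z in pose]

def joint_errors(pred, gt, root_index=0):
    """Per-joint Euclidean distance between root-relative prediction and ground truth."""
    pred_rel = root_relative(pred, root_index)
    gt_rel = root_relative(gt, root_index)
    return [math.dist(p, g) for p, g in zip(pred_rel, gt_rel)]

def mpjpe(pred, gt, root_index=0):
    """Mean per-joint position error, in the same unit as the input (here mm)."""
    errors = joint_errors(pred, gt, root_index)
    return sum(errors) / len(errors)

def pck3d(pred, gt, threshold_mm=150.0, root_index=0):
    """Percentage of joints whose error is within threshold_mm (assumed 150 mm)."""
    errors = joint_errors(pred, gt, root_index)
    return 100.0 * sum(e <= threshold_mm for e in errors) / len(errors)

# Hypothetical 4-joint toy poses, in mm.
gt = [(0, 0, 0), (0, 100, 0), (0, 200, 0), (100, 200, 0)]
pred = [(10, 0, 0), (10, 130, 0), (10, 200, 40), (310, 200, 0)]

print(mpjpe(pred, gt))   # 67.5
print(pck3d(pred, gt))   # 75.0
```

In the toy example the root-relative joint errors are 0, 30, 40, and 200 mm, so the mean is 67.5 mm and three of four joints fall within the 150 mm threshold. Lower MPJPE is better, while higher PCK is better.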

Results from Other Papers

Task | Dataset | Model | Metric Name | Metric Value | Rank
3D Multi-Person Pose Estimation (root-relative) | MuPoTS-3D | Mehta et al. | MPJPE | 132 | # 3