We propose a new single-shot method for multi-person 3D pose estimation in
general scenes from a monocular RGB camera. Our approach uses novel
occlusion-robust pose-maps (ORPM) which enable full body pose inference even
under strong partial occlusions by other people and objects in the scene. ORPM
outputs a fixed number of maps which encode the 3D joint locations of all
people in the scene. Body part associations allow us to infer 3D pose for an
arbitrary number of people without explicit bounding box prediction. To train
our approach we introduce MuCo-3DHP, the first large scale training data set
showing real images of sophisticated multi-person interactions and occlusions.
We synthesize a large corpus of multi-person images by compositing images of
individual people (with ground truth from mutli-view performance capture). We
evaluate our method on our new challenging 3D annotated multi-person test set
MuPoTs-3D where we achieve state-of-the-art performance. To further stimulate
research in multi-person 3D pose estimation, we will make our new datasets, and
associated code publicly available for research purposes.