Permutation-Invariant Relational Network for Multi-person 3D Pose Estimation

11 Apr 2022  ·  Nicolas Ugrinovic, Adria Ruiz, Antonio Agudo, Alberto Sanfeliu, Francesc Moreno-Noguer

The recovery of multi-person 3D poses from a single RGB image is a severely ill-conditioned problem due to the inherent 2D-3D depth ambiguity, inter-person occlusions, and body truncations. To tackle these issues, recent works have shown promising results by reasoning about different people simultaneously. However, in most cases this is done by considering only pairwise person interactions, which hinders a holistic scene representation able to capture long-range interactions. This is addressed by approaches that jointly process all people in the scene, although they require defining one of the individuals as a reference and a pre-defined person ordering, which makes them sensitive to this choice. In this paper, we overcome both of these limitations and propose an approach for multi-person 3D pose estimation that captures long-range interactions independently of the input order. For this purpose, we build a residual-like permutation-invariant network that successfully refines potentially corrupted initial 3D poses estimated by an off-the-shelf detector. The residual function is learned via Set Transformer blocks, which model the interactions among all initial poses, regardless of their ordering or number. A thorough evaluation demonstrates that our approach is able to boost the performance of the initially estimated 3D poses by large margins, achieving state-of-the-art results on standardized benchmarks. Additionally, the proposed module is computationally efficient and can potentially be used as a drop-in complement for any 3D pose detector in multi-person scenes.
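
The residual refinement idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it replaces the Set Transformer blocks with a single self-attention layer that has no positional encoding, and the names `refine_poses`, `init_params`, the joint count, and the hidden size are assumptions made for illustration. It only shows why such a design handles any number of people and gives each person the same refined pose regardless of input order.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # X: (N, d), one row per person. No positional encoding is used,
    # so permuting the rows of X permutes the output rows identically.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    return softmax(scores, axis=-1) @ V

def init_params(J, d, seed=0):
    # Hypothetical random weights; a real model would learn these.
    rng = np.random.default_rng(seed)
    shapes = {"W_in": (J * 3, d), "Wq": (d, d), "Wk": (d, d),
              "Wv": (d, d), "W_out": (d, J * 3)}
    return {k: 0.1 * rng.normal(size=s) for k, s in shapes.items()}

def refine_poses(poses, params):
    # poses: (N, J, 3) initial 3D poses from some off-the-shelf detector.
    N, J, _ = poses.shape
    X = poses.reshape(N, J * 3)
    H = X @ params["W_in"]                      # per-person embedding
    H = H + self_attention(H, params["Wq"], params["Wk"], params["Wv"])
    delta = (H @ params["W_out"]).reshape(N, J, 3)
    return poses + delta                        # residual correction
```

Shuffling the people in the input shuffles the refined poses in the same way, and the same weights apply to a scene with two people or twenty, since every operation acts either per person or over the whole set.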

Task | Dataset | Model | Metric Name | Metric Value | Global Rank
3D Multi-Person Pose Estimation (absolute) | MuPoTS-3D | PIRN | 3DPCK | 44.1 | #4
3D Multi-Person Pose Estimation (root-relative) | MuPoTS-3D | PIRN | 3DPCK | 85.8 | #4
3D Multi-Person Pose Estimation | Panoptic | PIRN | Average MPJPE (mm) | 49.8 | #11
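
For readers unfamiliar with the two metrics in the table, the sketch below shows how MPJPE and 3DPCK are commonly computed for a single pose. The 150 mm threshold is an assumption (a widely used default for 3DPCK) and is not stated on this page. The function names are illustrative, and the benchmark-specific details (matching predictions to ground-truth people, joint subsets, averaging across sequences) are not reproduced here.

```python
import numpy as np

def mpjpe(pred, gt):
    # Mean per-joint position error: average Euclidean distance (mm)
    # between predicted and ground-truth joints. Inputs: (J, 3).
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

def pck3d(pred, gt, threshold_mm=150.0):
    # Percentage of joints whose error is within the threshold.
    # threshold_mm=150 is an assumed default, not taken from this page.
    dists = np.linalg.norm(pred - gt, axis=-1)
    return float((dists <= threshold_mm).mean() * 100.0)

def root_relative(pose, root_idx=0):
    # Express joints relative to a root joint (index assumed to be 0),
    # as in the "root-relative" variant; the "absolute" variant skips this.
    return pose - pose[..., root_idx:root_idx + 1, :]
```

The root-relative variant removes the global position of each person before scoring, which is consistent with the root-relative 3DPCK in the table being much higher than the absolute one.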

Methods