3D Human Pose Estimation Using Möbius Graph Convolutional Networks

20 Mar 2022 · Niloofar Azizi, Horst Possegger, Emanuele Rodolà, Horst Bischof

3D human pose estimation is fundamental to understanding human behavior. Recently, promising results have been achieved by graph convolutional networks (GCNs), which achieve state-of-the-art performance and provide rather light-weight architectures. However, a major limitation of GCNs is their inability to encode all the transformations between joints explicitly. To address this issue, we propose a novel spectral GCN using the Möbius transformation (MöbiusGCN). In particular, this allows us to directly and explicitly encode the transformation between joints, resulting in a significantly more compact representation. Compared to even the lightest architectures so far, our novel approach requires 90-98% fewer parameters, i.e. our lightest MöbiusGCN uses only 0.042M trainable parameters. Besides the drastic parameter reduction, explicitly encoding the transformation of joints also enables us to achieve state-of-the-art results. We evaluate our approach on the two challenging pose estimation benchmarks, Human3.6M and MPI-INF-3DHP, demonstrating both state-of-the-art results and the generalization capabilities of MöbiusGCN.
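
For readers unfamiliar with the Möbius transformation named in the abstract: it is the complex map f(z) = (az + b) / (cz + d) with ad - bc ≠ 0, which combines rotation, scaling, translation and inversion in a single function. The sketch below only illustrates that general definition. The function name `mobius` and the sample coefficients are illustrative assumptions, not taken from the paper, and the code does not reproduce the MöbiusGCN spectral filter itself.

```python
def mobius(z: complex, a: complex, b: complex, c: complex, d: complex) -> complex:
    """Apply the Möbius transformation f(z) = (a*z + b) / (c*z + d).

    The coefficients must satisfy a*d - b*c != 0, otherwise the map
    collapses to a constant and is not a valid Möbius transformation.
    """
    if a * d - b * c == 0:
        raise ValueError("degenerate transformation: a*d - b*c must be non-zero")
    # Note: evaluating exactly at the pole z = -d/c raises ZeroDivisionError.
    return (a * z + b) / (c * z + d)


# Identity map: a = d = 1, b = c = 0 leaves the point unchanged.
identity = mobius(2 + 3j, 1, 0, 0, 1)

# Rotation by 90 degrees: multiplying by a = 1j sends 1 to 1j.
rotated = mobius(1 + 0j, 1j, 0, 0, 1)

# Translation by b = 2 - 1j sends 1 + 1j to 3.
shifted = mobius(1 + 1j, 1, 2 - 1j, 0, 1)

print(identity, rotated, shifted)
```

Each of these special cases uses the same four coefficients, which suggests why a single such function can describe several kinds of transformation compactly.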

| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| 3D Human Pose Estimation | Human3.6M | MöbiusGCN (GTi) | Average MPJPE (mm) | 36.2 | #58 |
| | | | Using 2D ground-truth joints | Yes | #2 |
| | | | Multi-View or Monocular | Monocular | #1 |
| 3D Human Pose Estimation | Human3.6M | MöbiusGCN | Average MPJPE (mm) | 52.1 | #193 |
| | | | Using 2D ground-truth joints | No | #2 |
| | | | Multi-View or Monocular | Monocular | #1 |
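
The "Average MPJPE (mm)" metric reported above is the mean per-joint position error: the Euclidean distance between each predicted 3D joint and its ground-truth position, averaged over all joints. The following is a minimal sketch of that computation for a single pose, using only the Python standard library. The function name `mpjpe` and the two-joint toy coordinates are illustrative assumptions, and any alignment step a benchmark protocol applies before measuring the error is omitted.

```python
import math


def mpjpe(pred, target):
    """Mean per-joint position error for one pose.

    pred and target are equal-length sequences of (x, y, z) joint
    coordinates, here assumed to be in millimetres.
    """
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and the same length")
    # Euclidean distance per joint, then the mean over joints.
    return sum(math.dist(p, t) for p, t in zip(pred, target)) / len(pred)


# Toy example with two joints: the first is off by 5 mm (a 3-4-5 triangle),
# the second matches the ground truth exactly.
predicted = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
ground_truth = [(3.0, 4.0, 0.0), (10.0, 0.0, 0.0)]

error = mpjpe(predicted, ground_truth)
print(error)  # 2.5
```

A benchmark-level figure such as those in the table would additionally average this per-pose value over every frame in the evaluation set.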

Methods