Mesh Graphormer

ICCV 2021  ·  Kevin Lin, Lijuan Wang, Zicheng Liu

We present a graph-convolution-reinforced transformer, named Mesh Graphormer, for 3D human pose and mesh reconstruction from a single image. Recently, both transformers and graph convolutional neural networks (GCNNs) have shown promising progress in human mesh reconstruction. Transformer-based approaches are effective at modeling non-local interactions among 3D mesh vertices and body joints, whereas GCNNs are good at exploiting neighborhood vertex interactions based on a pre-specified mesh topology. In this paper, we study how to combine graph convolutions and self-attention in a transformer to model both local and global interactions. Experimental results show that our proposed method, Mesh Graphormer, significantly outperforms the previous state-of-the-art methods on multiple benchmarks, including the Human3.6M, 3DPW, and FreiHAND datasets. Code and pre-trained models are publicly available.
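The core idea of the abstract, interleaving self-attention (for global, non-local interactions among vertices and joints) with graph convolution (for local interactions over a pre-specified mesh topology), can be sketched in a few lines. The following is a minimal NumPy illustration under that reading, not the authors' actual architecture; the block structure, function names, and parameters are all hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def graph_conv(X, A, W):
    # Local interactions: aggregate features from mesh-topology
    # neighbors defined by the (N x N) adjacency matrix A.
    A_hat = A + np.eye(A.shape[0])             # add self-loops
    D_inv = np.diag(1.0 / A_hat.sum(axis=1))   # degree normalization
    return D_inv @ A_hat @ X @ W

def self_attention(X, Wq, Wk, Wv):
    # Global interactions: every vertex/joint attends to every other,
    # independent of the mesh topology.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = softmax(Q @ K.T / np.sqrt(K.shape[1]))
    return scores @ V

def graphormer_block(X, A, params):
    # One hypothetical block: self-attention for global context,
    # followed by a graph convolution for local mesh structure,
    # each with a residual connection.
    H = X + self_attention(X, params["Wq"], params["Wk"], params["Wv"])
    return H + graph_conv(H, A, params["Wg"])

rng = np.random.default_rng(0)
N, d = 6, 8                                    # 6 vertices, 8-dim features
X = rng.standard_normal((N, d))
A = np.zeros((N, N))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]:  # chain topology
    A[i, j] = A[j, i] = 1.0
params = {k: rng.standard_normal((d, d)) for k in ("Wq", "Wk", "Wv", "Wg")}
out = graphormer_block(X, A, params)
print(out.shape)  # (6, 8)
```

The point of the combination is that the attention term lets distant vertices exchange information in one step, while the graph-convolution term keeps the output consistent with the fixed mesh neighborhood structure.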


Results from the Paper

| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| 3D Human Pose Estimation | 3DPW | Mesh Graphormer | PA-MPJPE | 45.6 | #3 |
| 3D Human Pose Estimation | 3DPW | Mesh Graphormer | MPJPE | 74.7 | #7 |
| 3D Human Pose Estimation | 3DPW | Mesh Graphormer | MPVPE | 87.7 | #3 |
| 3D Hand Pose Estimation | FreiHAND | Mesh Graphormer | PA-MPVPE | 5.9 | #1 |
| 3D Hand Pose Estimation | FreiHAND | Mesh Graphormer | PA-MPJPE | 6.0 | #1 |
| 3D Hand Pose Estimation | FreiHAND | Mesh Graphormer | PA-F@5mm | 76.4 | #1 |
| 3D Hand Pose Estimation | FreiHAND | Mesh Graphormer | PA-F@15mm | 98.6 | #1 |
| 3D Human Pose Estimation | Human3.6M | Mesh Graphormer | Average MPJPE (mm) | 51.2 | #136 |
| 3D Human Pose Estimation | Human3.6M | Mesh Graphormer | PA-MPJPE | 34.5 | #7 |

