I2L-MeshNet: Image-to-Lixel Prediction Network for Accurate 3D Human Pose and Mesh Estimation from a Single RGB Image

ECCV 2020 · Gyeongsik Moon, Kyoung Mu Lee

Most of the previous image-based 3D human pose and mesh estimation methods estimate parameters of the human mesh model from an input image. However, directly regressing the parameters from the input image is a highly non-linear mapping because it breaks the spatial relationship between pixels in the input image. In addition, it cannot model the prediction uncertainty, which can make training harder. To resolve the above issues, we propose I2L-MeshNet, an image-to-lixel (line+pixel) prediction network. The proposed I2L-MeshNet predicts the per-lixel likelihood on 1D heatmaps for each mesh vertex coordinate instead of directly regressing the parameters. Our lixel-based 1D heatmap preserves the spatial relationship in the input image and models the prediction uncertainty. We demonstrate the benefit of the image-to-lixel prediction and show that the proposed I2L-MeshNet outperforms previous methods. The code is publicly available at https://github.com/mks0601/I2L-MeshNet_RELEASE.
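To make the idea of a per-lixel likelihood on a 1D heatmap concrete, here is a minimal sketch of how one vertex coordinate could be read off three 1D heatmaps. It assumes a soft-argmax readout (the expected lixel index under a softmax over the heatmap scores), which is a common choice for heatmap-based estimators; the abstract does not state the readout, and the function names (`soft_argmax_1d`, `spread_1d`, `vertex_from_heatmaps`) are hypothetical and not taken from the released code.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a 1D list of raw heatmap scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_argmax_1d(scores):
    """Expected lixel index under the per-lixel likelihood (assumed readout)."""
    probs = softmax(scores)
    return sum(i * p for i, p in enumerate(probs))


def spread_1d(scores):
    """Standard deviation of the likelihood, a simple proxy for uncertainty."""
    probs = softmax(scores)
    mean = sum(i * p for i, p in enumerate(probs))
    var = sum(((i - mean) ** 2) * p for i, p in enumerate(probs))
    return math.sqrt(var)


def vertex_from_heatmaps(hm_x, hm_y, hm_z):
    """Combine three independent 1D heatmaps into one (x, y, z) coordinate."""
    return tuple(soft_argmax_1d(hm) for hm in (hm_x, hm_y, hm_z))


# Hypothetical scores for a single mesh vertex: one 1D heatmap per axis.
peaked = [0.0, 1.0, 5.0, 1.0, 0.0]   # confident prediction around lixel 2
flat = [0.0] * 8                     # no evidence, maximally uncertain
x, y, z = vertex_from_heatmaps(peaked, flat, peaked)
```

In this sketch the coordinate stays tied to a position along the heatmap axis, and a flat heatmap yields a larger `spread_1d` than a peaked one, which illustrates how a likelihood over lixels can carry uncertainty that a directly regressed parameter cannot.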

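Task                      Dataset    Model        Metric Name         Metric Value  Global Rank
------------------------  ---------  -----------  ------------------  ------------  -----------
3D Human Pose Estimation  3DPW       I2L-MeshNet  PA-MPJPE            58.6          # 90
3D Human Pose Estimation  3DPW       I2L-MeshNet  MPJPE               93.2          # 94
3D Human Pose Estimation  3DPW       I2L-MeshNet  MPVPE               110.1         # 67
3D Human Pose Estimation  3DPW       I2L-MeshNet  Acceleration Error  30.9          # 21
3D Hand Pose Estimation   FreiHAND   I2L-MeshNet  PA-MPVPE            7.6           # 4
3D Hand Pose Estimation   FreiHAND   I2L-MeshNet  PA-MPJPE            7.4           # 5
3D Hand Pose Estimation   FreiHAND   I2L-MeshNet  PA-F@5mm            68.1          # 4
3D Hand Pose Estimation   FreiHAND   I2L-MeshNet  PA-F@15mm           97.3          # 5
3D Hand Pose Estimation   HO-3D      I2L-MeshNet  Average MPJPE (mm)  26.8          # 7
3D Hand Pose Estimation   HO-3D      I2L-MeshNet  ST-MPJPE (mm)       26.0          # 10
3D Hand Pose Estimation   HO-3D      I2L-MeshNet  PA-MPJPE (mm)       11.2          # 11
3D Human Pose Estimation  Human3.6M  I2L-MeshNet  Average MPJPE (mm)  55.7          # 221
3D Human Pose Estimation  Human3.6M  I2L-MeshNet  PA-MPJPE            41.7          # 67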
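
For readers unfamiliar with the metric names above, the following is a minimal sketch of MPJPE (mean per-joint position error): the average Euclidean distance between predicted and ground-truth 3D joints, typically measured after aligning both skeletons at a root joint. The helper names (`root_align`, `mpjpe`) and the sample joints are illustrative assumptions, not the evaluation code behind the numbers in the table. PA-MPJPE additionally applies a Procrustes (rotation, translation, scale) alignment before measuring the error, which is not implemented here.

```python
import math


def root_align(joints, root=0):
    """Translate a skeleton so that the chosen root joint sits at the origin."""
    rx, ry, rz = joints[root]
    return [(x - rx, y - ry, z - rz) for x, y, z in joints]


def mpjpe(pred, gt):
    """Mean Euclidean distance between corresponding joints, in input units."""
    if not pred or len(pred) != len(gt):
        raise ValueError("pred and gt must be non-empty and the same length")
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)


# Hypothetical two-joint skeletons with coordinates in millimetres.
gt = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
pred = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
error_mm = mpjpe(root_align(pred), root_align(gt))  # (0 + 5) / 2 = 2.5
```

Root alignment removes a global translation offset, so a prediction that is correct up to a constant shift scores zero under this sketch.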