Learning Pose Grammar to Encode Human Body Configuration for 3D Pose Estimation

17 Oct 2017 · Hao-Shu Fang, Yuanlu Xu, Wenguan Wang, Xiaobai Liu, Song-Chun Zhu

In this paper, we propose a pose grammar to tackle the problem of 3D human pose estimation. Our model takes a 2D pose directly as input and learns a generalized 2D-3D mapping function. The proposed model consists of a base network that efficiently captures pose-aligned features and a hierarchy of Bi-directional RNNs (BRNNs) on top that explicitly incorporates knowledge of human body configuration (i.e., kinematics, symmetry, motor coordination). The model thus enforces high-level constraints over human poses. For learning, we develop a pose sample simulator that augments training samples in virtual camera views, which further improves the model's generalizability. We validate our method on public 3D human pose benchmarks and propose a new evaluation protocol in a cross-view setting to verify the generalization capability of different methods. We empirically observe that most state-of-the-art methods encounter difficulty in this setting, while our method handles such challenges well.
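To make the idea of augmenting samples in virtual camera views concrete, here is a minimal sketch, not the authors' simulator. It assumes a simple pinhole camera, a rotation about the vertical axis only, and hypothetical names and defaults (`rotate_and_project`, `focal`, `distance`) that do not come from the paper. Each call yields one synthetic 2D-3D training pair from an existing 3D pose.

```python
import math


def rotate_and_project(pose_3d, yaw_deg, focal=1000.0, distance=5000.0):
    """Rotate a 3D pose about the vertical (y) axis and project it to 2D.

    pose_3d:  list of (x, y, z) joints in mm, roughly centered at the origin.
    yaw_deg:  virtual camera yaw in degrees (assumed parameterization).
    focal:    pinhole focal length in pixels (assumed value).
    distance: camera-to-subject distance in mm (assumed value).
    Returns (pose_cam, pose_2d): joints in camera coordinates and their
    image-plane projections.
    """
    yaw = math.radians(yaw_deg)
    c, s = math.cos(yaw), math.sin(yaw)
    pose_cam, pose_2d = [], []
    for x, y, z in pose_3d:
        # Rotation about the y axis.
        xr = c * x + s * z
        zr = -s * x + c * z
        # Place the subject in front of the camera.
        depth = zr + distance
        if depth <= 0:
            raise ValueError("joint lies behind the virtual camera")
        pose_cam.append((xr, y, depth))
        # Perspective division of the pinhole model.
        pose_2d.append((focal * xr / depth, focal * y / depth))
    return pose_cam, pose_2d


def simulate_views(pose_3d, yaw_angles):
    """Build one (2D input, 3D target) pair per virtual view."""
    pairs = []
    for angle in yaw_angles:
        pose_cam, pose_2d = rotate_and_project(pose_3d, angle)
        pairs.append((pose_2d, pose_cam))
    return pairs
```

For example, `simulate_views(pose, [0, 45, 90])` would turn one captured pose into three training pairs seen from different directions. A fuller simulator would also vary elevation, distance, and intrinsics.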


Results from the Paper


Ranked #1 on 3D Absolute Human Pose Estimation on Human3.6M (Average MPJPE (mm) metric)

Task                               Dataset     Model         Metric Name                     Metric Value  Global Rank
3D Human Pose Estimation           Human3.6M   Pose Grammar  Average MPJPE (mm)              60.4          #246
3D Human Pose Estimation           Human3.6M   Pose Grammar  PA-MPJPE                        45.7          #81
3D Absolute Human Pose Estimation  Human3.6M   Pose Grammar  Average MPJPE (mm)              60.4          #1
3D Human Pose Estimation           HumanEva-I  Pose Grammar  Mean Reconstruction Error (mm)  22.9          #13
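
For readers unfamiliar with the metric names, MPJPE (mean per-joint position error) is the average Euclidean distance between predicted and ground-truth joints, reported here in millimetres. PA-MPJPE is the same quantity computed after a rigid Procrustes alignment of the prediction to the ground truth. The sketch below covers only plain MPJPE for a single pose; the function name `mpjpe` and the input layout are assumptions for illustration, not code from the paper or the benchmark.

```python
import math


def mpjpe(predicted, ground_truth):
    """Mean per-joint position error between two poses.

    predicted, ground_truth: equal-length lists of (x, y, z) joints in mm.
    Returns the average Euclidean distance over joints, in mm.
    """
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("poses must be non-empty and have the same number of joints")
    total = sum(math.dist(p, g) for p, g in zip(predicted, ground_truth))
    return total / len(predicted)
```

Benchmark scores such as those in the table average this per-pose value over all test frames, and protocols typically align poses at a root joint before measuring.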
