Convolutional Pose Machines

Pose Machines provide a sequential prediction framework for learning rich implicit spatial models. In this work we show a systematic design for how convolutional networks can be incorporated into the pose machine framework for learning image features and image-dependent spatial models for the task of pose estimation. The contribution of this paper is to implicitly model long-range dependencies between variables in structured prediction tasks such as articulated pose estimation. We achieve this by designing a sequential architecture composed of convolutional networks that directly operate on belief maps from previous stages, producing increasingly refined estimates for part locations, without the need for explicit graphical model-style inference. Our approach addresses the characteristic difficulty of vanishing gradients during training by providing a natural learning objective function that enforces intermediate supervision, thereby replenishing back-propagated gradients and conditioning the learning procedure. We demonstrate state-of-the-art performance and outperform competing methods on standard benchmarks including the MPII, LSP, and FLIC datasets.
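The sequential refinement and intermediate supervision described in the abstract can be illustrated with a small sketch. It is not the authors' network. It assumes that each stage receives the image features and the previous stage's belief map, that the target is a Gaussian peak at the ground-truth part location, and that the objective is a squared-error loss summed over all stages. The names `gaussian_belief_map`, `toy_stage`, `run_stages`, and `intermediate_supervision_loss` are hypothetical, and `toy_stage` is a simple blend standing in for a convolutional stage.

```python
import math


def gaussian_belief_map(height, width, cx, cy, sigma=1.0):
    """Assumed target: a Gaussian peak centred on the part location (cx, cy)."""
    return [
        [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
         for x in range(width)]
        for y in range(height)
    ]


def l2_loss(pred, target):
    """Sum of squared differences between two belief maps."""
    return sum(
        (p - t) ** 2
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
    )


def toy_stage(features, prev_beliefs):
    """Stand-in for a convolutional stage: blends previous beliefs with features."""
    return [
        [0.5 * b + 0.5 * f for b, f in zip(belief_row, feature_row)]
        for belief_row, feature_row in zip(prev_beliefs, features)
    ]


def run_stages(stages, features):
    """Apply stages in sequence; each one refines the previous belief map."""
    height, width = len(features), len(features[0])
    beliefs = [[0.0] * width for _ in range(height)]  # no prior belief at the start
    outputs = []
    for stage in stages:
        beliefs = stage(features, beliefs)
        outputs.append(beliefs)
    return outputs


def intermediate_supervision_loss(outputs, target):
    """Every stage output is penalised, not only the final one."""
    return sum(l2_loss(beliefs, target) for beliefs in outputs)


target = gaussian_belief_map(5, 5, cx=2, cy=2)
features = target  # toy assumption: the features already encode the part location
outputs = run_stages([toy_stage] * 3, features)
per_stage = [l2_loss(beliefs, target) for beliefs in outputs]
total = intermediate_supervision_loss(outputs, target)
```

Because every stage contributes its own term to `total`, each stage receives a direct training signal rather than relying only on gradients propagated back from the last stage, which is the intuition behind the abstract's remark on vanishing gradients.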

Venue: CVPR 2016

Results from the Paper


Task | Dataset | Model | Metric Name | Metric Value | Global Rank
Car Pose Estimation | ApolloCar3D | CPM | Detection Rate | 75.4 | #2
Pose Estimation | FLIC Elbows | Convolutional Pose Machines | PCK@0.2 | 97.59% | #2
Pose Estimation | FLIC Wrists | Convolutional Pose Machines | PCK@0.2 | 95.03% | #2

Results from Other Papers


Task | Dataset | Model | Metric Name | Metric Value | Rank
Pose Estimation | J-HMDB | CPM | Mean PCK@0.2 | 91.9 | #3
Pose Estimation | Leeds Sports Poses | Convolutional Pose Machines | PCK | 90.5% | #11
Pose Estimation | MPII Human Pose | Convolutional Pose Machines | PCKh-0.5 | 88.52% | #29
3D Human Pose Estimation | Total Capture | Tri-CPM | Average MPJPE (mm) | 99 | #11
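For readers unfamiliar with the metric names in these tables, the following is a minimal sketch of how PCK-style and MPJPE-style scores are commonly computed. It assumes PCK counts a keypoint as correct when its distance to the ground truth is at most `alpha` times a reference length (the choice of reference length, such as torso or head size, varies by benchmark and is left to the caller), and that MPJPE is the mean Euclidean distance over joints. The function names `pck` and `mpjpe` are hypothetical, and the exact evaluation protocol of each benchmark may differ.

```python
import math


def pck(predicted, ground_truth, reference_length, alpha=0.2):
    """Fraction of keypoints within alpha * reference_length of the ground truth."""
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("keypoint lists must be non-empty and of equal length")
    threshold = alpha * reference_length
    correct = sum(
        1 for p, g in zip(predicted, ground_truth) if math.dist(p, g) <= threshold
    )
    return correct / len(predicted)


def mpjpe(predicted, ground_truth):
    """Mean per-joint position error: average Euclidean distance over joints."""
    if not predicted or len(predicted) != len(ground_truth):
        raise ValueError("joint lists must be non-empty and of equal length")
    return sum(math.dist(p, g) for p, g in zip(predicted, ground_truth)) / len(predicted)


# Two 2D keypoints with threshold 0.2 * 10 = 2 pixels: the first is within
# the threshold (distance 1), the second is not (distance 10).
score = pck([(0, 0), (10, 10)], [(1, 0), (20, 10)], reference_length=10)

# One 3D joint that is 5 units away from its ground-truth position.
error = mpjpe([(0, 0, 0)], [(3, 4, 0)])
```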
