Deep High-Resolution Representation Learning for Human Pose Estimation

CVPR 2019 · Ke Sun, Bin Xiao, Dong Liu, Jingdong Wang

This is an official PyTorch implementation of Deep High-Resolution Representation Learning for Human Pose Estimation. In this work, we are interested in the human pose estimation problem with a focus on learning reliable high-resolution representations. Most existing methods recover high-resolution representations from low-resolution representations produced by a high-to-low resolution network. Instead, our proposed network maintains high-resolution representations through the whole process. We start from a high-resolution subnetwork as the first stage, gradually add high-to-low resolution subnetworks one by one to form more stages, and connect the multi-resolution subnetworks in parallel. We conduct repeated multi-scale fusions such that each of the high-to-low resolution representations repeatedly receives information from the other parallel representations, leading to rich high-resolution representations. As a result, the predicted keypoint heatmap is potentially more accurate and spatially more precise. We empirically demonstrate the effectiveness of our network through superior pose estimation results on two benchmark datasets: the COCO keypoint detection dataset and the MPII Human Pose dataset. The code and models are publicly available at https://github.com/leoxiaobin/deep-high-resolution-net.pytorch.
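The multi-scale fusion described in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the released implementation: the function names (`downsample`, `upsample`, `resample`, `fuse`) are hypothetical, feature maps are single-channel nested lists, and fixed average pooling and nearest-neighbour upsampling stand in for the learned convolutions a real network would use. It only shows the data flow, in which every parallel branch receives the sum of all branches resampled to its own resolution.

```python
# Illustrative sketch only: fixed resampling stands in for learned
# convolutions, and all function names are hypothetical.

def downsample(fmap, factor):
    """Average-pool a 2D map (list of rows) by an integer factor."""
    h, w = len(fmap), len(fmap[0])
    return [
        [
            sum(fmap[y * factor + dy][x * factor + dx]
                for dy in range(factor) for dx in range(factor)) / factor ** 2
            for x in range(w // factor)
        ]
        for y in range(h // factor)
    ]


def upsample(fmap, factor):
    """Nearest-neighbour upsample a 2D map by an integer factor."""
    return [
        [value for value in row for _ in range(factor)]
        for row in fmap
        for _ in range(factor)
    ]


def resample(fmap, src_level, dst_level):
    """Bring a map from resolution level src_level to dst_level.

    Level k has 1 / 2**k of the resolution of level 0.
    """
    if dst_level > src_level:
        return downsample(fmap, 2 ** (dst_level - src_level))
    if dst_level < src_level:
        return upsample(fmap, 2 ** (src_level - dst_level))
    return [row[:] for row in fmap]


def add_maps(a, b):
    """Element-wise sum of two maps with identical shapes."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def fuse(branches):
    """One fusion step: each branch becomes the sum of all branches,
    each resampled to that branch's resolution."""
    fused = []
    for dst_level in range(len(branches)):
        total = None
        for src_level, fmap in enumerate(branches):
            contrib = resample(fmap, src_level, dst_level)
            total = contrib if total is None else add_maps(total, contrib)
        fused.append(total)
    return fused


# Three parallel branches at 8x8, 4x4 and 2x2, filled with constants.
branches = [
    [[1.0] * 8 for _ in range(8)],
    [[2.0] * 4 for _ in range(4)],
    [[3.0] * 2 for _ in range(2)],
]
fused = fuse(branches)
shapes = [(len(f), len(f[0])) for f in fused]
```

After one fusion step each branch keeps its own resolution, so the highest-resolution branch is never discarded and rebuilt. That is the property the abstract contrasts with networks that recover high resolution from a low-resolution bottleneck.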


Results from the Paper


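| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Pose Estimation | AIC | HRNet (HRNet-w48) | AP | 33.5 | #5 |
| Pose Estimation | AIC | HRNet (HRNet-w48) | AP50 | 78.0 | #2 |
| Pose Estimation | AIC | HRNet (HRNet-w48) | AP75 | 23.6 | #2 |
| Pose Estimation | AIC | HRNet (HRNet-w48) | AR | 37.9 | #2 |
| Pose Estimation | AIC | HRNet (HRNet-w48) | AR50 | 80.0 | #2 |
| Pose Estimation | BRACE | HRNet pre-trained on COCO | Average Precision | 0.158 | #2 |
| Pose Estimation | BRACE | HRNet pre-trained on COCO | Average Recall | 0.202 | #2 |
| Pose Estimation | BRACE | HRNet fine-tuned on BRACE | Average Precision | 0.357 | #1 |
| Pose Estimation | BRACE | HRNet fine-tuned on BRACE | Average Recall | 0.445 | #1 |
| Instance Segmentation | COCO minival | HTC (HRNetV2p-W48) | mask AP | 41.0 | #68 |
| Keypoint Detection | COCO test-dev | HRNet | APL | 81.5 | #3 |
| Keypoint Detection | COCO test-dev | HRNet | APM | 71.9 | #4 |
| Keypoint Detection | COCO test-dev | HRNet | AP50 | 92.5 | #3 |
| Keypoint Detection | COCO test-dev | HRNet | AP75 | 83.3 | #4 |
| Keypoint Detection | COCO test-dev | HRNet | AR | 80.5 | #4 |
| Keypoint Detection | COCO test-dev | HRNet* | APL | 83.1 | #1 |
| Keypoint Detection | COCO test-dev | HRNet* | APM | 73.4 | #1 |
| Keypoint Detection | COCO test-dev | HRNet* | AP50 | 92.7 | #2 |
| Keypoint Detection | COCO test-dev | HRNet* | AP75 | 84.5 | #1 |
| Keypoint Detection | COCO test-dev | HRNet* | AR | 82.0 | #1 |
| Pose Estimation | COCO test-dev | HRNet-W48 + extra data | AP | 77 | #12 |
| Pose Estimation | COCO test-dev | HRNet-W48 + extra data | AP50 | 92.7 | #9 |
| Pose Estimation | COCO test-dev | HRNet-W48 + extra data | AP75 | 84.5 | #11 |
| Pose Estimation | COCO test-dev | HRNet-W48 + extra data | APL | 83.1 | #8 |
| Pose Estimation | COCO test-dev | HRNet-W48 + extra data | APM | 73.4 | #13 |
| Pose Estimation | COCO test-dev | HRNet-W48 + extra data | AR | 82 | #9 |
| Pose Estimation | COCO val2017 | HRNet (256x192) | AP | 75.3 | #7 |
| Pose Estimation | COCO val2017 | HRNet (256x192) | AP50 | - | #8 |
| Pose Estimation | COCO val2017 | HRNet (256x192) | AP75 | - | #8 |
| Pose Estimation | COCO val2017 | HRNet (256x192) | AR | - | #8 |
| 2D Human Pose Estimation | COCO-WholeBody | HRNet | WB | 43.2 | #12 |
| 2D Human Pose Estimation | COCO-WholeBody | HRNet | body | 65.9 | #12 |
| 2D Human Pose Estimation | COCO-WholeBody | HRNet | foot | 31.4 | #14 |
| 2D Human Pose Estimation | COCO-WholeBody | HRNet | face | 52.3 | #13 |
| 2D Human Pose Estimation | COCO-WholeBody | HRNet | hand | 30.0 | #14 |
| 3D Pose Estimation | HARPER | HRNet + Depth | Average MPJPE (mm) | 151 | #1 |
| 2D Pose Estimation | HARPER | HRNet | PCK | 86.8 | #1 |
| 2D Human Pose Estimation | Human-Art | HRNet-w48 | AP | 0.417 | #5 |
| 2D Human Pose Estimation | Human-Art | HRNet-w48 | AP (gt bbox) | 0.769 | #3 |
| 2D Human Pose Estimation | Human-Art | HRNet-w32 | AP | 0.399 | #7 |
| 2D Human Pose Estimation | Human-Art | HRNet-w32 | AP (gt bbox) | 0.754 | #5 |
| Pose Estimation | MPII Human Pose | HRNet-W32 | PCKh-0.5 | 92.3 | #11 |
| Keypoint Detection | MS COCO | HRNet-48 (384x288) | Validation AP | 76.3 | #10 |
| Keypoint Detection | MS COCO | HRNet-48 (384x288) | Test AP | 75.5 | #9 |
| Keypoint Detection | MS COCO | HRNet-32 | Validation AP | 75.8 | #12 |
| Pose Tracking | PoseTrack2017 | HRNet-W48 COCO | MOTA | 57.93 | #4 |
| Pose Tracking | PoseTrack2017 | HRNet-W48 COCO | mAP | 74.95 | #1 |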

Results from Other Papers


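| Task | Dataset | Model | Metric Name | Metric Value | Rank |
|---|---|---|---|---|---|
| Pose Estimation | AIC | HRNet (HRNet-w32) | AP | 32.3 | #6 |
| Pose Estimation | AIC | HRNet (HRNet-w32) | AP50 | 76.2 | #3 |
| Pose Estimation | AIC | HRNet (HRNet-w32) | AP75 | 21.9 | #3 |
| Pose Estimation | AIC | HRNet (HRNet-w32) | AR | 36.6 | #3 |
| Pose Estimation | AIC | HRNet (HRNet-w32) | AR50 | 78.9 | #3 |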

Methods