Learning Delicate Local Representations for Multi-Person Pose Estimation

In this paper, we propose a novel method called Residual Steps Network (RSN). RSN efficiently aggregates features with the same spatial size (intra-level features) to obtain delicate local representations, which retain rich low-level spatial information and yield precise keypoint localization. Additionally, we observe that the output features contribute differently to final performance. To address this, we propose an efficient attention mechanism, the Pose Refine Machine (PRM), which makes a trade-off between local and global representations in the output features and further refines the keypoint locations. Our approach won first place in the COCO Keypoint Challenge 2019 and achieves state-of-the-art results on both the COCO and MPII benchmarks without using extra training data or pretrained models. Our single model achieves 78.6 AP on COCO test-dev and 93.0 PCKh@0.5 on the MPII test set. Ensembled models achieve 79.2 AP on COCO test-dev and 77.1 AP on the COCO test-challenge set. The source code is publicly available for further research at https://github.com/caiyuanhao1998/RSN/
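The abstract does not spell out how PRM works internally, so the following is only a generic illustration of the idea that output feature channels can be re-weighted by a learned gate. It is a squeeze-and-excitation-style channel attention in NumPy, not the paper's PRM; the function name `channel_attention`, the weight matrices `w1` and `w2`, and the reduction layout are all assumptions made for this sketch.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, applied element-wise."""
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(features, w1, w2):
    """Re-weight the channels of a feature map with a learned gate.

    features: array of shape (C, H, W)
    w1:       array of shape (C_hidden, C), hypothetical learned weights
    w2:       array of shape (C, C_hidden), hypothetical learned weights
    Returns the re-weighted features and the per-channel gate in (0, 1).
    """
    features = np.asarray(features, dtype=float)
    # Squeeze: summarize each channel by its global average.
    squeezed = features.mean(axis=(1, 2))            # shape (C,)
    # Excite: small bottleneck with ReLU, then a sigmoid gate per channel.
    hidden = np.maximum(w1 @ squeezed, 0.0)          # shape (C_hidden,)
    gate = sigmoid(w2 @ hidden)                      # shape (C,)
    # Scale every channel by its gate value.
    return features * gate[:, None, None], gate
```

With all-zero weights the gate is 0.5 for every channel, so the output is simply the input halved; trained weights would instead emphasize some channels over others.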

ECCV 2020

Results from the Paper


| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Keypoint Detection | COCO | 4xRSN-50(384×288) | Test AP | 78.6 | #1 |
| Multi-Person Pose Estimation | COCO | RSN | AP | 0.792 | #2 |
| Keypoint Detection | COCO test-challenge | 4×RSN-50 | AR | 82.6 | #1 |
| | | | ARM | 78.0 | #1 |
| | | | AP | 77.1 | #1 |
| | | | AP50 | 93.3 | #1 |
| | | | AP75 | 83.6 | #1 |
| | | | APL | 82.6 | #4 |
| | | | AR50 | 96.1 | #1 |
| | | | AR75 | 88.2 | #1 |
| | | | ARL | 88.7 | #1 |
| Pose Estimation | COCO test-dev | 4xRSN-50 (ensemble) | AP | 79.2 | #4 |
| | | | AP50 | 94.4 | #3 |
| | | | AP75 | 87.1 | #3 |
| | | | APL | 76.1 | #23 |
| | | | APM | 83.8 | #1 |
| | | | AR | 84.1 | #4 |
| Pose Estimation | COCO test-dev | 4xRSN-50 | AP | 78.6 | #7 |
| | | | AP50 | 94.3 | #4 |
| | | | AP75 | 86.6 | #4 |
| | | | APL | 75.5 | #26 |
| | | | APM | 83.3 | #2 |
| | | | AR | 83.8 | #5 |
| Pose Estimation | MPII Human Pose | 4xRSN-50 | PCKh-0.5 | 93.0 | #5 |
| Pose Estimation | MPII Single Person | 4xRSN-50 | PCKh@0.5 | 93 | #1 |
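
The results above rest on two similarity measures. COCO AP is computed from Object Keypoint Similarity (OKS), with AP50 and AP75 being AP at OKS thresholds of 0.50 and 0.75. PCKh@0.5 counts a joint as correct when it lies within half the head size of the ground truth. The following is a minimal per-pose sketch of both, not the official evaluation code. The function names, the argument layout, and the choice to pass `head_size` in directly are assumptions made for illustration, and the per-keypoint `sigmas` must be supplied by the caller.

```python
import numpy as np


def oks(pred, gt, visible, area, sigmas):
    """Object Keypoint Similarity between one predicted and one ground-truth pose.

    pred, gt: (K, 2) arrays of (x, y) coordinates
    visible:  (K,) boolean mask of labeled ground-truth keypoints
    area:     ground-truth object area in pixels squared
    sigmas:   (K,) per-keypoint falloff constants (caller-supplied)
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    visible = np.asarray(visible, dtype=bool)
    sigmas = np.asarray(sigmas, dtype=float)
    if not visible.any():
        return 0.0
    d2 = np.sum((pred - gt) ** 2, axis=1)          # squared pixel distances
    k2 = (2.0 * sigmas) ** 2                       # per-keypoint variance term
    similarity = np.exp(-d2 / (2.0 * area * k2))   # value in (0, 1] per keypoint
    return float(similarity[visible].mean())       # average over labeled keypoints


def pckh(pred, gt, visible, head_size, alpha=0.5):
    """Fraction of labeled joints within alpha * head_size of the ground truth."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    visible = np.asarray(visible, dtype=bool)
    if not visible.any():
        return 0.0
    dist = np.linalg.norm(pred - gt, axis=1)       # pixel distance per joint
    return float((dist[visible] <= alpha * head_size).mean())
```

A perfect prediction gives an OKS of 1.0, and the score decays smoothly as keypoints move away from the ground truth relative to object size. Reported benchmark numbers additionally involve matching predictions to people and averaging over a dataset, which this sketch leaves out.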
