Rethinking on Multi-Stage Networks for Human Pose Estimation

Existing pose estimation approaches fall into two categories: single-stage and multi-stage methods. While multi-stage methods seem better suited to the task, in current practice they perform worse than single-stage methods. This work studies this issue. We argue that the unsatisfactory performance of current multi-stage methods stems from insufficient design choices. We propose several improvements, including the single-stage module design, cross-stage feature aggregation, and coarse-to-fine supervision. The resulting method establishes a new state of the art on both the MS COCO and MPII Human Pose datasets, justifying the effectiveness of a multi-stage architecture. The source code is publicly available for further research.
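
The abstract names coarse-to-fine supervision without describing it. As a rough illustration only, the sketch below assumes it means training earlier stages against wider Gaussian heatmap targets and later stages against narrower ones. The function names, the number of stages, and the sigma values are hypothetical and are not taken from the paper or its released code.

```python
import math


def gaussian_heatmap(width, height, cx, cy, sigma):
    """Return a height x width grid holding an unnormalized Gaussian peaked at (cx, cy)."""
    return [
        [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
         for x in range(width)]
        for y in range(height)
    ]


def coarse_to_fine_targets(width, height, cx, cy, sigmas):
    """Build one target heatmap per stage, one sigma per stage (coarse first, fine last)."""
    return [gaussian_heatmap(width, height, cx, cy, s) for s in sigmas]


# Hypothetical per-stage kernel widths (assumption, illustrative values only).
STAGE_SIGMAS = [4.0, 3.0, 2.0, 1.0]

# One keypoint at (8, 8) on a 16 x 16 grid, one target per stage.
targets = coarse_to_fine_targets(16, 16, 8, 8, STAGE_SIGMAS)
```

Under this assumption every stage shares the same peak location, but the target for the last stage falls off much faster around the keypoint, so later stages are penalized for smaller localization errors.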

Results from the Paper


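Task                Dataset              Model          Metric    Value  Global Rank
Keypoint Detection  COCO                 MSPN(384x288)  Test AP   76.1   #6
Pose Estimation     COCO minival         MSPN           AP        75.9   #1
Keypoint Detection  COCO test-challenge  MSPN+*         AR        82.2   #2
Keypoint Detection  COCO test-challenge  MSPN+*         ARM       77.5   #2
Keypoint Detection  COCO test-challenge  MSPN+*         AP        76.4   #2
Keypoint Detection  COCO test-challenge  MSPN+*         AP50      92.9   #2
Keypoint Detection  COCO test-challenge  MSPN+*         AP75      82.6   #2
Keypoint Detection  COCO test-challenge  MSPN+*         APL       88.6   #1
Keypoint Detection  COCO test-challenge  MSPN+*         AR50      96     #2
Keypoint Detection  COCO test-challenge  MSPN+*         AR75      87.7   #2
Keypoint Detection  COCO test-challenge  MSPN+*         ARL       83.2   #2
Pose Estimation     COCO test-dev        MSPN           AP        76.1   #14
Pose Estimation     COCO test-dev        MSPN           AP50      93.4   #8
Pose Estimation     COCO test-dev        MSPN           AP75      83.8   #12
Pose Estimation     COCO test-dev        MSPN           APL       81.5   #11
Pose Estimation     COCO test-dev        MSPN           APM       72.3   #15
Pose Estimation     COCO test-dev        MSPN           AR        81.6   #11
Keypoint Detection  COCO test-dev        MSPN           APL       81.5   #3
Keypoint Detection  COCO test-dev        MSPN           APM       72.3   #3
Keypoint Detection  COCO test-dev        MSPN           AP50      93.4   #1
Keypoint Detection  COCO test-dev        MSPN           AP75      83.8   #3
Keypoint Detection  COCO test-dev        MSPN           AR        81.6   #2
Keypoint Detection  COCO test-dev        MSPN           AR50      96.3   #1
Keypoint Detection  COCO test-dev        MSPN           AR75      88.1   #2
Keypoint Detection  COCO test-dev        MSPN           ARL       87.1   #2
Keypoint Detection  COCO test-dev        MSPN           ARM       77.5   #1
Keypoint Detection  COCO test-dev        MSPN           AP        76.1   #1
Pose Estimation     MPII Human Pose      MSPN           PCKh-0.5  92.6%  #6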
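
For readers unfamiliar with the PCKh-0.5 metric reported for MPII above, the following is a minimal sketch of how such a score is commonly computed: a predicted keypoint counts as correct when it lies within 0.5 times the head size of the ground-truth location. The function name, the single-person setup, and the sample coordinates are illustrative assumptions; this is not the official MPII evaluation code, which also handles per-joint visibility and averages over a full dataset.

```python
import math


def pckh(pred, gt, head_size, alpha=0.5):
    """Fraction of keypoints predicted within alpha * head_size of the ground truth."""
    threshold = alpha * head_size
    hits = sum(
        1
        for (px, py), (gx, gy) in zip(pred, gt)
        if math.hypot(px - gx, py - gy) <= threshold
    )
    return hits / len(gt)


# Made-up coordinates for one person with four keypoints (illustrative only).
gt = [(10.0, 10.0), (20.0, 20.0), (30.0, 30.0), (40.0, 40.0)]
pred = [(11.0, 10.0), (20.0, 26.0), (30.0, 30.0), (49.0, 40.0)]

# With head_size=10 the threshold is 5 pixels; the errors are 1, 6, 0 and 9 pixels.
score = pckh(pred, gt, head_size=10.0)
```

Here two of the four predictions fall within the threshold, so `score` is 0.5. Normalizing by head size makes the metric comparable across people who appear at different scales in the image.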
