DeeperCut: A Deeper, Stronger, and Faster Multi-Person Pose Estimation Model

The goal of this paper is to advance the state of the art in articulated pose estimation in scenes with multiple people. To that end we contribute on three fronts. We propose (1) improved body part detectors that generate effective bottom-up proposals for body parts; (2) novel image-conditioned pairwise terms that allow the proposals to be assembled into a variable number of consistent body part configurations; and (3) an incremental optimization strategy that explores the search space more efficiently, leading to both better performance and significant speed-ups. Evaluation is done on two single-person and two multi-person pose estimation benchmarks. The proposed approach significantly outperforms the best known multi-person pose estimation results while demonstrating competitive performance on the task of single-person pose estimation. Models and code available at

Results from the Paper

Task | Dataset | Model | Metric Name | Metric Value | Global Rank
Pose Estimation | Leeds Sports Poses | ResNet-152 + intermediate supervision | PCK | 90.1% | #13
Keypoint Detection | MPII Multi-Person | DeeperCut | mAP@0.5 | 59.4% | #9
Multi-Person Pose Estimation | MPII Multi-Person | DeeperCut | AP | 59.4% | #9
Multi-Person Pose Estimation | WAF | DeeperCut | AOP | 88.10% | #1

Results from Other Papers

Task | Dataset | Model | Metric Name | Metric Value | Rank
Pose Estimation | MPII Human Pose | ResNet-152 + intermediate supervision | PCKh-0.5 | 88.52% | #30
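
For readers unfamiliar with the PCKh-0.5 metric reported above, the following is a minimal sketch of how such a score is commonly computed: a predicted keypoint counts as correct when it lies within 0.5 times the head segment length of the ground-truth location. The function name `pckh`, its parameters, and the toy coordinates are illustrative assumptions and are not taken from the paper or its released code.

```python
import math

def pckh(pred, gt, head_size, alpha=0.5):
    """Fraction of annotated keypoints predicted within alpha * head_size.

    pred, gt: sequences of (x, y) tuples for the same joints, in the same order.
    gt entries may be None for joints without annotation (these are skipped).
    head_size: head segment length of the person, in pixels (assumed given).
    """
    threshold = alpha * head_size
    correct = 0
    total = 0
    for p, g in zip(pred, gt):
        if g is None:
            continue  # unannotated joint: not counted
        total += 1
        distance = math.hypot(p[0] - g[0], p[1] - g[1])
        if distance <= threshold:
            correct += 1
    return correct / total if total else float("nan")

# Toy example (hypothetical coordinates): two of three joints fall
# within 0.5 * 4 = 2 pixels of the ground truth.
pred = [(10, 10), (20, 21), (40, 40)]
gt = [(10, 11), (20, 20), (30, 30)]
score = pckh(pred, gt, head_size=4)
```

A benchmark score such as the one in the table would be an aggregate of this per-joint test over all annotated joints in the test set; the exact aggregation and head-size normalization used by a given benchmark may differ from this sketch.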