Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields

CVPR 2017 · Zhe Cao, Tomas Simon, Shih-En Wei, Yaser Sheikh

We present an approach to efficiently detect the 2D pose of multiple people in an image. The approach uses a nonparametric representation, which we refer to as Part Affinity Fields (PAFs), to learn to associate body parts with individuals in the image. The architecture encodes global context, allowing a greedy bottom-up parsing step that maintains high accuracy while achieving realtime performance, irrespective of the number of people in the image. The architecture is designed to jointly learn part locations and their association via two branches of the same sequential prediction process. Our method placed first in the inaugural COCO 2016 keypoints challenge, and significantly exceeds the previous state-of-the-art result on the MPII Multi-Person benchmark, both in performance and efficiency.
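The greedy association step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a single limb type, a PAF stored as an (H, W, 2) NumPy array of per-pixel (x, y) vectors, and candidate part locations that have already been detected. The names `paf_score` and `greedy_match`, the sample count, and the `min_score` threshold are all hypothetical choices made for illustration.

```python
import numpy as np

def paf_score(paf, a, b, num_samples=10):
    """Mean alignment of the field with the unit vector from a to b.

    paf: array of shape (H, W, 2) holding an (x, y) vector per pixel.
    a, b: (x, y) candidate locations for the two ends of one limb.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    d = b - a
    norm = np.linalg.norm(d)
    if norm == 0:
        return 0.0
    u = d / norm
    # Sample evenly spaced points on the segment a -> b (nearest pixel).
    ts = np.linspace(0.0, 1.0, num_samples)
    pts = a + ts[:, None] * d
    xs = np.clip(np.rint(pts[:, 0]).astype(int), 0, paf.shape[1] - 1)
    ys = np.clip(np.rint(pts[:, 1]).astype(int), 0, paf.shape[0] - 1)
    # Dot each sampled field vector with the limb direction, then average.
    return float(np.mean(paf[ys, xs] @ u))

def greedy_match(paf, starts, ends, min_score=0.05):
    """Pair start and end candidates greedily, best score first.

    min_score is an assumed threshold, not a value from the paper.
    """
    scored = [(paf_score(paf, s, e), i, j)
              for i, s in enumerate(starts)
              for j, e in enumerate(ends)]
    scored.sort(reverse=True)
    used_starts, used_ends, pairs = set(), set(), []
    for score, i, j in scored:
        if score < min_score:
            break
        if i in used_starts or j in used_ends:
            continue  # each candidate joins at most one limb
        pairs.append((i, j))
        used_starts.add(i)
        used_ends.add(j)
    return pairs

# Toy field: two horizontal limbs, each pointing in +x.
paf = np.zeros((20, 20, 2))
paf[5, 2:11] = (1.0, 0.0)
paf[12, 2:11] = (1.0, 0.0)
pairs = greedy_match(paf, starts=[(2, 5), (2, 12)], ends=[(10, 12), (10, 5)])
```

In this toy field each start candidate ends up paired with the end candidate on its own row, because segments that cross between rows pass mostly through zero-valued pixels and score far lower. Each limb type is matched independently and candidates are consumed in score order, so the cost of this step depends on the number of candidates rather than on running a separate detector per person.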

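Task                          Dataset         Model                 Metric Name    Value  Global Rank
----------------------------  --------------  --------------------  -------------  -----  -----------
Keypoint Detection            COCO            Part Affinity Fields  Validation AP  60.5   #14
Multi-Person Pose Estimation  COCO            CMU-Pose              AP             0.618  #13
Multi-Person Pose Estimation  COCO test-dev   CMU-Pose              AP             61.8   #13
Multi-Person Pose Estimation  COCO test-dev   CMU-Pose              APL            68.2   #9
Multi-Person Pose Estimation  COCO test-dev   CMU-Pose              APM            57.1   #10
Multi-Person Pose Estimation  COCO test-dev   CMU-Pose              AP50           84.9   #7
Multi-Person Pose Estimation  COCO test-dev   CMU-Pose              AP75           67.5   #8
Keypoint Detection            COCO test-dev   CMU Pose              APL            68.2   #16
Keypoint Detection            COCO test-dev   CMU Pose              APM            57.1   #15
Keypoint Detection            COCO test-dev   CMU Pose              AP50           84.9   #12
Keypoint Detection            COCO test-dev   CMU Pose              AP75           67.5   #12
Keypoint Detection            COCO test-dev   CMU Pose              AR             66.5   #11
Keypoint Detection            COCO test-dev   CMU Pose              AR50           87.2   #8
Keypoint Detection            COCO test-dev   CMU Pose              ARL            74.6   #7
Keypoint Detection            COCO test-dev   CMU Pose              ARM            60.6   #7
Pose Estimation               COCO test-dev   CMU-Pose              AP             61.8   #31
Pose Estimation               COCO test-dev   CMU-Pose              AP50           84.9   #27
Pose Estimation               COCO test-dev   CMU-Pose              AP75           67.5   #28
Pose Estimation               COCO test-dev   CMU-Pose              APL            68.2   #28
Pose Estimation               COCO test-dev   CMU-Pose              AR             66.5   #25
2D Human Pose Estimation      COCO-WholeBody  OpenPose              WB             33.8   #5
2D Human Pose Estimation      COCO-WholeBody  OpenPose              body           56.3   #5
2D Human Pose Estimation      COCO-WholeBody  OpenPose              foot           53.2   #3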