109 papers with code • 7 benchmarks • 8 datasets
Keypoint detection involves simultaneously detecting people and localizing their keypoints. Keypoints are the same thing as interest points. They are spatial locations, or points in the image that define what is interesting or what stand out in the image. They are invariant to image rotation, shrinkage, translation, distortion, and so on.
( Image credit: PifPaf: Composite Fields for Human Pose Estimation; "Learning to surf" by fotologic, license: CC-BY-2.0 )
We start from a high-resolution subnetwork as the first stage, gradually add high-to-low resolution subnetworks one by one to form more stages, and connect the mutli-resolution subnetworks in parallel.
The goal of this paper is to advance the state-of-the-art of articulated pose estimation in scenes with multiple people.