Keypoint Detection
150 papers with code • 7 benchmarks • 11 datasets
Keypoint Detection involves simultaneously detecting people and localizing their keypoints. Keypoints, also called interest points, are spatial locations in an image that mark what is interesting or what stands out. Ideally they are stable under image rotation, scaling, translation, and distortion, so the same point can be found again after the image has been transformed.
(Image credit: PifPaf: Composite Fields for Human Pose Estimation; "Learning to surf" by fotologic, license: CC-BY-2.0)
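To make the idea of a point that "stands out" concrete, here is a minimal sketch that scores pixels with the classical Harris corner response. This detector is swapped in purely for illustration and is not the method of any paper listed on this page. The function names, the 3x3 window, and the constant k = 0.05 are assumptions chosen for the example. Note that the Harris response is stable under rotation and translation but not under changes of scale.

```python
def harris_response(img, k=0.05):
    """Harris corner response for a 2D list of floats (illustrative sketch)."""
    h, w = len(img), len(img[0])
    # Central-difference gradients; border pixels are left at zero.
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0

    resp = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Accumulate the structure tensor over a 3x3 window.
            sxx = sxy = syy = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    gx, gy = ix[y + dy][x + dx], iy[y + dy][x + dx]
                    sxx += gx * gx
                    sxy += gx * gy
                    syy += gy * gy
            det = sxx * syy - sxy * sxy
            trace = sxx + syy
            # Positive at corners, negative along edges, zero in flat areas.
            resp[y][x] = det - k * trace * trace
    return resp


def strongest_keypoint(resp):
    """Return (row, col) of the pixel with the highest response."""
    _, y, x = max((v, y, x) for y, row in enumerate(resp) for x, v in enumerate(row))
    return y, x


# Synthetic 8x8 image: a bright square filling the lower-right quadrant.
image = [[1.0 if (y >= 4 and x >= 4) else 0.0 for x in range(8)] for y in range(8)]
response = harris_response(image)
print(strongest_keypoint(response))
```

On this synthetic image the highest response falls on the corner of the bright square, while pixels along its straight edges receive a negative score and flat regions score zero.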
Most implemented papers
Key.Net: Keypoint Detection by Handcrafted and Learned CNN Filters
We introduce a novel approach for the keypoint detection task that combines handcrafted and learned CNN filters within a shallow multi-scale architecture.
GLAMpoints: Greedily Learned Accurate Match points
We introduce GLAMpoints, a novel CNN-based feature point detector learned in a semi-supervised manner.
PVN3D: A Deep Point-wise 3D Keypoints Voting Network for 6DoF Pose Estimation
Our method is a natural extension of 2D-keypoint approaches that successfully work on RGB based 6DoF estimation.
Improving Convolutional Networks With Self-Calibrated Convolutions
Recent advances in CNNs are mostly devoted to designing more complex architectures to enhance their representation learning capacity.
HoughNet: Integrating near and long-range evidence for visual detection
This paper presents HoughNet, a one-stage, anchor-free, voting-based, bottom-up object detection method.
RegionViT: Regional-to-Local Attention for Vision Transformers
The regional-to-local attention includes two steps: first, the regional self-attention extracts global information among all regional tokens; then, the local self-attention exchanges information between one regional token and its associated local tokens.
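The two steps described above can be sketched as follows. This is a simplified illustration and not the RegionViT implementation: it assumes a single attention head, identity query/key/value projections, and no residual connections or normalization. The function names are hypothetical.

```python
import math


def self_attention(tokens):
    """Single-head self-attention with identity Q/K/V projections (a simplification)."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        # Scaled dot-product scores of this query against every token.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        m = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append([sum(w / total * v[i] for w, v in zip(weights, tokens))
                    for i in range(d)])
    return out


def regional_to_local(regional, local):
    """regional: R token vectors; local: R lists of token vectors, one list per region."""
    # Step 1: regional self-attention among all regional tokens.
    regional = self_attention(regional)
    # Step 2: local self-attention over one regional token plus its local tokens.
    new_regional, new_local = [], []
    for r, locs in zip(regional, local):
        mixed = self_attention([r] + locs)
        new_regional.append(mixed[0])
        new_local.append(mixed[1:])
    return new_regional, new_local


# Two regions with 2-dimensional tokens; region 0 has two local tokens, region 1 has one.
regional_tokens = [[1.0, 0.0], [0.0, 1.0]]
local_tokens = [[[1.0, 1.0], [0.5, 0.5]], [[0.0, 0.0]]]
new_regional, new_local = regional_to_local(regional_tokens, local_tokens)
print(new_local[1][0])
```

In this toy input the only local token of region 1 starts as all zeros, yet its first component is positive after the two steps. That value originates from region 0's regional token, which shows how global information reaches local tokens through the regional token.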
Explicit Box Detection Unifies End-to-End Multi-Person Pose Estimation
This paper presents a novel end-to-end framework with Explicit box Detection for multi-person Pose estimation, called ED-Pose, which unifies the contextual learning between human-level (global) and keypoint-level (local) information.
PifPaf: Composite Fields for Human Pose Estimation
We propose a new bottom-up method for multi-person 2D human pose estimation that is particularly well suited for urban mobility such as self-driving cars and delivery robots.
Pose Neural Fabrics Search
Neural Architecture Search (NAS) technologies have emerged in many domains to jointly learn the architectures and weights of the neural network.
R2D2: Reliable and Repeatable Detector and Descriptor
We thus propose to jointly learn keypoint detection and description together with a predictor of the local descriptor discriminativeness.