Keypoint Detection

150 papers with code • 7 benchmarks • 11 datasets

Keypoint Detection involves simultaneously detecting people and localizing their keypoints. Keypoints are the same thing as interest points: spatial locations in the image that are distinctive or stand out from their surroundings. Well-chosen keypoints remain stable under image rotation, scaling, translation, distortion, and similar transformations.

(Image credit: PifPaf: Composite Fields for Human Pose Estimation; "Learning to surf" by fotologic, license: CC-BY-2.0)
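As a rough illustration of what "interest point" means in practice, the sketch below scores every pixel with a Harris-style corner response and returns the strongest one. The Harris detector is a classical choice made here purely for illustration; it is not taken from any paper or library listed on this page, and the function names (`box_filter`, `harris_response`, `strongest_keypoint`) and the constant `k=0.05` are assumptions. The usage lines show the behaviour under translation: when the image content shifts, the detected location shifts by the same amount.

```python
import numpy as np

def box_filter(a, radius=1):
    """Sum each pixel's (2r+1)x(2r+1) neighbourhood, zero-padding the edges."""
    p = np.pad(a, radius)
    h, w = a.shape
    out = np.zeros_like(a, dtype=float)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out += p[dy:dy + h, dx:dx + w]
    return out

def harris_response(img, k=0.05):
    """Harris-style corner score: high where intensity changes in two directions."""
    iy, ix = np.gradient(img.astype(float))  # derivatives along rows, columns
    sxx = box_filter(ix * ix)
    syy = box_filter(iy * iy)
    sxy = box_filter(ix * iy)
    return sxx * syy - sxy ** 2 - k * (sxx + syy) ** 2

def strongest_keypoint(img):
    """Return (row, col) of the pixel with the highest corner score."""
    r = harris_response(img)
    row, col = np.unravel_index(np.argmax(r), r.shape)
    return int(row), int(col)

# A synthetic image containing a single corner at (10, 12).
img = np.zeros((32, 32))
img[10:, 12:] = 1.0

# The same content translated by (5, 7) pixels.
shifted = np.zeros((32, 32))
shifted[15:, 19:] = 1.0

print(strongest_keypoint(img))      # (10, 12)
print(strongest_keypoint(shifted))  # (15, 19)
```

Straight edges score low here because the intensity changes in only one direction along them, so only the corner stands out. Learned detectors such as those in the papers below replace this hand-crafted score with a trained network.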

Benchmarks

These leaderboards are used to track progress in Keypoint Detection.

Dataset	Best Model
MS COCO	4xRSN-50(384×288)
COCO test-dev	HRNet*
MPII Multi-Person	AlphaPose
OCHuman	MIPNet (HRNet-W48)
COCO test-challenge	Simple Base+*
Pascal3D+	ConvNet + deformable shape model
ApolloCar3D	GSNet

Libraries

Use these libraries to find Keypoint Detection models and implementations.

Library	Papers	Stars
open-mmlab/mmpose	12	4,966
osmr/imgclsmob	6	2,917
PaddlePaddle/PaddleDetection	5	12,029
CMU-Perceptual-Computing-Lab/openpo…	3	29,805

See all 10 libraries.

Datasets

Latest papers

Self-supervised Learning of Contextualized Local Visual Embeddings

sthalles/clove • 1 Oct 2023

We present Contextualized Local Visual Embeddings (CLoVE), a self-supervised convolutional-based method that learns representations suited for dense prediction tasks.

EP2P-Loc: End-to-End 3D Point to 2D Pixel Localization for Large-Scale Visual Localization

minnjung/ep2p-loc • ICCV 2023

Visual localization is the task of estimating a 6-DoF camera pose of a query image within a provided 3D reference map.

14 Sep 2023

InstructDiffusion: A Generalist Modeling Interface for Vision Tasks

cientgu/instructdiffusion • 7 Sep 2023

We present InstructDiffusion, a unifying and generic framework for aligning computer vision tasks with human instructions.

332

Improving the matching of deformable objects by learning to detect keypoints

verlab/learningtodetect_prl_2023 • 1 Sep 2023

We propose a novel learned keypoint detection method to increase the number of correct matches for the task of non-rigid image correspondence.

A lightweight 3D dense facial landmark estimation model from position map data

shubhajitbasak/dense3dfacelandmarks • 29 Aug 2023

As there is no public dataset available containing dense landmarks, we propose a pipeline to create a dense keypoint training dataset containing 520 key points across the whole face from an existing facial position map data.

Neural Interactive Keypoint Detection

idea-research/click-pose • ICCV 2023

Click-Pose explores how user feedback can cooperate with a neural keypoint detector to correct the predicted keypoints in an interactive way for a faster and more effective annotation process.

20 Aug 2023

DeDoDe: Detect, Don't Describe -- Describe, Don't Detect for Local Feature Matching

parskatt/dedode • 16 Aug 2023

To train a descriptor, we maximize the mutual nearest neighbour objective over the keypoints with a separate network.

282

CoDeF: Content Deformation Fields for Temporally Consistent Video Processing

qiuyu96/codef • 15 Aug 2023

We present the content deformation field CoDeF as a new type of video representation, which consists of a canonical content field aggregating the static contents in the entire video and a temporal deformation field recording the transformations from the canonical image (i.e., rendered from the canonical content field) to each individual frame along the time axis. Given a target video, these two fields are jointly optimized to reconstruct it through a carefully tailored rendering pipeline. We advisedly introduce some regularizations into the optimization process, urging the canonical content field to inherit semantics (e.g., the object shape) from the video. With such a design, CoDeF naturally supports lifting image algorithms for video processing, in the sense that one can apply an image algorithm to the canonical image and effortlessly propagate the outcomes to the entire video with the aid of the temporal deformation field. We experimentally show that CoDeF is able to lift image-to-image translation to video-to-video translation and lift keypoint detection to keypoint tracking without any training. More importantly, thanks to our lifting strategy that deploys the algorithms on only one image, we achieve superior cross-frame consistency in processed videos compared to existing video-to-video translation approaches, and even manage to track non-rigid objects like water and smog. Project page can be found at https://qiuyu96.github.io/CoDeF/.

4,749

2D3D-MATR: 2D-3D Matching Transformer for Detection-free Registration between Images and Point Clouds

minhaolee/2d3dmatr • ICCV 2023

The commonly adopted detect-then-match approach to registration finds difficulties in the cross-modality cases due to the incompatible keypoint detection and inconsistent feature description.

10 Aug 2023

Reverse Knowledge Distillation: Training a Large Model using a Small One for Retinal Image Matching on Limited Data

SaharAlmahfouzNasser/MeDAL-Retina • 20 Jul 2023

We propose a novel approach based on reverse knowledge distillation to train large models with limited data while preventing overfitting.
