Pose Estimation Models


Introduced by Jin et al. in Whole-Body Human Pose Estimation in the Wild

ZoomNet is a method for 2D whole-body human pose estimation: it localizes dense landmarks across the entire body, including the face, hands, body, and feet. It follows the top-down paradigm. Given a bounding box for each person, ZoomNet first localizes the easy-to-detect body keypoints and estimates the rough positions of the hands and face. It then zooms in on the hand and face areas and predicts their keypoints from higher-resolution features for accurate localization. Unlike previous approaches, which usually assemble multiple networks, ZoomNet is a single end-to-end trainable network. It unifies five network heads (the human body pose estimator, the hand and face detectors, and the hand and face pose estimators) with shared low-level features.

Source: Whole-Body Human Pose Estimation in the Wild
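
The zoom-in step described above can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the authors' implementation: the function names (crop_and_resize, decode_heatmaps, roi_to_image) are hypothetical, the bilinear crop is a one-sample-per-cell stand-in for an RoI pooling operator, and the rough hand or face box is assumed to be given in feature-map coordinates as (x0, y0, x1, y1). The pose head that turns the cropped features into heatmaps is left out.

```python
import numpy as np


def crop_and_resize(features, box, out_size):
    """Bilinearly sample a (C, H, W) feature map inside box = (x0, y0, x1, y1).

    Returns an array of shape (C, out_size, out_size). One sample is taken at
    the centre of each output cell (a simplified, assumed stand-in for an
    RoI pooling operator).
    """
    _, h, w = features.shape
    x0, y0, x1, y1 = box
    # Continuous sample positions at the centres of the output grid cells.
    xs = x0 + (np.arange(out_size) + 0.5) * (x1 - x0) / out_size
    ys = y0 + (np.arange(out_size) + 0.5) * (y1 - y0) / out_size
    # Pixel i covers [i, i + 1), so its centre sits at i + 0.5.
    xs = np.clip(xs - 0.5, 0, w - 1)
    ys = np.clip(ys - 0.5, 0, h - 1)
    xl = np.floor(xs).astype(int)
    yt = np.floor(ys).astype(int)
    xr = np.minimum(xl + 1, w - 1)
    yb = np.minimum(yt + 1, h - 1)
    wx = xs - xl  # horizontal interpolation weights, shape (out_size,)
    wy = ys - yt  # vertical interpolation weights, shape (out_size,)
    top = features[:, yt][:, :, xl] * (1 - wx) + features[:, yt][:, :, xr] * wx
    bot = features[:, yb][:, :, xl] * (1 - wx) + features[:, yb][:, :, xr] * wx
    return top * (1 - wy)[:, None] + bot * wy[:, None]


def decode_heatmaps(heatmaps):
    """Return the (x, y) argmax location of each of K heatmaps, shape (K, 2)."""
    k, h, w = heatmaps.shape
    flat = heatmaps.reshape(k, -1).argmax(axis=1)
    ys, xs = np.unravel_index(flat, (h, w))
    return np.stack([xs, ys], axis=1)


def roi_to_image(points, box, out_size):
    """Map (x, y) cell indices in the zoomed-in grid back to box coordinates."""
    x0, y0, x1, y1 = box
    scale = np.array([(x1 - x0) / out_size, (y1 - y0) / out_size])
    return np.array([x0, y0]) + (np.asarray(points) + 0.5) * scale
```

In this sketch, a coarse stage would supply the box, crop_and_resize would produce the higher-resolution input for a hand or face head, and roi_to_image would place that head's decoded keypoints back in the coordinate frame of the person crop.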



Task                         Papers   Share
2D Human Pose Estimation     2        28.57%
Pose Estimation              2        28.57%
Object Detection             1        14.29%
Facial Landmark Detection    1        14.29%
Hand Pose Estimation         1        14.29%


