ZoomNet is a 2D human whole-body pose estimation technique. It aims to localize dense landmarks across the entire human body, including the face, hands, body, and feet. ZoomNet follows the top-down paradigm. Given a bounding box for each person, ZoomNet first localizes the easy-to-detect body keypoints and estimates the rough positions of the hands and face. It then zooms in on the hand and face areas and predicts their keypoints from higher-resolution features for accurate localization. Unlike previous approaches, which usually assemble multiple networks, ZoomNet is a single network that is end-to-end trainable. It unifies five network heads (the human body pose estimator, the hand and face detectors, and the hand and face pose estimators) into a single network with shared low-level features.
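The coordinate bookkeeping behind this coarse-to-fine flow can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes that part boxes and keypoints are expressed in normalized [0, 1] coordinates of their parent region, and the names `Box`, `nested_box`, and `crop_to_parent` are hypothetical helpers introduced here. The network heads and the feature-level zoom are not modeled; only the mapping of zoomed-in predictions back to image coordinates is shown.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class Box:
    """Axis-aligned box: top-left corner (x, y), width w, height h."""
    x: float
    y: float
    w: float
    h: float


def nested_box(inner_norm: Box, outer: Box) -> Box:
    """Convert a box given in normalized coordinates of `outer`
    into the coordinate frame that `outer` itself lives in."""
    return Box(
        outer.x + inner_norm.x * outer.w,
        outer.y + inner_norm.y * outer.h,
        inner_norm.w * outer.w,
        inner_norm.h * outer.h,
    )


def crop_to_parent(points_norm: List[Point], box: Box) -> List[Point]:
    """Map keypoints given in normalized crop coordinates
    back into the frame of the box they were cropped from."""
    return [(box.x + u * box.w, box.y + v * box.h) for u, v in points_norm]


# Stage 0: a person box in image pixels (assumed to come from a detector).
person_box = Box(100.0, 50.0, 200.0, 400.0)

# Stage 1: a rough hand box, expressed relative to the person box
# (made-up illustrative values, not model output).
hand_box_norm = Box(0.5, 0.25, 0.25, 0.125)
hand_box = nested_box(hand_box_norm, person_box)

# Stage 2: a hand keypoint predicted inside the zoomed-in hand region,
# mapped back to image pixels.
hand_kpts = crop_to_parent([(0.5, 0.5)], hand_box)
print(hand_kpts)  # [(225.0, 175.0)]
```

Because each stage only needs the box of the stage above it, the same two helpers would cover the face branch as well; the values above are illustrative and do not come from the source paper.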
Source: Whole-Body Human Pose Estimation in the Wild

Task | Papers | Share |
---|---|---|
2D Human Pose Estimation | 2 | 28.57% |
Pose Estimation | 2 | 28.57% |
Object Detection | 1 | 14.29% |
Facial Landmark Detection | 1 | 14.29% |
Hand Pose Estimation | 1 | 14.29% |