Is 2D Heatmap Representation Even Necessary for Human Pose Estimation?

7 Jul 2021  ·  YanJie Li, Sen yang, Shoukui Zhang, Zhicheng Wang, Wankou Yang, Shu-Tao Xia, Erjin Zhou ·

The 2D heatmap representation has dominated human pose estimation for years due to its high performance. However, heatmap-based approaches have some drawbacks: 1) The performance drops dramatically in the low-resolution images, which are frequently encountered in real-world scenarios. 2) To improve the localization precision, multiple upsample layers may be needed to recover the feature map resolution from low to high, which are computationally expensive. 3) Extra coordinate refinement is usually necessary to reduce the quantization error of downscaled heatmaps. To address these issues, we propose a \textbf{Sim}ple yet promising \textbf{D}isentangled \textbf{R}epresentation for keypoint coordinate (\emph{SimDR}), reformulating human keypoint localization as a task of classification. In detail, we propose to disentangle the representation of horizontal and vertical coordinates for keypoint location, leading to a more efficient scheme without extra upsampling and refinement. Comprehensive experiments conducted over COCO dataset show that the proposed \emph{heatmap-free} methods outperform \emph{heatmap-based} counterparts in all tested input resolutions, especially in lower resolutions by a large margin. Code will be made publicly available at \url{}.

PDF Abstract

Results from the Paper

  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.