3D Absolute Human Pose Estimation

9 papers with code • 3 benchmarks • 6 datasets

This task addresses absolute 3D human pose estimation: joint positions are predicted in camera-centric coordinates rather than relative to the root joint.

(Image credit: RootNet)
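
To make the distinction between camera-centric and root-relative poses concrete, here is a minimal sketch of converting between the two. It assumes a pose stored as a (J, 3) NumPy array and a root joint (for example, the pelvis) whose camera-space position is already known. The function names, the root index, and the toy numbers are illustrative assumptions, not taken from any of the listed papers.

```python
import numpy as np


def to_absolute(root_relative_pose, root_position):
    """Shift a root-relative pose of shape (J, 3) into camera coordinates.

    root_position is the (3,) camera-space location of the root joint.
    """
    pose = np.asarray(root_relative_pose, dtype=float)
    root = np.asarray(root_position, dtype=float)
    return pose + root  # broadcasting adds the root offset to every joint


def to_root_relative(absolute_pose, root_index=0):
    """Express a camera-centric pose of shape (J, 3) relative to its root joint."""
    pose = np.asarray(absolute_pose, dtype=float)
    return pose - pose[root_index]


# Toy two-joint pose in millimetres (hypothetical values); joint 0 is the root.
relative = np.array([[0.0, 0.0, 0.0],
                     [100.0, -200.0, 50.0]])
root_in_camera = np.array([300.0, 150.0, 4000.0])

absolute = to_absolute(relative, root_in_camera)
recovered = to_root_relative(absolute)
```

A root-relative pose discards where the person stands in the scene; the absolute task additionally requires estimating the root's position in camera space, which is the offset added above.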

Most implemented papers

RefineNet: Multi-Path Refinement Networks for High-Resolution Semantic Segmentation

guosheng/refinenet CVPR 2017

Recently, very deep convolutional neural networks (CNNs) have shown outstanding performance in object recognition and have also been the first choice for dense classification problems such as semantic segmentation.

Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image


Although significant improvement has been achieved recently in 3D human pose estimation, most of the previous methods only treat a single-person case.

Fusing Wearable IMUs with Multi-View Images for Human Pose Estimation: A Geometric Approach

CHUNYUWANG/imu-human-pose-pytorch CVPR 2020

Then we lift the multi-view 2D poses to the 3D space by an Orientation Regularized Pictorial Structure Model (ORPSM) which jointly minimizes the projection error between the 3D and 2D poses, along with the discrepancy between the 3D pose and IMU orientations.

MeTRAbs: Metric-Scale Truncation-Robust Heatmaps for Absolute 3D Human Pose Estimation

isarandi/metrabs 12 Jul 2020

Heatmap representations have formed the basis of human pose estimation systems for many years, and their extension to 3D has been a fruitful line of recent research.

End-to-End Human Pose and Mesh Reconstruction with Transformers

microsoft/MeshTransformer CVPR 2021

We present a new method, called MEsh TRansfOrmer (METRO), to reconstruct 3D human pose and mesh vertices from a single image.

Graph and Temporal Convolutional Networks for 3D Multi-person Pose Estimation in Monocular Videos

3dpose/GnTCN 22 Dec 2020

To tackle this problem, we propose a novel framework integrating graph convolutional networks (GCNs) and temporal convolutional networks (TCNs) to robustly estimate camera-centric multi-person 3D poses that do not require camera parameters.

Encoder-decoder with Multi-level Attention for 3D Human Shape and Pose Estimation

ziniuwan/maed ICCV 2021

To this end, we propose Multi-level Attention Encoder-Decoder Network (MAED), including a Spatial-Temporal Encoder (STE) and a Kinematic Topology Decoder (KTD) to model multi-level attentions in a unified framework.

Out-of-Domain Human Mesh Reconstruction via Dynamic Bilevel Online Adaptation

syguan96/dynaboa 7 Nov 2021

We consider a new problem of adapting a human mesh reconstruction model to out-of-domain streaming videos, where the performance of existing SMPL-based models is significantly affected by the distribution shift represented by different camera parameters, bone lengths, backgrounds, and occlusions.

XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model

hkchengrex/XMem 14 Jul 2022

We present XMem, a video object segmentation architecture for long videos with unified feature memory stores inspired by the Atkinson-Shiffrin memory model.