3D Human Pose Estimation
309 papers with code • 25 benchmarks • 47 datasets
3D Human Pose Estimation is a computer vision task that involves estimating the 3D positions and orientations of body joints and bones from 2D images or videos. The goal is to reconstruct the 3D pose of a person in real-time, which can be used in a variety of applications, such as virtual reality, human-computer interaction, and motion analysis.
Libraries
Use these libraries to find 3D Human Pose Estimation models and implementationsDatasets
Subtasks
Most implemented papers
Multi-Garment Net: Learning to Dress 3D People from Images
We present Multi-Garment Network (MGN), a method to predict body shape and clothing, layered on top of the SMPL model from a few frames (1-8) of a video.
CLIFF: Carrying Location Information in Full Frames into Human Pose and Shape Estimation
Top-down methods dominate the field of 3D human pose and shape estimation, because they are decoupled from human detection and allow researchers to focus on the core problem.
MogaNet: Multi-order Gated Aggregation Network
Notably, MogaNet hits 80. 0\% and 87. 8\% accuracy with 5. 2M and 181M parameters on ImageNet-1K, outperforming ParC-Net and ConvNeXt-L, while saving 59\% FLOPs and 17M parameters, respectively.
V2V-PoseNet: Voxel-to-Voxel Prediction Network for Accurate 3D Hand and Human Pose Estimation from a Single Depth Map
To overcome these weaknesses, we firstly cast the 3D hand and human pose estimation problem from a single depth map into a voxel-to-voxel prediction that uses a 3D voxelized grid and estimates the per-voxel likelihood for each keypoint.
Semantic Graph Convolutional Networks for 3D Human Pose Regression
In this paper, we study the problem of learning Graph Convolutional Networks (GCNs) for regression.
VIBE: Video Inference for Human Body Pose and Shape Estimation
Human motion is fundamental to understanding behavior.
XNect: Real-time Multi-Person 3D Motion Capture with a Single RGB Camera
The first stage is a convolutional neural network (CNN) that estimates 2D and 3D pose features along with identity assignments for all visible joints of all individuals. We contribute a new architecture for this CNN, called SelecSLS Net, that uses novel selective long and short range skip connections to improve the information flow allowing for a drastically faster network without compromising accuracy.
Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image
Although significant improvement has been achieved recently in 3D human pose estimation, most of the previous methods only treat a single-person case.
Coarse-to-Fine Volumetric Prediction for Single-Image 3D Human Pose
This paper addresses the challenge of 3D human pose estimation from a single color image.
PIFuHD: Multi-Level Pixel-Aligned Implicit Function for High-Resolution 3D Human Digitization
Although current approaches have demonstrated the potential in real world settings, they still fail to produce reconstructions with the level of detail often present in the input images.