VIBE: Video Inference for Human Body Pose and Shape Estimation

Human motion is fundamental to understanding behavior. Despite progress on single-image 3D pose and shape estimation, existing video-based state-of-the-art methods fail to produce accurate and natural motion sequences due to a lack of ground-truth 3D motion data for training...

Published at CVPR 2020.

Results from the Paper


Ranked #6 on 3D Human Pose Estimation on 3DPW (using extra training data)

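TASK                                DATASET    MODEL  METRIC NAME                   METRIC VALUE  GLOBAL RANK
3D Human Pose Estimation            3DPW       VIBE   PA-MPJPE                      55.9          #6
3D Human Pose Estimation            3DPW       VIBE   MPJPE                         93.5          #9
3D Human Pose Estimation            3DPW       VIBE   MPVPE                         113.4         #7
3D Human Pose Estimation            3DPW       VIBE   Acceleration error            27.1          #4
Monocular 3D Human Pose Estimation  Human3.6M  VIBE   Average MPJPE (mm)            65.6          #20
Monocular 3D Human Pose Estimation  Human3.6M  VIBE   Use Video Sequence            Yes           #1
Monocular 3D Human Pose Estimation  Human3.6M  VIBE   Frames Needed                 16            #20
Monocular 3D Human Pose Estimation  Human3.6M  VIBE   Need Ground Truth 2D Pose     No            #1
3D Human Pose Estimation            Human3.6M  VIBE   Average MPJPE (mm)            65.6          #42
3D Human Pose Estimation            Human3.6M  VIBE   Using 2D ground-truth joints  No            #1
3D Human Pose Estimation            Human3.6M  VIBE   Multi-View or Monocular       Monocular     #1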
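
The metric names listed above (MPJPE, PA-MPJPE, acceleration error) are not defined on this page. The following is a minimal NumPy sketch of how such metrics are commonly computed, not the paper's own evaluation code. It assumes joints are stored as arrays of shape (frames, joints, 3); the function names are hypothetical, PA-MPJPE is taken to mean MPJPE after a per-frame similarity (Procrustes) alignment, and acceleration is approximated with second-order finite differences in per-frame units.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per-joint position error: mean Euclidean distance over all joints."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())


def procrustes_align(pred, gt):
    """Align one pose pred (J, 3) to gt (J, 3) with scale, rotation, translation."""
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_p, gt - mu_g
    # Optimal rotation from the SVD of the 3x3 cross-covariance matrix.
    U, S, Vt = np.linalg.svd(p.T @ g)
    # Flip the last axis if needed so the result is a rotation, not a reflection.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])
    R = Vt.T @ D @ U.T
    # Least-squares optimal isotropic scale.
    scale = (S * np.diag(D)).sum() / (p ** 2).sum()
    return (scale * p) @ R.T + mu_g


def pa_mpjpe(pred, gt):
    """MPJPE after aligning every predicted frame to its ground-truth frame."""
    aligned = np.stack([procrustes_align(p, g) for p, g in zip(pred, gt)])
    return mpjpe(aligned, gt)


def accel_error(pred, gt):
    """Mean norm of the difference between predicted and true joint accelerations.

    Acceleration is a second-order finite difference over frames, so the
    result is in position units per frame squared (an assumption of this sketch).
    """
    acc_pred = pred[:-2] - 2.0 * pred[1:-1] + pred[2:]
    acc_gt = gt[:-2] - 2.0 * gt[1:-1] + gt[2:]
    return float(np.linalg.norm(acc_pred - acc_gt, axis=-1).mean())
```

Under these assumptions, a prediction that differs from the ground truth only by a global rotation, scale, and translation has a PA-MPJPE close to zero while its plain MPJPE can be large. This is why the two numbers are reported separately.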

Methods used in the Paper


METHOD                     TYPE
ReLU                       Activation Functions
1x1 Convolution            Convolutions
Average Pooling            Pooling Operations
Batch Normalization        Normalization
Residual Connection        Skip Connections
GAN                        Generative Models
GRU                        Recurrent Neural Networks
Max Pooling                Pooling Operations
Global Average Pooling     Pooling Operations
Bottleneck Residual Block  Skip Connection Blocks
Residual Block             Skip Connection Blocks
Kaiming Initialization     Initialization
Convolution                Convolutions
ResNet                     Convolutional Neural Networks