Faster VoxelPose: Real-time 3D Human Pose Estimation by Orthographic Projection

22 Jul 2022 · Hang Ye, Wentao Zhu, Chunyu Wang, Rujie Wu, Yizhou Wang

Voxel-based methods have achieved promising results for multi-person 3D pose estimation from multiple cameras, but they suffer from a heavy computational burden, especially in large scenes. We present Faster VoxelPose, which addresses this challenge by re-projecting the feature volume onto the three two-dimensional coordinate planes and estimating the X, Y, and Z coordinates from them separately. To that end, we first localize each person with a 3D bounding box, estimating a 2D box and its height from the volume features projected onto the xy-plane and the z-axis, respectively. Then, for each person, we estimate partial joint coordinates from the three coordinate planes separately and fuse them to obtain the final 3D pose. The method is free of costly 3D-CNNs and runs ten times faster than VoxelPose while achieving accuracy competitive with state-of-the-art methods, demonstrating its potential for real-time applications.
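
The re-projection idea in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function names `project_to_planes` and `fuse_coordinates` are hypothetical, max-pooling along the dropped axis is an assumed projection rule, and averaging the two estimates of each coordinate stands in for whatever fusion the paper actually uses.

```python
import numpy as np

def project_to_planes(volume):
    """Collapse an (X, Y, Z) feature volume onto the three coordinate planes.

    Max-pooling along the dropped axis is an assumption made for this sketch.
    """
    xy = volume.max(axis=2)  # drop z -> shape (X, Y)
    xz = volume.max(axis=1)  # drop y -> shape (X, Z)
    yz = volume.max(axis=0)  # drop x -> shape (Y, Z)
    return xy, xz, yz

def fuse_coordinates(xy, xz, yz):
    """Fuse per-plane peak locations into one (x, y, z) voxel index.

    Each coordinate is observed by two planes; averaging the two estimates
    is a hypothetical fusion rule chosen only for illustration.
    """
    x_xy, y_xy = np.unravel_index(np.argmax(xy), xy.shape)
    x_xz, z_xz = np.unravel_index(np.argmax(xz), xz.shape)
    y_yz, z_yz = np.unravel_index(np.argmax(yz), yz.shape)
    return np.array([(x_xy + x_xz) / 2, (y_xy + y_yz) / 2, (z_xz + z_yz) / 2])

# Toy volume with a single joint-like peak at voxel (5, 2, 7).
volume = np.zeros((8, 8, 8))
volume[5, 2, 7] = 1.0
joint = fuse_coordinates(*project_to_planes(volume))
```

For a cubic volume with N voxels per side, the three planes together hold 3·N² cells instead of N³, which is why working on the projections can avoid 3D convolutions.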


Task                             Dataset   Model             Metric Name         Metric Value  Global Rank
3D Multi-Person Pose Estimation  Campus    Faster VoxelPose  PCP3D               96.9          #5
3D Multi-Person Pose Estimation  Panoptic  Faster VoxelPose  Average MPJPE (mm)  18.41         #6
3D Multi-Person Pose Estimation  Shelf     Faster VoxelPose  PCP3D               97.6          #8
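
For readers unfamiliar with the Panoptic metric, MPJPE (mean per-joint position error) is commonly computed as the average Euclidean distance between predicted and ground-truth joints. The sketch below assumes joints are already matched one to one; the function name `mpjpe` and the toy coordinates are illustrative and are not taken from the benchmark.

```python
import math

def mpjpe(pred, gt):
    """Mean per-joint position error, in the same units as the inputs."""
    if len(pred) != len(gt):
        raise ValueError("pred and gt must contain the same number of joints")
    # Average the Euclidean distance over all matched joint pairs.
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(gt)

# Two made-up joints, coordinates in millimetres.
gt = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0)]
pred = [(3.0, 4.0, 0.0), (100.0, 0.0, 12.0)]
error_mm = mpjpe(pred, gt)  # (5 + 12) / 2 = 8.5
```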

Methods