QuickPose: Real-time Multi-view Multi-person Pose Estimation in Crowded Scenes

This work proposes a real-time algorithm for reconstructing 3D human poses in crowded scenes from multiple calibrated views. The key challenge is to efficiently match 2D observations across views. Previous methods perform multi-view matching either at the full-body level, which is sensitive to 2D pose estimation errors, or at the part level, which ignores the 2D constraints between different types of body parts in the same view. Instead, our approach reasons about all plausible skeleton proposals during multi-view matching, where each skeleton may consist of an arbitrary number of parts rather than being a whole body or a single part. To this end, we formulate multi-view matching as mode seeking in the space of skeleton proposals and develop an efficient algorithm, named QuickPose, to solve it, which enables real-time motion capture in crowded scenes. Experiments show that the proposed algorithm achieves state-of-the-art performance in both speed and accuracy on public datasets.
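The abstract does not spell out the mode-seeking procedure, so the following is only a rough illustration of the general idea, not the authors' algorithm. It applies a generic quick-shift-style mode seeking to toy 3D points standing in for proposals: each point links to its nearest neighbour of higher estimated density, and the roots of the resulting trees are treated as modes. The function name `quick_shift`, the parameters `bandwidth` and `max_link`, and the toy data are all assumptions made for this sketch.

```python
import math

def quick_shift(points, bandwidth=0.2, max_link=0.5):
    """Toy quick-shift-style mode seeking (illustrative, not QuickPose itself).

    Returns, for every point, the index of the mode (tree root) it reaches.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Parzen density estimate with a Gaussian kernel.
    density = [
        sum(math.exp(-(d * d) / (2 * bandwidth ** 2)) for d in row)
        for row in dist
    ]
    parent = list(range(n))
    for i in range(n):
        best = max_link
        for j in range(n):
            # Link to the nearest point of higher density within max_link.
            # Ties in density are broken by index, so no cycles can form.
            if (density[j], j) > (density[i], i) and dist[i][j] < best:
                best, parent[i] = dist[i][j], j

    def root(i):
        while parent[i] != i:
            i = parent[i]
        return i

    return [root(i) for i in range(n)]

# Hypothetical proposals: two well-separated groups of 3D points (metres).
points = [
    (0.00, 0.00, 1.00), (0.05, 0.00, 1.00), (0.00, 0.05, 1.02),
    (2.00, 0.00, 1.00), (2.04, 0.01, 1.00), (1.97, 0.00, 0.98),
]
labels = quick_shift(points)
```

In this toy setting, points that share a root are grouped as one person. The actual method operates on skeleton proposals with a variable number of parts rather than on single 3D points.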

Datasets


Task | Dataset | Model | Metric Name | Metric Value | Global Rank
3D Multi-Person Pose Estimation | Panoptic | QuickPose | Average MPJPE (mm) | 20.0 | # 8
3D Multi-Person Pose Estimation | Shelf | QuickPose | PCP3D | 98.1 | # 2
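For readers unfamiliar with the two metrics, here is a minimal sketch of how they are commonly computed. It assumes the usual definitions: MPJPE as the mean Euclidean distance between predicted and ground-truth joints, and PCP counting a limb as correct when the mean error of its two endpoints is at most half the ground-truth limb length. The exact evaluation protocol of each benchmark may differ. The function names, the `alpha` parameter, and the toy skeleton are hypothetical.

```python
import math

def mpjpe(pred, gt):
    """Mean per-joint position error, in the units of the inputs."""
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(gt)

def pcp(pred, gt, limbs, alpha=0.5):
    """Percentage of correct parts over a list of (joint_a, joint_b) limbs."""
    correct = 0
    for a, b in limbs:
        # Mean endpoint error of the limb.
        err = (math.dist(pred[a], gt[a]) + math.dist(pred[b], gt[b])) / 2
        # Correct if the error is within alpha times the true limb length.
        if err <= alpha * math.dist(gt[a], gt[b]):
            correct += 1
    return 100.0 * correct / len(limbs)

# Toy three-joint chain, coordinates in millimetres (made-up values).
gt = [(0, 0, 0), (0, 0, 300), (0, 0, 600)]
pred = [(0, 0, 10), (0, 30, 300), (0, 0, 600)]
limbs = [(0, 1), (1, 2)]

error_mm = mpjpe(pred, gt)
pcp_score = pcp(pred, gt, limbs)
```

Lower is better for MPJPE, and higher is better for PCP.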

Methods