Lightweight Multi-person Total Motion Capture Using Sparse Multi-view Cameras

Multi-person total motion capture is extremely challenging when it comes to handle severe occlusions, different reconstruction granularities from body to face and hands, drastically changing observation scales and fast body movements. To overcome these challenges above, we contribute a lightweight total motion capture system for multi-person interactive scenarios using only sparse multi-view cameras. By contributing a novel hand and face bootstrapping algorithm, our method is capable of efficient localization and accurate association of the hands and faces even on severe occluded occasions. We leverage both pose regression and keypoints detection methods and further propose a unified two-stage parametric fitting method for achieving pixel-aligned accuracy. Moreover, for extremely self-occluded poses and close interactions, a novel feedback mechanism is proposed to propagate the pixel-aligned reconstructions into the next frame for more accurate association. Overall, we propose the first light-weight total capture system and achieves fast, robust and accurate multi-person total motion capture performance. The results and experiments show that our method achieves more accurate results than existing methods under sparse-view setups.

PDF Abstract ICCV 2021 PDF ICCV 2021 Abstract
No code implementations yet. Submit your code now

Datasets


  Add Datasets introduced or used in this paper
Task Dataset Model Metric Name Metric Value Global Rank Result Benchmark
3D Multi-Person Pose Estimation Shelf Zhang et al. PCP3D 98.1 # 2

Methods


No methods listed for this paper. Add relevant methods here