3D Shape Reconstruction from Videos
4 papers with code • 0 benchmarks • 0 datasets
These leaderboards are used to track progress in 3D Shape Reconstruction from Videos
The surface embeddings are implemented as coordinate-based MLPs that are fit to each video via consistency and contrastive reconstruction losses. Experimental results show that ViSER compares favorably against prior work on challenging videos of humans with loose clothing and unusual poses as well as animals videos from DAVIS and YTVOS.
Our key insight is to merge three schools of thought; (1) classic deformable shape models that make use of articulated bones and blend skinning, (2) volumetric neural radiance fields (NeRFs) that are amenable to gradient-based optimization, and (3) canonical embeddings that generate correspondences between pixels and an articulated model.
This paper proposes a simple baseline framework for video-based 2D/3D human pose estimation that can achieve 10 times efficiency improvement over existing works without any performance degradation, named DeciWatch.