Monocular 3D human pose and shape estimation is an ill-posed problem since multiple 3D solutions can explain a 2D image of a subject.
Ranked #51 on 3D Human Pose Estimation on 3DPW (MPJPE metric)
Previous methods solve feature matching and pose estimation using a two-stage process by first finding matches and then estimating the pose.
Reconstructing the 3D shape of an object from several images captured under different light sources is a very challenging task, especially when realistic effects such as light propagation and attenuation, perspective viewing geometry and specular light reflection are taken into account.
The depth of each pixel can be propagated to a query pixel, using the predicted surface normal as guidance.
Ranked #23 on Monocular Depth Estimation on NYU-Depth V2
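As a rough illustration of this propagation step, the sketch below assumes a pinhole camera and that the source and query pixels lie on the local plane defined by the source pixel's predicted normal; the function name and argument layout are illustrative, not the paper's API.

```python
import numpy as np

def propagate_depth(d_i, n_i, p_i, p_j, K):
    """Propagate the depth d_i observed at pixel p_i to a query pixel p_j,
    assuming both pixels see the local plane with unit normal n_i
    (expressed in the camera frame)."""
    K_inv = np.linalg.inv(K)
    r_i = K_inv @ np.array([p_i[0], p_i[1], 1.0])  # back-projected ray of the source pixel
    r_j = K_inv @ np.array([p_j[0], p_j[1], 1.0])  # back-projected ray of the query pixel
    # Planarity constraint: n_i . (d_j * r_j) = n_i . (d_i * r_i)
    return d_i * (n_i @ r_i) / (n_i @ r_j)
```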
This combined information is the input to a pose prediction network, SPARC-Net, which we train to predict a 9-DoF CAD model pose update.
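One plausible reading of a 9-DoF update is 3 DoF each for rotation, translation and anisotropic scale; the sketch below applies such an update to a current CAD pose, with the parameterisation (axis-angle rotation delta, additive translation, log-space scale) chosen for illustration rather than taken from the paper.

```python
import numpy as np
from scipy.spatial.transform import Rotation

def apply_pose_update(R, t, s, delta):
    """Apply a 9-DoF pose update to a CAD model pose.
    R: (3, 3) rotation, t: (3,) translation, s: (3,) anisotropic scale.
    delta: (9,) predicted update = [rotation (axis-angle), translation, log-scale].
    """
    dR = Rotation.from_rotvec(delta[:3]).as_matrix()
    R_new = dR @ R                  # left-multiplied rotation update
    t_new = t + delta[3:6]          # additive translation update
    s_new = s * np.exp(delta[6:9])  # multiplicative (log-space) scale update
    return R_new, t_new, s_new
```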
Hierarchical frameworks consisting of both coarse and fine localization are often used as the standard pipeline for large-scale visual localization.
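The coarse-to-fine structure of such a pipeline can be summarised as below; `db`, `matcher` and `pnp_solver` are hypothetical interfaces standing in for a global-retrieval index, a local feature matcher and a PnP solver.

```python
def localize(query_img, db, matcher, pnp_solver, top_k=10):
    """Hierarchical localization sketch: coarse retrieval followed by
    fine 2D-3D matching and pose estimation (illustrative interfaces)."""
    # Coarse stage: retrieve the top-k database images by global descriptor similarity
    q_global = db.global_descriptor(query_img)
    candidates = db.retrieve_nearest(q_global, k=top_k)

    # Fine stage: local feature matching against the retrieved images
    points_2d, points_3d = [], []
    for ref in candidates:
        for m in matcher.match(query_img, ref):
            points_2d.append(m.query_xy)                # keypoint in the query image
            points_3d.append(ref.lift_to_3d(m.ref_xy))  # corresponding 3D scene point

    # Solve for the 6-DoF camera pose from the accumulated 2D-3D correspondences
    return pnp_solver.solve(points_2d, points_3d)
```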
To this end, we propose MaGNet, a novel framework for fusing single-view depth probability with multi-view geometry, to improve the accuracy, robustness and efficiency of multi-view depth estimation.
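A minimal sketch of such a fusion, assuming the single-view network outputs a per-pixel Gaussian depth prior N(mu, sigma^2) and that `matching_score` returns multi-view consistency scores for candidate depths (both the sampling scheme and the combination rule are illustrative):

```python
import numpy as np

def fuse_depth(mu, sigma, matching_score, n_candidates=5):
    """Fuse a single-view depth prior with multi-view matching evidence."""
    # Only evaluate candidates where the single-view prior has significant mass
    candidates = mu + sigma * np.linspace(-2.0, 2.0, n_candidates)
    prior = np.exp(-0.5 * ((candidates - mu) / sigma) ** 2)  # unnormalised Gaussian prior
    likelihood = matching_score(candidates)                   # multi-view consistency
    posterior = prior * likelihood
    posterior /= posterior.sum()
    return float((posterior * candidates).sum())              # fused depth estimate
```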
This paper addresses the problem of 3D human body shape and pose estimation from RGB images.
In this work we demonstrate how cross-domain keypoint matches from an RGB image to a rendered CAD model allow for more precise object pose predictions compared to ones obtained through direct predictions.
Thus, it is desirable to estimate a distribution over 3D body shape and pose conditioned on the input image instead of a single 3D reconstruction.
Ranked #1 on 3D Human Shape Estimation on SSP-3D
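As a toy example of predicting a distribution rather than a point estimate, the head below outputs a diagonal Gaussian over SMPL shape and pose parameters and draws several plausible bodies per image; the class name, feature dimension and the choice of a simple Gaussian are assumptions made for the sketch.

```python
import torch
import torch.nn as nn

class ProbabilisticSMPLHead(nn.Module):
    """Predict a diagonal Gaussian over SMPL parameters (10 shape + 72 pose)
    conditioned on image features, then sample multiple hypotheses."""

    def __init__(self, feat_dim=512, n_params=82):
        super().__init__()
        self.mean = nn.Linear(feat_dim, n_params)
        self.log_std = nn.Linear(feat_dim, n_params)

    def forward(self, feats, n_samples=10):
        mu = self.mean(feats)
        std = torch.exp(self.log_std(feats))
        dist = torch.distributions.Normal(mu, std)
        # Several plausible 3D bodies can be drawn for a single ambiguous image
        return dist, dist.rsample((n_samples,))
```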
Experimental results show that the proposed method outperforms the state-of-the-art on ScanNet and NYUv2, and that the estimated uncertainty correlates well with the prediction error.
Ranked #1 on Surface Normals Estimation on ScanNetV2
In order to fill the gap in evaluating near-field photometric stereo methods, we introduce LUCES, the first real-world 'dataset for near-fieLd point light soUrCe photomEtric Stereo', comprising 14 objects of varying materials.
In contrast, we propose a new task: shape and pose estimation from a group of multiple images of a human subject, without constraints on subject pose, camera viewpoint or background conditions between images in the group.
Ranked #3 on 3D Human Shape Estimation on SSP-3D
Thus, we propose STRAPS (Synthetic Training for Real Accurate Pose and Shape), a system that utilises proxy representations, such as silhouettes and 2D joints, as inputs to a shape and pose regression neural network. To overcome data scarcity, the network is trained with synthetic data generated on-the-fly during training using the SMPL statistical body model.
Ranked #1 on 3D Human Shape Estimation on MoVi
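The on-the-fly synthetic training loop might look roughly like the sketch below, where `smpl`, `renderer`, `regressor`, `optimiser` and `sample_params` are hypothetical interfaces and the simple squared-error loss is illustrative.

```python
def synthetic_training_step(smpl, renderer, regressor, optimiser, sample_params):
    """One on-the-fly synthetic training step (illustrative interfaces)."""
    pose, shape = sample_params()                  # random SMPL pose/shape parameters
    mesh = smpl(pose, shape)                       # ground-truth 3D body
    silhouette = renderer.render_silhouette(mesh)  # proxy representation 1
    joints_2d = renderer.project_joints(mesh)      # proxy representation 2

    pred_pose, pred_shape = regressor(silhouette, joints_2d)
    loss = ((pred_pose - pose) ** 2).mean() + ((pred_shape - shape) ** 2).mean()
    loss.backward()
    optimiser.step()
    optimiser.zero_grad()
    return loss
```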
Secondly, we compute the depth by integrating the normal field, and use it to iteratively estimate light directions and attenuation, which are then used to compensate the input images and compute reflectance samples for the next iteration.
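For intuition, the compensation for a near-field point light can be sketched as below, using an inverse-square fall-off; the exact light model (e.g. any angular dissipation term) is an assumption of the sketch, not the paper's formulation.

```python
import numpy as np

def compensate_intensity(I, X, light_pos, light_intensity=1.0):
    """Compensate an observed intensity I for point-light attenuation.
    X: (3,) surface point recovered from the integrated depth,
    light_pos: (3,) currently estimated point-light position."""
    v = light_pos - X
    dist = np.linalg.norm(v)
    attenuation = light_intensity / dist**2   # inverse-square fall-off
    light_dir = v / dist                      # per-pixel light direction
    return I / attenuation, light_dir         # reflectance sample + light direction
```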
We show that global physical effects can be approximated in the observation map domain, which simplifies and speeds up the data creation procedure.
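An observation map collects a pixel's intensities under all lights into a 2D grid indexed by the light direction; a minimal version is sketched below, with the map size and the max-pooling of colliding samples chosen for illustration.

```python
import numpy as np

def build_observation_map(intensities, light_dirs, size=32):
    """Build a per-pixel observation map: each intensity is stored at the cell
    indexed by the (x, y) components of its unit light direction.
    intensities: (m,) observations of one pixel, light_dirs: (m, 3) unit vectors."""
    obs_map = np.zeros((size, size), dtype=np.float32)
    for I, l in zip(intensities, light_dirs):
        u = int((l[0] + 1.0) / 2.0 * (size - 1))  # map x in [-1, 1] to a column index
        v = int((l[1] + 1.0) / 2.0 * (size - 1))  # map y in [-1, 1] to a row index
        obs_map[v, u] = max(obs_map[v, u], I)     # keep the brightest sample per cell
    return obs_map
```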
On all datasets, we obtain smaller mean distance and angular errors than state-of-the-art 6-DoF pose estimation algorithms based on direct pose regression or on pose estimation from scene coordinates.
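For reference, the match-based alternative to direct pose regression boils down to solving a PnP problem from 2D-3D correspondences; a minimal OpenCV sketch (generic, not the paper's exact pipeline) is shown below.

```python
import cv2
import numpy as np

def pose_from_matches(points_3d, points_2d, K):
    """Estimate a 6-DoF pose from 2D-3D keypoint matches with PnP + RANSAC.
    points_3d: (n, 3) model keypoints, points_2d: (n, 2) image keypoints,
    K: (3, 3) camera intrinsics (no lens distortion assumed)."""
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        points_3d.astype(np.float64),
        points_2d.astype(np.float64),
        K.astype(np.float64),
        None,  # distortion coefficients
    )
    R, _ = cv2.Rodrigues(rvec)  # axis-angle to rotation matrix
    return ok, R, tvec, inliers
```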