3D hand reconstruction from images is a widely studied problem in computer vision and graphics, and is particularly relevant for virtual and augmented reality.
We propose a novel transformer-based framework that reconstructs two high-fidelity hands from multi-view RGB images.
Through this, we demonstrate the quality of our probabilistic reconstruction and show that explicit ambiguity modeling is better suited to this challenging problem.
Moreover, we demonstrate that our approach achieves two-hand tracking performance from RGB not previously demonstrated, and quantitatively and qualitatively outperforms existing RGB-based methods that were not explicitly designed for two-hand interactions.
We present a novel method for real-time pose and shape reconstruction of two strongly interacting hands.
This paper introduces the first differentiable simulator of event streams, i.e., streams of asynchronous brightness change signals recorded by event cameras.
Due to the different data modality of event cameras compared to classical cameras, existing methods cannot be directly applied to and re-trained for event streams.
We propose to use a model-based generative loss for training hand pose estimators on depth images based on a volumetric hand model.
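The generative loss idea can be sketched as follows, assuming the volumetric hand model is approximated by a set of spheres and rendered with a toy orthographic-style projection (both simplifications; the paper's actual model, renderer, and loss differ, and all names here are illustrative):

```python
import numpy as np

# Hedged sketch of a model-based generative loss on depth: the hand is
# approximated by volumetric primitives (spheres); we render a coarse depth
# map from them and penalize its discrepancy with the observed depth.

def render_sphere_depth(centers, radii, H=32, W=32, f=30.0):
    """Toy orthographic-style depth render of spheres onto an H x W grid."""
    ys, xs = np.mgrid[0:H, 0:W]
    depth = np.full((H, W), np.inf)  # inf marks background pixels
    for (cx, cy, cz), r in zip(centers, radii):
        u, v = cx * f + W / 2, cy * f + H / 2   # projected center in pixels
        rr = r * f                               # projected radius in pixels
        mask = (xs - u) ** 2 + (ys - v) ** 2 <= rr ** 2
        depth[mask] = np.minimum(depth[mask], cz - r)  # keep nearest surface
    return depth

def generative_depth_loss(params, radii, observed):
    """L2 loss between rendered and observed depth on mutually visible pixels."""
    rendered = render_sphere_depth(params, radii)
    valid = np.isfinite(rendered) & np.isfinite(observed)
    return np.mean((rendered[valid] - observed[valid]) ** 2) if valid.any() else 0.0
```

Minimizing such a loss over the model parameters lets the estimator be trained without per-pixel pose labels, which is the appeal of a generative formulation.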
We consider the problem of inverse kinematics (IK), where one wants to find the parameters of a given kinematic skeleton that best explain a set of observed 3D joint locations.
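As a minimal illustration of this IK objective (not the paper's solver), one can fit the joint angles of a hypothetical two-bone chain to observed 3D joint locations by minimizing the sum of squared distances:

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative two-bone planar chain embedded in 3D (z = 0); bone lengths
# and the chain itself are toy assumptions, not the paper's skeleton.
BONE_LENGTHS = (1.0, 0.8)

def forward_kinematics(angles):
    """Return the 3D positions of the two joints for the given joint angles."""
    a1, a2 = angles
    j1 = np.array([np.cos(a1), np.sin(a1), 0.0]) * BONE_LENGTHS[0]
    j2 = j1 + np.array([np.cos(a1 + a2), np.sin(a1 + a2), 0.0]) * BONE_LENGTHS[1]
    return np.stack([j1, j2])

def ik_energy(angles, observed):
    """Sum of squared distances between predicted and observed joint locations."""
    return np.sum((forward_kinematics(angles) - observed) ** 2)

# Observations generated from known ground-truth angles for the demo.
target_angles = np.array([0.7, -0.3])
observed = forward_kinematics(target_angles)

res = minimize(ik_energy, x0=np.zeros(2), args=(observed,), method="BFGS")
```

Note that IK is generally ambiguous (e.g., elbow-flip solutions), so the optimizer may land on any configuration that reproduces the observations equally well.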
4 code implementations • 1 Jul 2019 • Dushyant Mehta, Oleksandr Sotnychenko, Franziska Mueller, Weipeng Xu, Mohamed Elgharib, Pascal Fua, Hans-Peter Seidel, Helge Rhodin, Gerard Pons-Moll, Christian Theobalt
The first stage is a convolutional neural network (CNN) that estimates 2D and 3D pose features along with identity assignments for all visible joints of all individuals. We contribute a new architecture for this CNN, called SelecSLS Net, that uses novel selective long and short range skip connections to improve the information flow allowing for a drastically faster network without compromising accuracy.
Ranked #6 on 3D Multi-Person Pose Estimation on MuPoTS-3D
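The idea of selective short- and long-range skip connections can be sketched conceptually. The snippet below is a toy stand-in, not the released SelecSLS Net code: random linear maps replace convolutions, and the skips are modeled as channel concatenation of earlier features within a module.

```python
import numpy as np

# Conceptual sketch of selective skip connections (names are hypothetical):
# a module fuses its first-layer features (longer-range skip) with later
# intermediate features (short-range skips) by channel concatenation.

def conv_stub(x, out_ch, seed):
    """Stand-in for a conv layer: a fixed random channel map followed by ReLU."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal((x.shape[-1], out_ch)) * 0.1
    return np.maximum(x @ w, 0.0)

def selecsls_module(x, ch=16):
    a = conv_stub(x, ch, seed=0)   # first layer of the module
    b = conv_stub(a, ch, seed=1)   # intermediate layer
    c = conv_stub(b, ch, seed=2)   # intermediate layer
    # Selective fusion: concatenate features from several depths so that
    # gradients and information can bypass the sequential path.
    fused = np.concatenate([a, b, c], axis=-1)
    return conv_stub(fused, ch, seed=3)
```

Fusing features from several depths in one place, rather than densely connecting every layer, is what keeps such a design fast while preserving information flow.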
Our approach uses novel occlusion-robust pose-maps (ORPM) which enable full body pose inference even under strong partial occlusions by other people and objects in the scene.
Ranked #3 on 3D Multi-Person Pose Estimation (root-relative) on MuPoTS-3D (MPJPE metric)
We address the highly challenging problem of real-time 3D hand tracking based on a monocular RGB-only sequence.
We propose an automatic method for generating high-quality annotations for depth-based hand segmentation, and introduce a large-scale hand segmentation dataset.
We present an approach for real-time, robust and accurate hand pose estimation from moving egocentric RGB-D cameras in cluttered real environments.
However, due to difficult occlusions, fast motions, and uniform hand appearance, jointly tracking hand and object pose is more challenging than tracking either of the two separately.
In the optimization step, a novel objective function combines the detected part labels and a Gaussian mixture representation of the depth to estimate a pose that best fits the depth.
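A hedged sketch of such an objective (illustrative only; the paper's exact energy differs): a depth-fit term given by the negative log-likelihood of model points under a Gaussian mixture summarizing the depth, plus a part-label agreement term. All function names and the label penalty are assumptions for illustration.

```python
import numpy as np

def gmm_negative_log_likelihood(points, means, sigmas, weights):
    """Negative log-likelihood of 3D points under an isotropic Gaussian mixture."""
    diffs = points[:, None, :] - means[None, :, :]   # (N, K, 3)
    sq = np.sum(diffs ** 2, axis=-1)                 # (N, K) squared distances
    comp = weights * np.exp(-0.5 * sq / sigmas ** 2) \
        / ((2 * np.pi * sigmas ** 2) ** 1.5)
    return -np.sum(np.log(comp.sum(axis=1) + 1e-12))

def objective(model_points, model_labels, gmm, detected_labels, lam=0.1):
    """Depth-fit term (GMM NLL) plus a simple part-label disagreement penalty."""
    means, sigmas, weights = gmm
    depth_term = gmm_negative_log_likelihood(model_points, means, sigmas, weights)
    label_term = np.mean(model_labels != detected_labels)
    return depth_term + lam * label_term
```

Because both terms are smooth (or piecewise constant) in the pose parameters that generate the model points, a local optimizer can refine the pose from the CNN's initial detection.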