Empirically, we show that our approach can accurately recover the 6-DoF object pose for both instance- and class-based scenarios even against a cluttered background.
For a future frame, we compute the similarity between the predicted state of a tracklet and the single frame observations in a probabilistic manner.
We find that 3D representations are more effective than 2D representations for tracking in these settings, and we obtain state-of-the-art performance.
The tools we develop open the door to processing and analyzing, in 3D, content from a large library of edited media, which could be helpful for many downstream applications.
Independent Sign Language Recognition is a complex visual recognition problem that combines several challenging computer vision tasks, because it must exploit and fuse information from hand gestures, body features, and facial expressions.
To understand how people look, interact, or perform tasks, we need to quickly and accurately capture their 3D body, face, and hands together from an RGB image.
Our goal is to train a single network that learns to avoid these problems and generate a coherent 3D reconstruction of all the humans in the scene.
Ranked #1 on 3D Human Reconstruction on AGORA
This paper solves the planar navigation problem by recourse to an online reactive scheme that exploits recent advances in SLAM and visual object recognition to recast prior geometric knowledge in terms of an offline catalogue of familiar objects.
Assuming that the texture of the person does not change dramatically between frames, we can apply a novel texture consistency loss, which enforces that each point in the texture map has the same texture value across all frames.
Ranked #14 on Weakly-supervised 3D Human Pose Estimation on Human3.6M
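The texture consistency idea described above (each texture-map point keeps the same value across frames) can be illustrated with a minimal sketch. The squared-deviation form, the flat-list texture layout, and the function name `texture_consistency_loss` are assumptions made for illustration; the source does not state the exact form of the loss.

```python
def texture_consistency_loss(texture_maps):
    """Mean squared deviation of each texel from its across-frame mean.

    texture_maps: list of frames, each a flat list of texel values (floats),
    all of the same length. This layout is an assumption for illustration.
    """
    num_frames = len(texture_maps)
    num_texels = len(texture_maps[0])
    total = 0.0
    for t in range(num_texels):
        # Collect the value of texel t in every frame.
        values = [frame[t] for frame in texture_maps]
        mean = sum(values) / num_frames
        # Penalize any frame whose texel differs from the across-frame mean.
        total += sum((v - mean) ** 2 for v in values)
    return total / (num_frames * num_texels)


# Two frames, two texels: only the first texel changes between frames.
frames = [[0.0, 1.0], [2.0, 1.0]]
loss = texture_consistency_loss(frames)
```

The loss is zero exactly when every texel is identical in all frames, which matches the stated assumption that the person's texture does not change between frames.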
Our approach is self-improving by nature, since better network estimates can lead the optimization to better solutions, while more accurate optimization fits provide better supervision for the network.
Ranked #2 on 3D Human Pose Estimation on MPI-INF-3DHP
Image-based features are attached to the mesh vertices and the Graph-CNN is responsible for processing them on the mesh structure, while the regression target for each vertex is its 3D location.
Ranked #30 on Monocular 3D Human Pose Estimation on Human3.6M
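As a rough illustration of processing per-vertex features on a mesh graph and regressing a 3D location for each vertex, the sketch below averages each vertex's features with those of its neighbors and applies one shared linear map with three outputs. This is a generic, simplified graph-convolution step, not the architecture from the paper; the function name `graph_conv`, the mean aggregation, and the toy weights are all assumptions.

```python
def graph_conv(features, neighbors, weight):
    """One simplified graph-convolution step (illustrative only).

    features:  list of per-vertex feature vectors (lists of floats)
    neighbors: neighbors[v] is the list of vertex indices adjacent to v
    weight:    shared linear map as a list of rows (out_dim x in_dim)
    """
    outputs = []
    for v, nbrs in enumerate(neighbors):
        group = [v] + list(nbrs)  # the vertex itself plus its neighbors
        dim = len(features[v])
        # Average the features over the local neighborhood.
        mean = [sum(features[u][d] for u in group) / len(group) for d in range(dim)]
        # Apply the shared linear map; with three rows this yields (x, y, z).
        outputs.append([sum(row[d] * mean[d] for d in range(dim)) for row in weight])
    return outputs


# Toy mesh: a single triangle in which every vertex neighbors the other two.
features = [[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]]
neighbors = [[1, 2], [0, 2], [0, 1]]
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # maps 2 features to 3 coordinates
vertices_3d = graph_conv(features, neighbors, weight)
```

In a trained model the weights would be learned and several such layers stacked; here they are fixed by hand only to show the data flow.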
We use the new method, SMPLify-X, to fit SMPL-X to both controlled images and images in the wild.
Ranked #3 on 3D Human Reconstruction on AGORA
The proposed approach outperforms previous baselines on this task and offers an attractive solution for direct prediction of 3D shape from a single color image.
Ranked #61 on 3D Human Pose Estimation on Human3.6M
This information can be acquired by human annotators for a wide range of images and poses.
Ranked #31 on Monocular 3D Human Pose Estimation on Human3.6M
Current motion capture (MoCap) systems generally require markers and multiple calibrated cameras, and can therefore be used only in constrained environments.
In this paper, we present a geometry-driven approach to automatically collect annotations for human pose prediction tasks.
Ranked #15 on Weakly-supervised 3D Human Pose Estimation on Human3.6M
This paper presents a novel approach to estimating the continuous six-degree-of-freedom (6-DoF) pose (3D translation and rotation) of an object from a single RGB image.
Ranked #1 on Keypoint Detection on Pascal3D+
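To make the 6-DoF pose mentioned above concrete: it combines a 3D rotation (three degrees of freedom) with a 3D translation (three more), and maps a point p in the object frame to R p + t. The sketch below only applies a given pose to a point; it does not estimate one, and the helper names `rotation_z` and `apply_pose` are hypothetical.

```python
import math


def rotation_z(angle_rad):
    """3x3 rotation matrix for a rotation about the z-axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def apply_pose(rotation, translation, point):
    """Map a 3D point from the object frame to the camera frame: R p + t."""
    return [
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]


# Rotate the point (1, 0, 0) by 90 degrees about z, then shift it 2 units along z.
R = rotation_z(math.pi / 2)
t = [0.0, 0.0, 2.0]
p_cam = apply_pose(R, t, [1.0, 0.0, 0.0])  # approximately [0, 1, 2]
```

A general rotation would need all three rotational parameters (for example, an axis-angle vector or a quaternion); the single-axis rotation here just keeps the example short.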
Recovering 3D full-body human pose is a challenging problem with many applications.
This paper addresses the challenge of 3D human pose estimation from a single color image.
Ranked #6 on 3D Human Pose Estimation on HumanEva-I