1 code implementation • 22 Oct 2020 • Ruofei Du, Eric Turner, Maksym Dzitsiuk, Luca Prasso, Ivo Duarte, Jason Dourgarian, Joao Afonso, Jose Pascoal, Josh Gladstone, Nuno Cruces, Shahram Izadi, Adarsh Kowdle, Konstantine Tsotsos, and David Kim
Slow adoption of depth information in the UX layer may be due to the complexity of processing depth data to simply render a mesh or detect interaction based on changes in the depth map.
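To make "detect interaction based on changes in the depth map" concrete, here is a minimal sketch, not the authors' implementation. It assumes two depth frames in meters stored as NumPy arrays, with non-positive or non-finite values marking invalid pixels. The function name `changed_region` and the 5 cm threshold are hypothetical choices for illustration.

```python
import numpy as np

def changed_region(prev_depth, curr_depth, threshold_m=0.05):
    """Return a boolean mask of pixels whose depth changed by more than threshold_m.

    Assumption: depth is in meters, and values that are <= 0 or non-finite
    mark invalid measurements; such pixels are never reported as changed.
    """
    prev_depth = np.asarray(prev_depth, dtype=np.float64)
    curr_depth = np.asarray(curr_depth, dtype=np.float64)
    valid = (
        np.isfinite(prev_depth) & np.isfinite(curr_depth)
        & (prev_depth > 0) & (curr_depth > 0)
    )
    # Replace invalid entries before subtracting so NaN/inf never propagate.
    safe_prev = np.where(valid, prev_depth, 0.0)
    safe_curr = np.where(valid, curr_depth, 0.0)
    return valid & (np.abs(safe_curr - safe_prev) > threshold_m)
```

A caller could treat any connected region of `True` pixels near a virtual object as a candidate interaction; that policy is left out here because the abstract does not specify one.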
Contrary to many recent neural network approaches that operate on a full cost volume and rely on 3D convolutions, our approach does not explicitly build a volume and instead relies on a fast multi-resolution initialization step, differentiable 2D geometric propagation and warping mechanisms to infer disparity hypotheses.
Ranked #2 on Stereo Depth Estimation on KITTI2015
Active speaker detection (ASD) and virtual cinematography (VC) can significantly improve the remote user experience of a video conference by automatically panning, tilting, and zooming a video conferencing camera: users subjectively rate an expert video cinematographer's video significantly higher than unedited video.
no code implementations • 12 Nov 2018 • Ricardo Martin-Brualla, Rohit Pandey, Shuoran Yang, Pavel Pidlypenskyi, Jonathan Taylor, Julien Valentin, Sameh Khamis, Philip Davidson, Anastasia Tkach, Peter Lincoln, Adarsh Kowdle, Christoph Rhemann, Dan B. Goldman, Cem Keskin, Steve Seitz, Shahram Izadi, Sean Fanello
We take the novel approach of augmenting such real-time performance capture systems with a deep architecture that takes a rendering from an arbitrary viewpoint and jointly performs completion, super resolution, and denoising of the imagery in real time.
A first estimate of the disparity is computed in a very low resolution cost volume; the model then hierarchically re-introduces high-frequency details through a learned upsampling function that uses compact pixel-to-pixel refinement networks.
Ranked #2 on Stereo Depth Estimation on sceneflow
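The coarse-to-fine idea in the abstract above (a first disparity estimate from a very low resolution cost volume, later upsampled) can be sketched as follows. This is a rough illustration under stated assumptions, not the paper's network: strided subsampling stands in for learned features, absolute intensity difference stands in for a learned matching cost, and pixel repetition stands in for the learned upsampling and refinement. The function name `coarse_disparity` and its parameters are hypothetical.

```python
import numpy as np

def coarse_disparity(left, right, max_disp, factor=4):
    """Winner-take-all disparity from a low-resolution cost volume.

    Assumptions: rectified grayscale images of equal shape; a left pixel at
    column x matches the right pixel at column x - d.
    """
    left = np.asarray(left, dtype=np.float64)
    right = np.asarray(right, dtype=np.float64)
    # Subsample by striding (stand-in for a learned feature extractor).
    lo_left = left[::factor, ::factor]
    lo_right = right[::factor, ::factor]
    h, w = lo_left.shape
    # Disparity range at low resolution, clamped to the image width.
    lo_max = min(max(1, max_disp // factor), w - 1)
    # cost[d, y, x] = |L(y, x) - R(y, x - d)|; columns with x < d stay at +inf.
    cost = np.full((lo_max + 1, h, w), np.inf)
    for d in range(lo_max + 1):
        cost[d, :, d:] = np.abs(lo_left[:, d:] - lo_right[:, : w - d])
    lo_disp = np.argmin(cost, axis=0).astype(np.float64)
    # Upsample by repetition and rescale disparities to full-resolution pixels.
    full = np.repeat(np.repeat(lo_disp, factor, axis=0), factor, axis=1) * factor
    return full[: left.shape[0], : left.shape[1]]
```

Because the cost volume is built at `1/factor` resolution, its size shrinks by roughly `factor**3` relative to a full-resolution volume, which is the efficiency argument the abstract relies on. The blocky output of this sketch is what a refinement stage would be expected to sharpen.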
In this paper we present ActiveStereoNet, the first deep learning solution for active stereo systems.
Numerous computer vision problems such as stereo depth estimation, object-class segmentation and foreground/background segmentation can be formulated as per-pixel image labeling tasks.
Efficient estimation of depth from pairs of stereo images is one of the core problems in computer vision.
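For context, once a stereo matcher has produced a disparity for a pixel, depth follows from the standard rectified-stereo relation Z = f · B / d. The sketch below applies it with NumPy; the function name `disparity_to_depth` and the convention of returning infinity for non-positive disparities are assumptions made for illustration, not something the abstract states.

```python
import numpy as np

def disparity_to_depth(disparity, focal_px, baseline_m):
    """Convert disparity (pixels) to depth (meters) via Z = f * B / d.

    Assumptions: rectified cameras, focal length in pixels, baseline in
    meters; non-positive disparities are treated as invalid and map to inf.
    """
    disparity = np.asarray(disparity, dtype=np.float64)
    depth = np.full(disparity.shape, np.inf)
    valid = disparity > 0
    depth[valid] = focal_px * baseline_m / disparity[valid]
    return depth
```

Since depth is inversely proportional to disparity, a fixed disparity error costs more depth accuracy at long range than at short range, which is one reason subpixel precision matters in stereo matching.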
We contribute an algorithm for solving this correspondence problem efficiently, without compromising depth accuracy.
We cast the problem of depth-layer segmentation as a discrete labeling problem on a spatiotemporal Markov Random Field (MRF) that uses the motion occlusion cues along with monocular cues and a smooth motion prior for the moving object.
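As a reading aid, a discrete labeling problem on an MRF is commonly written as the minimization of an energy with per-node and pairwise terms. The form below is the generic textbook one, not the paper's exact energy. The symbols are assumptions: $x_i$ is the depth-layer label of node $i$, $\psi_i$ collects the per-node cues (for example, motion occlusion and monocular cues), $\psi_{ij}$ penalizes label disagreement between neighbors, $\mathcal{N}$ is the set of spatial and temporal neighbor pairs, and $\lambda$ is a weight.

```latex
\[
  \hat{\mathbf{x}} = \arg\min_{\mathbf{x}} E(\mathbf{x}),
  \qquad
  E(\mathbf{x}) = \sum_{i} \psi_i(x_i)
  + \lambda \sum_{(i,j) \in \mathcal{N}} \psi_{ij}(x_i, x_j)
\]
```

In a spatiotemporal MRF, $\mathcal{N}$ links each node both to its neighbors within a frame and to corresponding nodes in adjacent frames, so the pairwise term encourages labels that are smooth over space and time.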
In many machine learning domains (such as scene understanding), several related sub-tasks (such as scene categorization, depth estimation, object detection) operate on the same raw data and provide correlated outputs.