Depth estimation is a crucial step toward inferring scene geometry from 2D images. The goal of monocular depth estimation is to predict the depth value of each pixel given only a single RGB image as input.
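Concretely, the task maps an (H, W, 3) RGB image to an (H, W) depth map, one value per pixel. A shape-level sketch of that contract; the `predict_depth` wrapper and the constant-depth stand-in model below are hypothetical, for illustration only:

```python
import numpy as np

def predict_depth(image, model):
    """Monocular depth estimation contract: RGB image in, per-pixel depth out.

    `model` stands in for any monocular depth network (hypothetical here).
    """
    assert image.ndim == 3 and image.shape[2] == 3, "expects an (H, W, 3) RGB image"
    depth = model(image)
    assert depth.shape == image.shape[:2], "one depth value per pixel"
    return depth

def constant_model(img):
    # Stand-in "network" for illustration only: predicts constant depth.
    return np.ones(img.shape[:2])
```

Any real model is a drop-in replacement for `constant_model` as long as it honors the same input/output shapes.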
We present a novel approach for unsupervised learning of depth and ego-motion from monocular video.
Ranked #22 on Monocular Depth Estimation on KITTI Eigen split.
We present a method for jointly training the estimation of depth, ego-motion, and a dense 3D translation field of objects relative to the scene, using monocular photometric consistency as the sole source of supervision.
Using our approach, existing monocular depth estimation techniques can be effectively applied to dual-pixel data, and much smaller models can be constructed that still infer high quality depth.
We present a novel method for simultaneous learning of depth, ego-motion, object motion, and camera intrinsics from monocular videos, using only consistency across neighboring video frames as the supervision signal.
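Several of the papers above share the same self-supervision signal: a photometric reprojection loss, in which pixels from a neighboring frame are warped into the target view through the predicted depth and relative camera pose, and the result is compared to the target frame. A minimal NumPy sketch of that idea, assuming a pinhole camera; the `photometric_loss` helper is mine, and real pipelines use differentiable bilinear sampling rather than the nearest-neighbour lookup here:

```python
import numpy as np

def photometric_loss(target, source, depth, K, R, t):
    """Mean L1 photometric error after warping `source` into the target view.

    target, source: (H, W, 3) float images; depth: (H, W) target-view depth;
    K: 3x3 camera intrinsics; R, t: relative pose from target to source camera.
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T  # 3 x N
    rays = np.linalg.inv(K) @ pix                   # back-project pixels to rays
    pts = rays * depth.reshape(1, -1)               # 3D points in target camera
    proj = K @ (R @ pts + t.reshape(3, 1))          # project into source camera
    u2, v2, z = proj[0] / proj[2], proj[1] / proj[2], proj[2]
    ui, vi = np.round(u2).astype(int), np.round(v2).astype(int)
    # Keep only pixels that land inside the source image, in front of the camera.
    valid = (ui >= 0) & (ui < W) & (vi >= 0) & (vi < H) & (z > 0)
    warped = np.zeros((H * W, 3))
    warped[valid] = source[vi[valid], ui[valid]]    # nearest-neighbour sampling
    diff = np.abs(target.reshape(H * W, 3) - warped)
    return diff[valid].mean()
```

With an identity pose the warp is the identity, so the loss between a frame and itself is zero; gradients of this quantity with respect to depth and pose are what drive the self-supervised training described above.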
We present a generalization of the Cauchy/Lorentzian, Geman-McClure, Welsch/Leclerc, generalized Charbonnier, Charbonnier/pseudo-Huber/L1-L2, and L2 loss functions.
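This family of losses has a single closed form governed by a shape parameter alpha and a scale c: alpha = 1 recovers Charbonnier/pseudo-Huber and alpha = -2 recovers Geman-McClure, while alpha = 0 (Cauchy/Lorentzian) and alpha = 2 (L2) are removable singularities defined as limits and need separate branches. A sketch of the non-singular case; the function name is mine:

```python
import numpy as np

def general_loss(x, alpha, c=1.0):
    """General robust loss rho(x, alpha, c), valid for alpha not in {0, 2}.

    alpha interpolates between robust and non-robust behavior
    (1 -> pseudo-Huber, -2 -> Geman-McClure, -inf -> Welsch/Leclerc);
    c sets the scale of the quadratic bowl near x = 0. The alpha = 0 and
    alpha = 2 special cases would need their limit forms.
    """
    b = abs(alpha - 2.0)
    return (b / alpha) * (((x / c) ** 2 / b + 1.0) ** (alpha / 2.0) - 1.0)
```

For example, `general_loss(x, 1.0)` reduces algebraically to `sqrt((x/c)**2 + 1) - 1`, the pseudo-Huber loss, and `general_loss(x, -2.0)` to `2*x**2 / (x**2 + 4*c**2)`, the Geman-McClure loss.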
Learning to navigate in complex environments with dynamic elements is an important milestone in developing AI agents.
Our method builds upon the idea of convolutional part heatmap regression [1], extending it for 3D face alignment.
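Part-heatmap regression predicts one spatial heatmap per landmark, and each landmark's location is read off as the peak of its map. A minimal NumPy sketch of that decoding step (function name is illustrative; practical systems typically also refine the peak to sub-pixel precision):

```python
import numpy as np

def heatmaps_to_landmarks(heatmaps):
    """Decode (K, H, W) per-part heatmaps into (K, 2) integer (x, y) peaks."""
    k, h, w = heatmaps.shape
    flat_idx = heatmaps.reshape(k, -1).argmax(axis=1)  # peak per heatmap
    ys, xs = np.divmod(flat_idx, w)                    # row-major unravel
    return np.stack([xs, ys], axis=1)
```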
Ranked #1 on Face Alignment on 3DFAW.
Per-pixel ground-truth depth data is challenging to acquire at scale.
Ranked #12 on Monocular Depth Estimation on KITTI Eigen split.