We present a novel approach for unsupervised learning of depth and ego-motion from monocular video.
The ability to predict depth from a single image - using recent advances in CNNs - is of increasing interest to the vision community.
To the best of our knowledge, this is the first work to show that deep networks trained using unlabelled monocular videos can predict globally scale-consistent camera trajectories over a long video sequence.
Ranked #16 on Monocular Depth Estimation on KITTI Eigen split
Despite learning based methods showing promising results in single view depth estimation and visual odometry, most existing approaches treat the tasks in a supervised manner.
We present an unsupervised learning framework for simultaneously training single-view depth prediction and optical flow estimation models using unlabeled video sequences.
In the occluded region, as depth and camera motion can provide more reliable motion estimation, they can be used to instruct unsupervised learning of optical flow.