In occluded regions, depth and camera motion can provide more reliable motion estimates, so they can be used to guide the unsupervised learning of optical flow.
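For a static scene, the optical flow induced by camera motion can be computed directly from depth and the relative camera pose, which is why it remains reliable even where photometric matching fails. A minimal sketch, assuming a pinhole camera with intrinsics `K` and relative motion `(R, t)` (the function name `rigid_flow` is illustrative, not from any of the papers above):

```python
import numpy as np

def rigid_flow(depth, K, R, t):
    """Flow induced by camera motion (R, t) over a static scene.

    Illustrative helper: back-projects each pixel to 3D using its depth
    and the intrinsics K, applies the camera motion, reprojects, and
    returns the pixel displacement as a (2, H, W) flow field.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=0).reshape(3, -1).astype(float)
    pts = np.linalg.inv(K) @ pix * depth.reshape(1, -1)   # back-project to 3D
    pts2 = R @ pts + t.reshape(3, 1)                      # apply camera motion
    proj = K @ pts2
    proj = proj[:2] / proj[2:3]                           # perspective divide
    return (proj - pix[:2]).reshape(2, h, w)
```

With zero camera motion the rigid flow is identically zero, which is a convenient sanity check for an implementation.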
In this work we introduce an occlusion mask, a mask used during training to explicitly ignore regions that cannot be reconstructed due to occlusion.
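The idea is that occluded pixels have no valid correspondence in the other view, so they should contribute nothing to the reconstruction loss. A minimal sketch of a masked photometric loss, assuming a binary mask that is 1 for visible pixels and 0 for occluded ones (the function name is hypothetical):

```python
import numpy as np

def masked_photometric_loss(target, reconstructed, occlusion_mask):
    """Mean absolute photometric error over visible pixels only.

    Illustrative sketch: occlusion_mask is 1 where a pixel is visible in
    both views and 0 where it is occluded, so occluded regions are
    excluded from the reconstruction error entirely.
    """
    diff = np.abs(target - reconstructed) * occlusion_mask
    return diff.sum() / np.maximum(occlusion_mask.sum(), 1.0)
```

Normalizing by the number of visible pixels (rather than the image size) keeps the loss scale comparable across frames with different amounts of occlusion.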
To the best of our knowledge, this is the first work to show that deep networks trained using unlabelled monocular videos can predict globally scale-consistent camera trajectories over a long video sequence.
Dynamic scenes that contain both object motion and egomotion are a challenge for monocular visual odometry (VO).
We present an unsupervised learning framework for simultaneously training single-view depth prediction and optical flow estimation models using unlabeled video sequences.
Although learning-based methods have shown promising results in single-view depth estimation and visual odometry, most existing approaches treat these tasks in a supervised manner.
We present a novel approach for unsupervised learning of depth and ego-motion from monocular video.
The ability to predict depth from a single image, enabled by recent advances in CNNs, is of increasing interest to the vision community.