Self-Supervised Monocular Depth Hints

Monocular depth estimators can be trained with various forms of self-supervision from binocular-stereo data to circumvent the need for high-quality laser scans or other ground-truth data. The disadvantage, however, is that the photometric reprojection losses used with self-supervised learning typically have multiple local minima. These plausible-looking alternatives to ground truth can restrict what a regression network learns, causing it to predict depth maps of limited quality. As one prominent example, depth discontinuities around thin structures are often incorrectly estimated by current state-of-the-art methods. Here, we study the problem of ambiguous reprojections in depth prediction from stereo-based self-supervision, and introduce Depth Hints to alleviate their effects. Depth Hints are complementary depth suggestions obtained from simple off-the-shelf stereo algorithms. These hints enhance an existing photometric loss function, and are used to guide a network to learn better weights. They require no additional data, and are assumed to be right only sometimes. We show that using our Depth Hints gives a substantial boost when training several leading self-supervised-from-stereo models, not just our own. Further, combined with other good practices, we produce state-of-the-art depth predictions on the KITTI benchmark.

PDF Abstract ICCV 2019 PDF ICCV 2019 Abstract
Task Dataset Model Metric Name Metric Value Global Rank Result Benchmark
Monocular Depth Estimation KITTI Eigen split Depth Hints absolute relative error 0.096 # 44
Monocular Depth Estimation VA (Virtual Apartment) Depth Hints Root mean square error (RMSE) 0.427 # 2
Log root mean square error (RMSE_log) 0.248 # 2
Mean average error (MAE) 0.291 # 2
Absolute relative error (AbsRel) 0.197 # 2

Methods


No methods listed for this paper. Add relevant methods here