T2Net: Synthetic-to-Realistic Translation for Solving Single-Image Depth Estimation Tasks

ECCV 2018  ·  Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai

Current methods for single-image depth estimation use training datasets with real image-depth pairs or stereo pairs, which are not easy to acquire. We propose a framework, trained on synthetic image-depth pairs and unpaired real images, that comprises an image translation network for enhancing realism of input images, followed by a depth prediction network. A key idea is having the first network act as a wide-spectrum input translator, taking in either synthetic or real images, and ideally producing minimally modified realistic images. This is done via a reconstruction loss when the training input is real, and GAN loss when synthetic, removing the need for heuristic self-regularization. The second network is trained on a task loss for synthetic image-depth pairs, with extra GAN loss to unify real and synthetic feature distributions. Importantly, the framework can be trained end-to-end, leading to good results, even surpassing early deep-learning methods that use real paired data.
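
The loss routing described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the least-squares form of the GAN term, the L1 form of the reconstruction and task terms, and the weight `lam` are all assumptions made for illustration, and the discriminator scores are taken as given inputs rather than computed by a network.

```python
import numpy as np

def l1(a, b):
    """Mean absolute difference between two arrays."""
    return float(np.mean(np.abs(np.asarray(a) - np.asarray(b))))

def gan_generator_loss(disc_scores):
    """Least-squares GAN term (an assumed form): push scores toward 1 ("real")."""
    return float(np.mean((np.asarray(disc_scores) - 1.0) ** 2))

def translator_loss(inp, translated, is_real, disc_scores=None):
    """Loss for the translation network, chosen by the input domain.

    Real input: reconstruction loss, so real images pass through almost unchanged.
    Synthetic input: GAN loss, so the translated output looks realistic.
    """
    if is_real:
        return l1(translated, inp)
    if disc_scores is None:
        raise ValueError("synthetic inputs need discriminator scores")
    return gan_generator_loss(disc_scores)

def task_loss(pred_depth, gt_depth, feat_disc_scores, lam=0.1):
    """Depth loss on synthetic pairs plus a feature-level GAN term.

    `lam` is a hypothetical weighting, not a value taken from the paper.
    """
    return l1(pred_depth, gt_depth) + lam * gan_generator_loss(feat_disc_scores)
```

Because the reconstruction term applies only to real inputs and the GAN term only to synthetic ones, a single translator can accept either domain, which is the "wide-spectrum input translator" idea in the abstract.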


Results from the Paper


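| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Depth Estimation | DCM | T2Net | Abs Rel | 0.351 | #3 |
| Depth Estimation | DCM | T2Net | Sq Rel | 0.416 | #1 |
| Depth Estimation | DCM | T2Net | RMSE | 1.117 | #3 |
| Depth Estimation | DCM | T2Net | RMSE log | 0.415 | #1 |
| Depth Estimation | eBDtheque | T2Net | Abs Rel | 0.491 | #3 |
| Depth Estimation | eBDtheque | T2Net | Sq Rel | 0.555 | #1 |
| Depth Estimation | eBDtheque | T2Net | RMSE | 1.459 | #3 |
| Depth Estimation | eBDtheque | T2Net | RMSE log | 0.777 | #1 |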
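
For readers unfamiliar with the four metric names, the sketch below computes them using the definitions conventionally used in the single-image depth estimation literature. The page does not state the evaluation protocol for these benchmarks, so the masking of non-positive ground-truth pixels, the `eps` clamp, and the function name `depth_metrics` are assumptions rather than the benchmark's actual procedure. Lower is better for all four.

```python
import numpy as np

def depth_metrics(pred, gt, eps=1e-6):
    """Standard depth error metrics over pixels with valid (positive) ground truth."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)

    valid = gt > 0                           # assumed validity mask
    pred = np.clip(pred[valid], eps, None)   # keep the logarithm defined
    gt = gt[valid]
    diff = pred - gt

    return {
        "abs_rel": float(np.mean(np.abs(diff) / gt)),   # mean |d - d*| / d*
        "sq_rel": float(np.mean(diff ** 2 / gt)),       # mean (d - d*)^2 / d*
        "rmse": float(np.sqrt(np.mean(diff ** 2))),     # sqrt(mean (d - d*)^2)
        "rmse_log": float(
            np.sqrt(np.mean((np.log(pred) - np.log(gt)) ** 2))
        ),                                              # RMSE in log-depth space
    }
```

Calling `depth_metrics(predicted_depth, ground_truth_depth)` on two same-shaped arrays returns a dictionary with one value per metric.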

Methods