Depth-aware CNN for RGB-D Segmentation

ECCV 2018 · Weiyue Wang, Ulrich Neumann

Convolutional neural networks (CNNs) are limited in their ability to handle geometric information because of their fixed grid kernel structure. The availability of depth data enables progress in RGB-D semantic segmentation with CNNs. State-of-the-art methods either treat depth as additional image channels or process spatial information in 3D volumes or point clouds; both strategies suffer from high computation and memory cost. To address these issues, we present Depth-aware CNN, which introduces two intuitive, flexible and effective operations: depth-aware convolution and depth-aware average pooling. By leveraging depth similarity between pixels during information propagation, geometry is seamlessly incorporated into the CNN. Both operators introduce no additional parameters and can be easily integrated into existing CNNs. Extensive experiments and ablation studies on challenging RGB-D semantic segmentation benchmarks validate the effectiveness and flexibility of our approach.
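The core idea can be sketched in a few lines: each kernel tap is reweighted by a depth-similarity term of the form exp(-α·|D(center) − D(neighbor)|), so pixels on the same surface contribute more than pixels across a depth discontinuity. Below is a minimal single-channel NumPy sketch (stride 1, no padding); the function name and the default `alpha` are illustrative choices, not the authors' reference implementation. The same similarity term can also reweight average pooling.

```python
import numpy as np

def depth_aware_conv(x, depth, weights, alpha=1.0):
    """Naive single-channel depth-aware convolution (stride 1, 'valid').

    Each kernel tap is modulated by exp(-alpha * |D(center) - D(neighbor)|),
    so neighbors at a similar depth contribute more than neighbors across a
    depth discontinuity. No extra learnable parameters are introduced;
    alpha here is a hyperparameter of this sketch.
    """
    kh, kw = weights.shape
    H, W = x.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            ci, cj = i + kh // 2, j + kw // 2        # kernel-center pixel
            patch = x[i:i + kh, j:j + kw]
            dpatch = depth[i:i + kh, j:j + kw]
            # depth similarity of every neighbor to the center pixel
            sim = np.exp(-alpha * np.abs(depth[ci, cj] - dpatch))
            out[i, j] = np.sum(weights * sim * patch)
    return out
```

With a constant depth map the similarity term is 1 everywhere, so the operation reduces exactly to ordinary convolution; with a sharp depth edge inside the window, contributions from across the edge are suppressed.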

| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Thermal Image Segmentation | MFN Dataset | Depth-aware CNN | mIoU | 46.1 | #39 |
| Semantic Segmentation | NYU Depth v2 | Depth-aware CNN | Mean IoU | 43.9% | #85 |
| Semantic Segmentation | Stanford2D3D - RGBD | Depth-aware CNN | mIoU | 39.5 | #6 |
| | | | mAcc | 55.5 | #3 |
| | | | Pixel Accuracy | 65.4 | #5 |
| Semantic Segmentation | SUN-RGBD | TokenFusion (S) | Mean IoU | 42.0% | #34 |