Incorporating Luminance, Depth and Color Information by a Fusion-based Network for Semantic Segmentation

24 Sep 2018  ·  Shang-Wei Hung, Shao-Yuan Lo, Hsueh-Ming Hang

Semantic segmentation has made encouraging progress in recent years owing to the success of deep convolutional networks. Meanwhile, depth sensors have become prevalent, so depth maps can be acquired easily. However, few studies have focused on the RGB-D semantic segmentation task, and exploiting depth information effectively to improve performance remains a challenge. In this paper, we propose a novel solution named LDFNet, which incorporates Luminance, Depth and Color information via a fusion-based network. It includes a sub-network to process depth maps and employs luminance images to assist the processing of depth information. LDFNet outperforms other state-of-the-art systems on the Cityscapes dataset, and its inference speed is faster than that of most existing networks. The experimental results show the effectiveness of the proposed multi-modal fusion network and its potential for practical applications.
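
The abstract does not spell out the fusion mechanism, but as a rough illustration of the idea it describes (an RGB branch plus a depth sub-network that also receives a luminance image, fused before decoding), a minimal PyTorch sketch might look like the following. All layer sizes, the element-wise-addition fusion, and the class names are assumptions for illustration, not the authors' exact LDFNet architecture:

```python
import torch
import torch.nn as nn

def rgb_to_luminance(rgb):
    """Convert an RGB batch (B, 3, H, W) to luminance (B, 1, H, W)
    using the standard ITU-R BT.601 weights."""
    r, g, b = rgb[:, 0:1], rgb[:, 1:2], rgb[:, 2:3]
    return 0.299 * r + 0.587 * g + 0.114 * b

class TwoBranchFusionNet(nn.Module):
    """Hypothetical two-branch encoder: one branch for color, a
    sub-network for the stacked depth + luminance input, fused by
    element-wise addition before a shared classifier."""
    def __init__(self, num_classes=19):
        super().__init__()
        self.rgb_branch = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1),
            nn.BatchNorm2d(64), nn.ReLU(inplace=True),
        )
        # The depth sub-network sees 2 channels: depth map + luminance.
        self.depth_branch = nn.Sequential(
            nn.Conv2d(2, 64, 3, stride=2, padding=1),
            nn.BatchNorm2d(64), nn.ReLU(inplace=True),
        )
        self.classifier = nn.Conv2d(64, num_classes, 1)

    def forward(self, rgb, depth):
        # Luminance assists the depth branch, as described in the abstract.
        lum = rgb_to_luminance(rgb)
        fused = self.rgb_branch(rgb) + \
                self.depth_branch(torch.cat([depth, lum], dim=1))
        logits = self.classifier(fused)
        # Upsample back to input resolution for dense prediction.
        return nn.functional.interpolate(
            logits, size=rgb.shape[2:], mode="bilinear", align_corners=False)
```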


Datasets

Cityscapes

| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Semantic Segmentation | Cityscapes test | LDFNet | Mean IoU (class) | 71.3 | #73 |
| Real-Time Semantic Segmentation | Cityscapes test | LDFNet | mIoU | 71.3% | #27 |
| Real-Time Semantic Segmentation | Cityscapes test | LDFNet | Frame (fps) | 18.4 (1080Ti) | #22 |
| Real-Time Semantic Segmentation | Cityscapes val | LDFNet | mIoU | 68.48% | #8 |
| Semantic Segmentation | Cityscapes val | LDFNet | mIoU | 68.48% | #76 |
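
For reference, the "Mean IoU (class)" / mIoU numbers above average the per-class intersection-over-union across the 19 Cityscapes evaluation classes. A minimal NumPy sketch of the metric (the function name and the ignore-label handling are illustrative; Cityscapes uses 255 as the ignore label):

```python
import numpy as np

def mean_iou(pred, target, num_classes=19, ignore_index=255):
    """Mean IoU: per-class intersection / union, averaged over the
    classes that appear in the prediction or the ground truth."""
    valid = target != ignore_index
    ious = []
    for c in range(num_classes):
        p, t = (pred == c) & valid, (target == c) & valid
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class absent from both; skip it
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious))

# Example usage: pred and target are (H, W) integer label maps.
# miou = mean_iou(pred, target)
```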
