RTFNet: RGB-Thermal Fusion Network for Semantic Segmentation of Urban Scenes

Semantic segmentation is a fundamental capability for autonomous vehicles. With the advancement of deep learning, many effective semantic segmentation networks have been proposed in recent years. However, most of them are designed for RGB images from visible cameras. The quality of RGB images degrades under unsatisfactory lighting conditions, such as darkness and the glare of oncoming headlights, which poses critical challenges for networks that use only RGB images. Unlike visible cameras, thermal cameras generate images from thermal radiation and are able to see under various lighting conditions. To enable robust and accurate semantic segmentation for autonomous vehicles, we take advantage of thermal images and fuse both RGB and thermal information in a novel deep neural network. The main innovation of this letter is the architecture of the proposed network. We adopt the encoder–decoder design concept: ResNet is employed for feature extraction, and a new decoder is developed to restore the feature-map resolution. The experimental results demonstrate that our network outperforms the state of the art.
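The fusion idea described above can be sketched as follows: two parallel encoders process the RGB and thermal inputs, the thermal features are merged into the RGB stream stage by stage, and a decoder upsamples back to the input resolution. This is a minimal illustrative sketch, not the paper's implementation: the tiny conv blocks below stand in for the ResNet stages, the channel widths and class count are arbitrary, and all module names are invented for this example.

```python
import torch
import torch.nn as nn

class RTFNetSketch(nn.Module):
    """Toy sketch of RGB-thermal fusion with an encoder-decoder layout.

    Thermal features are fused into the RGB stream by elementwise
    addition after each encoder stage; a transposed-conv decoder
    restores the feature-map resolution.
    """

    def __init__(self, num_classes=9):
        super().__init__()
        chans = [16, 32, 64]  # toy channel widths (not the paper's)

        def stage(cin, cout):
            # One downsampling conv block standing in for a ResNet stage.
            return nn.Sequential(
                nn.Conv2d(cin, cout, 3, stride=2, padding=1),
                nn.BatchNorm2d(cout), nn.ReLU(inplace=True))

        self.rgb_stages = nn.ModuleList(
            [stage(3, chans[0]), stage(chans[0], chans[1]), stage(chans[1], chans[2])])
        self.th_stages = nn.ModuleList(
            [stage(1, chans[0]), stage(chans[0], chans[1]), stage(chans[1], chans[2])])

        def up(cin, cout):
            # One upsampling block of the decoder.
            return nn.Sequential(
                nn.ConvTranspose2d(cin, cout, 2, stride=2),
                nn.BatchNorm2d(cout), nn.ReLU(inplace=True))

        self.decoder = nn.Sequential(
            up(chans[2], chans[1]), up(chans[1], chans[0]),
            nn.ConvTranspose2d(chans[0], num_classes, 2, stride=2))

    def forward(self, rgb, thermal):
        f_rgb, f_th = rgb, thermal
        for rgb_stage, th_stage in zip(self.rgb_stages, self.th_stages):
            f_rgb = rgb_stage(f_rgb)
            f_th = th_stage(f_th)
            f_rgb = f_rgb + f_th  # elementwise fusion into the RGB stream
        return self.decoder(f_rgb)

model = RTFNetSketch(num_classes=9)
out = model(torch.randn(1, 3, 64, 64), torch.randn(1, 1, 64, 64))
print(out.shape)  # per-pixel class logits at input resolution
```

Fusing by elementwise addition (rather than channel concatenation) keeps the channel count of the RGB stream unchanged, so the same encoder design can be reused with or without the thermal branch.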

Benchmark Results


| Task                       | Dataset                    | Model  | Metric | Value | Global Rank |
|----------------------------|----------------------------|--------|--------|-------|-------------|
| Semantic Segmentation      | GAMUS                      | RTFNet | mIoU   | 58.26 | # 4         |
| Thermal Image Segmentation | KP day-night               | RTFNet | mIoU   | 28.7  | # 4         |
| Thermal Image Segmentation | MFN Dataset                | RTFNet | mIoU   | 53.2  | # 31        |
| Thermal Image Segmentation | Noisy RS RGB-T Dataset     | RTFNet | mIoU   | 48.5  | # 5         |
| Thermal Image Segmentation | PST900                     | RTFNet | mIoU   | 57.6  | # 15        |
| Thermal Image Segmentation | RGB-T-Glass-Segmentation   | RTFNet | MAE    | 0.058 | # 14        |
