Multimodal Material Segmentation

Recognition of materials from their visual appearance is essential for computer vision tasks, especially those that involve interaction with the real world. Material segmentation, i.e., dense per-pixel recognition of materials, remains challenging because, unlike objects, materials do not exhibit clearly discernible visual signatures in their regular RGB appearances. Different materials do, however, lead to different radiometric behaviors, which can often be captured with non-RGB imaging modalities. We realize multimodal material segmentation from RGB, polarization, and near-infrared images. We introduce the MCubeS dataset (from MultiModal Material Segmentation), which contains 500 sets of multimodal images capturing 42 street scenes. Ground-truth material segmentation as well as semantic segmentation are annotated for every image and pixel. We also derive a novel deep neural network, MCubeSNet, which learns to focus on the most informative combinations of imaging modalities for each material class with a newly derived region-guided filter selection (RGFS) layer. We use semantic segmentation as a prior to "guide" this filter selection. To the best of our knowledge, our work is the first comprehensive study on truly multimodal material segmentation. We believe our work opens new avenues for the practical use of material information in safety-critical applications.
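
The RGFS idea can be illustrated with a short sketch. The following is a minimal PyTorch-style sketch, not the authors' implementation: it assumes a shared convolutional filter bank whose responses are gated per pixel by a learned, per-semantic-class selection vector, with the semantic label map acting as the guiding prior. All names (RGFSLayer, bank_size, etc.) are illustrative.

```python
import torch
import torch.nn as nn

class RGFSLayer(nn.Module):
    """Illustrative region-guided filter selection: each pixel keeps the
    filter-bank responses its semantic class has learned to favor."""

    def __init__(self, in_ch, bank_size, out_ch, num_classes):
        super().__init__()
        self.bank = nn.Conv2d(in_ch, bank_size, 3, padding=1)  # shared filter bank
        # One learnable selection vector over the bank per semantic class.
        self.select = nn.Parameter(torch.randn(num_classes, bank_size))
        self.proj = nn.Conv2d(bank_size, out_ch, 1)

    def forward(self, feats, sem_labels):
        # feats: (B, in_ch, H, W) fused multimodal features
        # sem_labels: (B, H, W) long tensor of semantic class ids (the prior)
        responses = self.bank(feats)                        # (B, bank, H, W)
        weights = torch.softmax(self.select, dim=1)         # soft per-class selection
        per_pixel = weights[sem_labels]                     # (B, H, W, bank)
        gated = responses * per_pixel.permute(0, 3, 1, 2)   # keep class-relevant filters
        return self.proj(gated)
```

A soft (softmax) gate is used here only to keep the sketch short and differentiable; the actual RGFS layer may select filters differently.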


Datasets


Introduced in the Paper:

MCubeS, MCubeS (P)

Used in the Paper:

UPLight

Results from the Paper


Task                  | Dataset | Model                 | Metric | Value  | Global Rank
Semantic Segmentation | MCubeS  | MCubeSNet (RGB-A-D-N) | mIoU   | 42.86% | #11
Semantic Segmentation | UPLight | MCubeSNet (RGB-AoLP)  | mIoU   | 82.64% | #7
Semantic Segmentation | UPLight | MCubeSNet (RGB-DoLP)  | mIoU   | 80.80% | #8
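
For reference, the mIoU values above are per-class intersection-over-union scores averaged over classes. Below is a minimal NumPy sketch of the standard computation; the mean_iou helper is illustrative, and individual benchmarks may differ in details such as ignored labels.

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """pred, gt: integer label arrays of the same shape."""
    conf = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(conf, (gt.ravel(), pred.ravel()), 1)  # confusion matrix
    inter = np.diag(conf).astype(float)
    union = conf.sum(0) + conf.sum(1) - np.diag(conf)
    valid = union > 0                                # skip classes absent from both
    return (inter[valid] / union[valid]).mean()
```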

Methods


No methods listed for this paper.