TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Semantic Segmentation	NYU Depth v2	MMAF-Net-152	Mean IoU	44.8%	# 82
Semantic Segmentation	Stanford2D3D - RGBD	MMAF-Net-152	mIoU	52.9	# 5
Semantic Segmentation	Stanford2D3D - RGBD	MMAF-Net-152	mAcc	62.3	# 2
Semantic Segmentation	Stanford2D3D - RGBD	MMAF-Net-152	Pixel Accuracy	76.5	# 4
Semantic Segmentation	SUN-RGBD	FSFNet	Mean IoU	47.0%	# 29

Multi-Modal Attention-based Fusion Model for Semantic Segmentation of RGB-Depth Images

25 Dec 2019 · Fahimeh Fooladgar, Shohreh Kasaei

3D scene understanding is widely considered a crucial requirement in computer vision and robotics applications. One of its high-level tasks is semantic segmentation of RGB-Depth images. With the availability of RGB-D cameras, it is desirable to improve the accuracy of scene understanding by exploiting depth features along with appearance features. Because depth images are independent of illumination, they can improve the quality of semantic labeling when used alongside RGB images. Considering both the common and the modality-specific features of these two modalities improves semantic segmentation performance. One of the main problems in RGB-Depth semantic segmentation is how to fuse these two modalities so as to gain the advantages of each while remaining computationally efficient. Recently, methods that employ deep convolutional neural networks have reached state-of-the-art results using early, late, and middle fusion strategies. In this paper, an efficient encoder-decoder model with an attention-based fusion block is proposed to integrate the mutual influences between the feature maps of these two modalities. This block explicitly extracts the interdependences among the concatenated feature maps of these modalities to obtain more powerful feature maps from RGB-Depth images. Extensive experimental results on three challenging datasets, NYU-V2, SUN RGB-D, and Stanford 2D-3D-Semantic, show that the proposed network outperforms state-of-the-art models with respect to computational cost and model size. The experimental results also illustrate the effectiveness of the proposed lightweight attention-based fusion model in terms of accuracy.
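The abstract describes a fusion block that extracts interdependences among the concatenated RGB and depth feature maps. Since no code is listed for the paper, the following is only a rough sketch of that idea, assuming a squeeze-and-excitation-style channel attention as a stand-in; the paper's actual block may differ. The function names, the weight arguments (`w1`, `b1`, `w2`, `b2`), and the toy inputs are all hypothetical.

```python
import math

def global_avg_pool(channels):
    """Squeeze step: reduce each channel (a 2D list) to its spatial mean."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in channels]

def dense(vec, weights, bias):
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def attention_fuse(rgb_feats, depth_feats, w1, b1, w2, b2):
    """Assumed SE-style fusion (not the authors' code).

    rgb_feats and depth_feats are lists of channels with equal spatial size.
    Returns the re-weighted concatenated channels and the per-channel gates.
    """
    fused = rgb_feats + depth_feats                          # channel-wise concatenation
    squeezed = global_avg_pool(fused)                        # one descriptor per channel
    hidden = [max(0.0, h) for h in dense(squeezed, w1, b1)]  # bottleneck with ReLU
    gates = [sigmoid(g) for g in dense(hidden, w2, b2)]      # attention weights in (0, 1)
    scaled = [[[g * x for x in row] for row in ch] for g, ch in zip(gates, fused)]
    return scaled, gates

# Toy input: one RGB channel and one depth channel, each 2x2 (hypothetical values).
rgb = [[[1.0, 2.0], [3.0, 4.0]]]
depth = [[[0.0, 0.0], [0.0, 2.0]]]
# Zero weights make every gate sigmoid(0), so each channel is simply halved.
w1, b1 = [[0.0, 0.0]], [0.0]
w2, b2 = [[0.0], [0.0]], [0.0, 0.0]
fused, gates = attention_fuse(rgb, depth, w1, b1, w2, b2)
```

In a trained network the weights would be learned, so the gates would emphasize whichever RGB or depth channels are most informative for a given image.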

Tasks

Scene Understanding

Segmentation

Semantic Segmentation

Datasets

NYUv2

SUN RGB-D

2D-3D-S

Results from the Paper

Ranked #5 on Semantic Segmentation on Stanford2D3D - RGBD
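The metrics reported for this paper (Mean IoU / mIoU, mAcc, and Pixel Accuracy) are all commonly derived from a per-class confusion matrix. The sketch below shows one usual way to compute them; it assumes that classes absent from the ground truth are skipped when averaging, a convention that varies between benchmarks. The function names and the toy labels are illustrative and are not taken from the paper.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predicted classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm

def segmentation_metrics(cm):
    """Return pixel accuracy, mean class accuracy (mAcc), and mean IoU (mIoU)."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    ious, accs = [], []
    for i in range(n):
        tp = cm[i][i]
        gt = sum(cm[i])                          # pixels whose true class is i
        pred = sum(cm[j][i] for j in range(n))   # pixels predicted as class i
        if gt == 0:
            continue                             # assumption: skip classes with no ground truth
        ious.append(tp / (gt + pred - tp))       # intersection over union
        accs.append(tp / gt)                     # per-class accuracy
    return {
        "pixel_acc": correct / total,
        "mAcc": sum(accs) / len(accs),
        "mIoU": sum(ious) / len(ious),
    }

# Toy example: six flattened pixel labels over three classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
metrics = segmentation_metrics(confusion_matrix(y_true, y_pred, 3))
```

Pixel accuracy weights every pixel equally, so frequent classes dominate it, whereas mAcc and mIoU average over classes and are therefore more sensitive to rare ones.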

