TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
RGB-D Salient Object Detection	NJU2K	S2MA	S-Measure	89.4	# 19
RGB-D Salient Object Detection	NJU2K	S2MA	Average MAE	0.053	# 22
RGB-D Salient Object Detection	NJU2K	S2MA	max E-Measure	92.7	# 8
RGB-D Salient Object Detection	NJU2K	S2MA	max F-Measure	88.9	# 10

Learning Selective Self-Mutual Attention for RGB-D Saliency Detection

CVPR 2020 · Nian Liu, Ni Zhang, Junwei Han

Saliency detection on RGB-D images has been receiving increasing research interest recently. Previous models adopt the early fusion or the result fusion scheme to fuse the input RGB and depth data or their saliency maps, which incurs the problem of a distribution gap or information loss. Some other models use the feature fusion scheme but are limited by linear feature fusion methods. In this paper, we propose to fuse the attention learned in both modalities. Inspired by the Non-local model, we integrate each modality's self-attention with the other modality's attention to propagate long-range contextual dependencies, thus incorporating multi-modal information to learn attention and propagate contexts more accurately. Considering the reliability of the other modality's attention, we further propose a selection attention to weight the newly added attention term. We embed the proposed attention module in a two-stream CNN for RGB-D saliency detection. Furthermore, we propose a residual fusion module to fuse the depth decoder features into the RGB stream. Experimental results on seven benchmark datasets demonstrate the effectiveness of the proposed model components and our final saliency model. Our code and saliency maps are available at https://github.com/nnizhang/S2MA.
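
The attention fusion described in the abstract can be illustrated with a rough NumPy sketch. This is not the authors' implementation (see the repository linked above for that). It assumes flattened feature maps of shape (N positions, C channels), assumes the other modality's affinity logits are added to a modality's own logits before a softmax, and assumes the selection attention is a per-position sigmoid gate. All function names, weight matrices, and shapes here are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def affinity(feats, w_query, w_key):
    """Pairwise affinity logits between all N positions (Non-local style).

    feats: (N, C) flattened features; w_query, w_key: (C, C) projections.
    Returns an (N, N) matrix of unnormalized attention logits.
    """
    return (feats @ w_query) @ (feats @ w_key).T


def selection_gate(own_feats, other_feats, w_gate):
    """Hypothetical selection attention: one weight in (0, 1) per position.

    Assumption: the gate is a sigmoid over a linear map of both modalities'
    features; w_gate has shape (2 * C, 1).
    """
    logits = np.concatenate([own_feats, other_feats], axis=1) @ w_gate
    return 1.0 / (1.0 + np.exp(-logits))  # (N, 1)


def self_mutual_attention(own_logits, other_logits, gate):
    """Add the gated other-modality logits to the own logits, then normalize.

    gate: (N, 1), broadcast across each row of the (N, N) logits.
    """
    return softmax(own_logits + gate * other_logits, axis=-1)


def propagate(feats, attn, w_value):
    """Aggregate context with the attention map and add it residually."""
    return feats + attn @ (feats @ w_value)
```

In this sketch, a gate of zero reduces the module to plain self-attention within one modality, while a gate near one lets the other modality's affinities contribute fully. The same functions would be applied symmetrically, once with RGB as the "own" stream and once with depth.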
