RGB-D Salient Object Detection
56 papers with code • 8 benchmarks • 5 datasets
RGB-D Salient Object Detection (SOD) aims to identify the most visually distinctive objects or regions in a scene from the given RGB and depth data. It has a wide range of applications, including video/image segmentation, object recognition, visual tracking, foreground map evaluation, image retrieval, content-aware image editing, information discovery, photo synthesis, and weakly supervised semantic segmentation. Here, depth information plays an important complementary role in finding salient objects. Online benchmark: http://dpfan.net/d3netbenchmark.
(Image credit: Rethinking RGB-D Salient Object Detection: Models, Data Sets, and Large-Scale Benchmarks, TNNLS20)
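To make the task's inputs and output concrete, here is a minimal toy sketch in pure Python. It is not the method of any paper listed on this page: the function name `toy_rgbd_saliency`, the colour-contrast heuristic, and the convention that smaller depth values mean "closer to the camera" are all assumptions chosen for illustration. It takes an RGB image and an aligned depth map and returns a per-pixel saliency map in [0, 1].

```python
from typing import List, Tuple

Pixel = Tuple[float, float, float]  # (r, g, b), each assumed to lie in [0, 1]


def toy_rgbd_saliency(rgb: List[List[Pixel]],
                      depth: List[List[float]]) -> List[List[float]]:
    """Toy heuristic (hypothetical): colour contrast weighted by nearness."""
    h, w = len(rgb), len(rgb[0])
    n = h * w

    # Mean colour of the whole image, one value per channel.
    mean = [sum(rgb[y][x][c] for y in range(h) for x in range(w)) / n
            for c in range(3)]

    # Colour cue: Euclidean distance of each pixel from the mean colour.
    contrast = [[sum((rgb[y][x][c] - mean[c]) ** 2 for c in range(3)) ** 0.5
                 for x in range(w)] for y in range(h)]

    # Depth cue: assume smaller depth = closer = more likely salient.
    d_min = min(min(row) for row in depth)
    d_max = max(max(row) for row in depth)
    d_range = (d_max - d_min) or 1.0  # avoid division by zero on flat depth
    nearness = [[1.0 - (depth[y][x] - d_min) / d_range
                 for x in range(w)] for y in range(h)]

    # Fuse the two cues multiplicatively, then rescale so the peak is 1.
    raw = [[contrast[y][x] * nearness[y][x] for x in range(w)]
           for y in range(h)]
    peak = max(max(row) for row in raw) or 1.0
    return [[v / peak for v in row] for row in raw]


# A 3x3 grey scene with one red pixel in the centre that is also closest.
grey, red = (0.5, 0.5, 0.5), (1.0, 0.0, 0.0)
rgb = [[grey, grey, grey], [grey, red, grey], [grey, grey, grey]]
depth = [[5.0, 5.0, 5.0], [5.0, 1.0, 5.0], [5.0, 5.0, 5.0]]
saliency = toy_rgbd_saliency(rgb, depth)
```

In this toy scene the centre pixel stands out in both colour and depth, so it receives the highest saliency value. The learned models listed below replace both hand-crafted cues with features extracted by neural networks.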
Benchmarks
These leaderboards are used to track progress in RGB-D Salient Object Detection.
Libraries
Use these libraries to find RGB-D Salient Object Detection models and implementations.
Latest papers
DFormer: Rethinking RGBD Representation Learning for Semantic Segmentation
We present DFormer, a novel RGB-D pretraining framework to learn transferable representations for RGB-D segmentation tasks.
Point-aware Interaction and CNN-induced Refinement Network for RGB-D Salient Object Detection
By integrating complementary information from RGB image and depth map, the ability of salient object detection (SOD) for complex and challenging scenes can be improved.
Mutual Information Regularization for Weakly-supervised RGB-D Salient Object Detection
In particular, following the principle of disentangled representation learning, we introduce a mutual information upper bound with a mutual information minimization regularizer to encourage the disentangled representation of each modality for salient object detection.
CIR-Net: Cross-modality Interaction and Refinement for RGB-D Salient Object Detection
Focusing on the issue of how to effectively capture and utilize cross-modality information in the RGB-D salient object detection (SOD) task, we present a convolutional neural network (CNN) model, named CIR-Net, based on novel cross-modality interaction and refinement.
Depth Quality-Inspired Feature Manipulation for Efficient RGB-D and Video Salient Object Detection
Inspired by the fact that depth quality is a key factor influencing the accuracy, we propose an efficient depth quality-inspired feature manipulation (DQFM) process, which can dynamically filter depth features according to depth quality.
SPSN: Superpixel Prototype Sampling Network for RGB-D Salient Object Detection
However, despite advances in deep learning-based methods, RGB-D SOD is still challenging due to the large domain gap between an RGB image and the depth map and low-quality depth maps.
TANet: Transformer-based Asymmetric Network for RGB-D Salient Object Detection
We employ the powerful feature extraction capability of Transformer (PVTv2) to extract global semantic information from RGB data and design a lightweight CNN backbone (LWDepthNet) to extract spatial structure information from depth data without pre-training.
Promoting Saliency From Depth: Deep Unsupervised RGB-D Saliency Detection
The laborious and time-consuming manual annotation has become a real bottleneck in various practical scenarios.
An Energy-Based Prior for Generative Saliency
We propose a novel generative saliency prediction framework that adopts an informative energy-based model as a prior distribution.
Joint Learning of Salient Object Detection, Depth Estimation and Contour Extraction
In this paper, we propose a novel multi-task and multi-modal filtered transformer (MMFT) network for RGB-D salient object detection (SOD).