Pyramid Grafting Network for One-Stage High Resolution Saliency Detection

Recent salient object detection (SOD) methods based on deep neural networks have achieved remarkable performance. However, most existing SOD models are designed for low-resolution input and perform poorly on high-resolution images due to the contradiction between the sampling depth and the receptive field size. To resolve this contradiction, we propose a novel one-stage framework called Pyramid Grafting Network (PGNet), which uses transformer and CNN backbones to extract features independently from images at different resolutions and then grafts the features from the transformer branch onto the CNN branch. An attention-based Cross-Model Grafting Module (CMGM) is proposed to enable the CNN branch to combine fragmented detail information more holistically, guided by features from a different source during the decoding process. Moreover, we design an Attention Guided Loss (AGL) that explicitly supervises the attention matrix generated by the CMGM, helping the network better interact with the attention from different models. We also contribute a new Ultra-High-Resolution Saliency Detection dataset, UHRSD, containing 5,920 images at 4K-8K resolutions. To our knowledge, it is the largest dataset in both quantity and resolution for the high-resolution SOD task, and it can be used for training and testing in future research. Extensive experiments on UHRSD and widely used SOD datasets demonstrate that our method achieves superior performance compared to state-of-the-art methods.
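
To make the grafting idea in the abstract concrete, here is a minimal sketch of generic scaled dot-product cross-attention in plain Python. It is not the paper's CMGM: the choice of taking queries from the CNN branch and keys/values from the transformer branch, the function name `cross_attention`, and the toy token values are all assumptions made for illustration. The returned attention matrix is the kind of quantity a loss such as AGL could supervise.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Generic scaled dot-product cross-attention (illustrative only).

    queries: list of query vectors (assumed here: CNN-branch tokens)
    keys, values: lists of vectors (assumed here: transformer-branch tokens)
    Returns (output tokens, attention matrix).
    """
    scale = 1.0 / math.sqrt(len(queries[0]))
    attn = []
    for q in queries:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        attn.append(softmax(scores))
    dim_v = len(values[0])
    out = [
        [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim_v)]
        for weights in attn
    ]
    return out, attn


# Hypothetical toy features: 3 CNN tokens attend over 2 transformer tokens.
cnn_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
transformer_tokens = [[1.0, 0.0], [0.0, 1.0]]
grafted, attention = cross_attention(cnn_tokens, transformer_tokens, transformer_tokens)
```

Each row of `attention` sums to 1, so every output token is a weighted mix of the transformer-branch tokens; a real module would add learned projections and reshape the result back to a feature map.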

Venue: CVPR 2022

Datasets


Introduced in the Paper:

UHRSD

Used in the Paper:

DUTS DUT-OMRON HRSOD DAVIS-S

Results from the Paper


Ranked #5 on RGB Salient Object Detection on UHRSD (using extra training data)

Task: RGB Salient Object Detection (all rows)

Dataset   Model                   Metric Name     Metric Value   Global Rank
DAVIS-S   PGNet (HRSOD, UHRSD)    S-measure       0.954          #5
                                  F-measure       0.956          #5
                                  mBA             0.730          #4
                                  MAE             0.010          #5
DAVIS-S   PGNet (DUTS, HRSOD)     S-measure       0.947          #6
                                  F-measure       0.948          #6
                                  mBA             0.716          #5
                                  MAE             0.012          #6
DAVIS-S   PGNet                   S-measure       0.935          #8
                                  F-measure       0.931          #9
                                  mBA             0.707          #7
                                  MAE             0.015          #9
HRSOD     PGNet                   S-Measure       0.930          #8
                                  max F-Measure   0.922          #8
                                  MAE             0.021          #7
                                  mBA             0.693          #6
HRSOD     PGNet (HRSOD, UHRSD)    S-Measure       0.938          #6
                                  max F-Measure   0.939          #5
                                  MAE             0.020          #5
                                  mBA             0.727          #4
HRSOD     PGNet (DUTS, HRSOD)     S-Measure       0.935          #7
                                  max F-Measure   0.929          #7
                                  MAE             0.020          #5
                                  mBA             0.714          #5
UHRSD     PGNet                   S-Measure       0.912          #7
                                  max F-Measure   0.914          #8
                                  MAE             0.037          #8
                                  mBA             0.715          #6
UHRSD     PGNet (DUTS, HRSOD)     S-Measure       0.912          #7
                                  max F-Measure   0.915          #7
                                  MAE             0.036          #7
                                  mBA             0.735          #5
UHRSD     PGNet (HRSOD, UHRSD)    S-Measure       0.935          #4
                                  max F-Measure   0.930          #5
                                  MAE             0.026          #3
                                  mBA             0.765          #3
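
For readers unfamiliar with the MAE rows in the results above, the following is a minimal sketch of how mean absolute error is commonly computed for a saliency map: the average per-pixel absolute difference between the prediction and the ground-truth mask, both scaled to [0, 1]. The function name `mean_absolute_error` and the toy 2x2 maps are hypothetical; the page itself does not specify the evaluation code.

```python
def mean_absolute_error(pred, gt):
    """Average per-pixel |pred - gt| for two equally shaped 2D maps in [0, 1]."""
    if len(pred) != len(gt) or any(len(p) != len(g) for p, g in zip(pred, gt)):
        raise ValueError("prediction and ground truth must have the same shape")
    total = sum(
        abs(p - g)
        for pred_row, gt_row in zip(pred, gt)
        for p, g in zip(pred_row, gt_row)
    )
    count = sum(len(row) for row in pred)
    return total / count


# Hypothetical toy example: a 2x2 predicted map against a binary mask.
pred = [[0.9, 0.1], [0.8, 0.0]]
gt = [[1.0, 0.0], [1.0, 0.0]]
mae = mean_absolute_error(pred, gt)  # approximately 0.1
```

Lower is better for MAE, whereas the S-measure, F-measure, and mBA rows are higher-is-better scores.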

Methods