A Simple Pooling-Based Design for Real-Time Salient Object Detection

We address salient object detection by investigating how to expand the role of pooling in convolutional neural networks. Building on the U-shape architecture, we first add a global guidance module (GGM) on top of the bottom-up pathway, which provides layers at different feature levels with the location information of potential salient objects. We further design a feature aggregation module (FAM) that fuses the coarse-level semantic information with the fine-level features from the top-down pathway. By adding FAMs after the fusion operations in the top-down pathway, coarse-level features from the GGM can be seamlessly merged with features at various scales. These two pooling-based modules allow the high-level semantic features to be progressively refined, yielding detail-enriched saliency maps. Experimental results show that the proposed approach locates salient objects more accurately and with sharper details, and hence substantially improves performance over previous state-of-the-art methods. The approach is also fast, running at more than 30 FPS on a $300 \times 400$ image. Code can be found at http://mmcheng.net/poolnet/.
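The abstract does not spell out the internals of the pooling-based modules, so the following is only a minimal sketch of one generic way pooling can mix coarse and fine information: average-pool a feature map at several rates, upsample each result back to the input size, and sum the branches with the input. The function names, the pooling rates (2, 4, 8), the nearest-neighbour upsampling, and the omission of any learned convolutions are all assumptions made for illustration, not the authors' implementation.

```python
def avg_pool(x, k):
    """Non-overlapping k x k average pooling on a 2D list.

    Assumes the height and width of x are divisible by k.
    """
    h, w = len(x), len(x[0])
    return [
        [
            sum(x[i + di][j + dj] for di in range(k) for dj in range(k)) / (k * k)
            for j in range(0, w, k)
        ]
        for i in range(0, h, k)
    ]


def upsample_nearest(x, k):
    """Nearest-neighbour upsampling: repeat each value k times along both axes."""
    return [[v for v in row for _ in range(k)] for row in x for _ in range(k)]


def aggregate(x, rates=(2, 4, 8)):
    """Sum the input with its pooled-then-upsampled versions (hypothetical rates)."""
    out = [row[:] for row in x]  # copy so the input is left unchanged
    for k in rates:
        branch = upsample_nearest(avg_pool(x, k), k)
        for i, row in enumerate(branch):
            for j, v in enumerate(row):
                out[i][j] += v
    return out


# Toy 8 x 8 feature map with a single active location in the top-left corner.
feature = [[0.0] * 8 for _ in range(8)]
feature[0][0] = 1.0
fused = aggregate(feature)
```

In this toy example the single active value spreads over progressively larger neighbourhoods as the pooling rate grows, which is the sense in which larger rates contribute coarser context. A real module would operate on multi-channel tensors and include learned layers.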

Published at CVPR 2019.

Results from the Paper

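| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| RGB Salient Object Detection | DUT-OMRON | PoolNet (VGG-16) | MAE | 0.053 | #7 |
| RGB Salient Object Detection | DUT-OMRON | PoolNet (VGG-16) | F-measure | 0.833 | #2 |
| RGB Salient Object Detection | DUTS-TE | PoolNet (VGG-16) | MAE | 0.036 | #10 |
| RGB Salient Object Detection | DUTS-TE | PoolNet (VGG-16) | F-measure | 0.892 | #5 |
| RGB Salient Object Detection | ECSSD | PoolNet (VGG-16) | MAE | 0.038 | #9 |
| RGB Salient Object Detection | ECSSD | PoolNet (VGG-16) | F-measure | 0.945 | #4 |
| RGB Salient Object Detection | HKU-IS | PoolNet (VGG-16) | MAE | 0.03 | #8 |
| RGB Salient Object Detection | HKU-IS | PoolNet (VGG-16) | F-measure | 0.935 | #4 |
| RGB Salient Object Detection | PASCAL-S | PoolNet (VGG-16) | MAE | 0.065 | #5 |
| RGB Salient Object Detection | PASCAL-S | PoolNet (VGG-16) | F-measure | 0.88 | #3 |
| RGB Salient Object Detection | SOD | PoolNet (VGG-16) | MAE | 0.102 | #1 |
| RGB Salient Object Detection | SOD | PoolNet (VGG-16) | F-measure | 0.882 | #1 |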
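
The results above report MAE (lower is better) and F-measure (higher is better), but the page does not say exactly how they are computed. The sketch below shows the usual definitions on flattened maps with values in [0, 1]. The weighting $\beta^2 = 0.3$ is the value conventionally used in salient object detection, and the single fixed threshold of 0.5 is an assumption: benchmarks often report the maximum or mean F-measure over many thresholds, and which variant this leaderboard uses is not stated here. The function names are illustrative.

```python
def mae(pred, gt):
    """Mean absolute error between a predicted saliency map and a binary
    ground-truth mask, both given as flat lists of values in [0, 1]."""
    return sum(abs(p - g) for p, g in zip(pred, gt)) / len(pred)


def f_measure(pred, gt, threshold=0.5, beta2=0.3):
    """F-measure of the thresholded prediction against a binary mask.

    threshold and beta2 are assumed defaults, not values taken from the page.
    """
    binary = [1 if p >= threshold else 0 for p in pred]
    true_pos = sum(b * g for b, g in zip(binary, gt))
    if true_pos == 0:
        return 0.0  # no overlap: precision and recall are both zero
    precision = true_pos / sum(binary)
    recall = true_pos / sum(gt)
    return (1 + beta2) * precision * recall / (beta2 * precision + recall)


# Four-pixel toy example: one of two predicted pixels is truly salient.
pred = [0.9, 0.8, 0.2, 0.1]
gt = [1, 0, 1, 0]
toy_mae = mae(pred, gt)
toy_f = f_measure(pred, gt)
```

Here precision and recall are both 0.5, so the F-measure is also 0.5 regardless of the weighting, while the MAE averages the per-pixel absolute differences to about 0.45.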