Strip Pooling Network

Introduced by Hou et al. in Strip Pooling: Rethinking Spatial Pooling for Scene Parsing

Spatial pooling usually operates on a small region, which limits its ability to capture long-range dependencies and to attend to distant regions. To overcome this, Hou et al. proposed strip pooling, a pooling method that encodes long-range context along either the horizontal or the vertical spatial dimension.

Strip pooling has two branches, one for horizontal and one for vertical strip pooling. The horizontal branch first pools the input feature $X \in \mathbb{R}^{C \times H \times W}$ along the horizontal direction, averaging over the width so that each row is reduced to one value per channel:

\begin{align} y^1 = \text{GAP}^w (X) \end{align}

A 1D convolution with kernel size 3 is then applied to $y^1$ to capture the relationship between different rows and channels. The result is repeated $W$ times so that the output $y_h$ has the same shape as the input:

\begin{align} y_h = \text{Expand}(\text{Conv1D}(y^1)) \end{align}

Vertical strip pooling is performed in a similar way and produces $y_v$. Finally, the outputs of the two branches are fused by element-wise summation to produce the attention map $s$, which rescales the input element-wise:

\begin{align} s &= \sigma(\text{Conv}^{1\times 1}(y_{v} + y_{h})) \end{align}

\begin{align} Y &= s \odot X \end{align}
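The steps above can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names (`conv1d_same`, `strip_pooling_attention`), the weight shapes, the zero padding, and the absence of biases and normalization layers are all choices made here for brevity. The input is a single feature map of shape (C, H, W) without a batch dimension.

```python
import numpy as np

def conv1d_same(y, weight):
    """Channel-mixing 1D cross-correlation with zero padding (output length = input length).

    y:      (C_in, L) strip descriptor
    weight: (C_out, C_in, K) kernel with odd K (hypothetical layout chosen for this sketch)
    """
    c_out, c_in, k = weight.shape
    pad = k // 2
    length = y.shape[1]
    y_pad = np.pad(y, ((0, 0), (pad, pad)))
    out = np.zeros((c_out, length))
    for t in range(k):
        # Tap t mixes channels and reads the strip shifted by t positions.
        out += weight[:, :, t] @ y_pad[:, t:t + length]
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def strip_pooling_attention(x, w_h, w_v, w_fuse):
    """Apply the two-branch strip pooling attention to x of shape (C, H, W).

    w_h, w_v: (C, C, 3) kernels for the horizontal and vertical branches (assumed shapes)
    w_fuse:   (C, C) weights of the 1x1 convolution (assumed shape)
    Returns the rescaled feature Y and the attention map s.
    """
    c, h, w = x.shape
    y1 = x.mean(axis=2)  # (C, H): average over the width, one value per row
    y2 = x.mean(axis=1)  # (C, W): average over the height, one value per column
    # Expand each strip back to (C, H, W) by repeating along the pooled axis.
    y_h = np.broadcast_to(conv1d_same(y1, w_h)[:, :, None], (c, h, w))
    y_v = np.broadcast_to(conv1d_same(y2, w_v)[:, None, :], (c, h, w))
    # A 1x1 convolution is a per-pixel linear map over channels.
    s = sigmoid(np.einsum("oc,chw->ohw", w_fuse, y_v + y_h))
    return s * x, s
```

Because each branch sees an entire row or column, every position in the attention map depends on all positions that share its row or its column, which is where the long-range context comes from.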

The strip pooling module (SPM) is further developed into the mixed pooling module (MPM). Both consider spatial and channel relationships to overcome the locality of convolutional neural networks. SPNet reported state-of-the-art results on several challenging semantic segmentation benchmarks at the time of publication.

Source: Strip Pooling: Rethinking Spatial Pooling for Scene Parsing

Tasks

Task	Papers	Share
Semantic Segmentation	2	28.57%
3D Object Detection	1	14.29%
Autonomous Driving	1	14.29%
Object Detection	1	14.29%
Retinal Vessel Segmentation	1	14.29%
Scene Parsing	1	14.29%

Categories

Attention Mechanisms