A Spatial Attention Module is a module for spatial attention in convolutional neural networks. It generates a spatial attention map by utilizing the inter-spatial relationship of features. Unlike channel attention, spatial attention focuses on *where* the informative part is, which is complementary to channel attention. To compute the spatial attention, we first apply average-pooling and max-pooling operations along the channel axis and concatenate them to generate an efficient feature descriptor. On the concatenated feature descriptor, we apply a convolution layer to generate a spatial attention map $\textbf{M}_{s}\left(F\right) \in \mathbb{R}^{H\times W}$ which encodes where to emphasize or suppress.
We aggregate channel information of a feature map by using two pooling operations, generating two 2D maps: $\mathbf{F}^{s}_{avg} \in \mathbb{R}^{1\times{H}\times{W}}$ and $\mathbf{F}^{s}_{max} \in \mathbb{R}^{1\times{H}\times{W}}$, which denote the average-pooled and max-pooled features across the channel axis, respectively. These are then concatenated and convolved by a standard convolution layer, producing the 2D spatial attention map. In short, the spatial attention is computed as:
$$ \textbf{M}_{s}\left(F\right) = \sigma\left(f^{7\times 7}\left(\left[\text{AvgPool}\left(F\right);\text{MaxPool}\left(F\right)\right]\right)\right) $$
$$ \textbf{M}_{s}\left(F\right) = \sigma\left(f^{7\times 7}\left(\left[\mathbf{F}^{s}_{avg};\mathbf{F}^{s}_{max} \right]\right)\right) $$
where $\sigma$ denotes the sigmoid function and $f^{7\times 7}$ represents a convolution operation with a filter size of 7 × 7.
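The computation above can be sketched in a few lines of PyTorch (the framework choice, the class name `SpatialAttention`, and the final step of multiplying the input by the attention map are illustrative assumptions, not part of the definition above):

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Minimal sketch of a spatial attention module (CBAM-style)."""

    def __init__(self, kernel_size: int = 7):
        super().__init__()
        # 2 input channels (avg + max maps) -> 1-channel attention map;
        # padding keeps the spatial size H x W unchanged.
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Pool along the channel axis: each map has shape (B, 1, H, W).
        avg_map = torch.mean(x, dim=1, keepdim=True)
        max_map, _ = torch.max(x, dim=1, keepdim=True)
        # Concatenate to (B, 2, H, W), convolve, squash to (0, 1).
        attn = self.sigmoid(self.conv(torch.cat([avg_map, max_map], dim=1)))
        # Broadcast the (B, 1, H, W) map over all channels of the input.
        return x * attn

x = torch.randn(2, 16, 32, 32)
out = SpatialAttention()(x)
print(out.shape)  # torch.Size([2, 16, 32, 32])
```

The attention map itself has a single channel, so applying it reweights every channel of the feature map at each spatial location by the same factor, which is exactly the "where to emphasize or suppress" behavior described above.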
Source: CBAM: Convolutional Block Attention Module

| Task | Papers | Share |
|---|---|---|
| Object Detection | 51 | 18.55% |
| Semantic Segmentation | 15 | 5.45% |
| Image Classification | 12 | 4.36% |
| Real-Time Object Detection | 8 | 2.91% |
| Classification | 5 | 1.82% |
| Super-Resolution | 5 | 1.82% |
| Domain Adaptation | 5 | 1.82% |
| Autonomous Driving | 5 | 1.82% |
| Image Super-Resolution | 4 | 1.45% |
| Component | Type |
|---|---|
| Average Pooling | Pooling Operations |
| Convolution | Convolutions |
| Max Pooling | Pooling Operations |
| Sigmoid Activation | Activation Functions |