A Channel Attention Module is a module for channel-based attention in convolutional neural networks. We produce a channel attention map by exploiting the inter-channel relationship of features. As each channel of a feature map is considered a feature detector, channel attention focuses on ‘what’ is meaningful given an input image. To compute the channel attention efficiently, we squeeze the spatial dimension of the input feature map.
We first aggregate spatial information of a feature map by using both average-pooling and max-pooling operations, generating two different spatial context descriptors: $\mathbf{F}^{c}_{avg}$ and $\mathbf{F}^{c}_{max}$, which denote average-pooled features and max-pooled features, respectively.
Both descriptors are then forwarded to a shared network to produce our channel attention map $\mathbf{M}_{c} \in \mathbb{R}^{C\times{1}\times{1}}$. Here $C$ is the number of channels. The shared network is composed of a multi-layer perceptron (MLP) with one hidden layer. To reduce parameter overhead, the hidden activation size is set to $\mathbb{R}^{C/r\times{1}\times{1}}$, where $r$ is the reduction ratio. After the shared network is applied to each descriptor, we merge the output feature vectors using element-wise summation. In short, the channel attention is computed as:
$$ \mathbf{M}_{c}\left(\mathbf{F}\right) = \sigma\left(\text{MLP}\left(\text{AvgPool}\left(\mathbf{F}\right)\right)+\text{MLP}\left(\text{MaxPool}\left(\mathbf{F}\right)\right)\right) $$
$$ \mathbf{M}_{c}\left(\mathbf{F}\right) = \sigma\left(\mathbf{W}_{1}\left(\mathbf{W}_{0}\left(\mathbf{F}^{c}_{avg}\right)\right) +\mathbf{W}_{1}\left(\mathbf{W}_{0}\left(\mathbf{F}^{c}_{max}\right)\right)\right) $$
where $\sigma$ denotes the sigmoid function, $\mathbf{W}_{0} \in \mathbb{R}^{C/r\times{C}}$, and $\mathbf{W}_{1} \in \mathbb{R}^{C\times{C/r}}$. Note that the MLP weights, $\mathbf{W}_{0}$ and $\mathbf{W}_{1}$, are shared for both inputs, and $\mathbf{W}_{0}$ is followed by the ReLU activation function.
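The computation above can be sketched in plain NumPy (a minimal illustration, not the authors' implementation; the function and variable names are our own, and the weights here are random rather than learned):

```python
import numpy as np

def channel_attention(F, W0, W1):
    """M_c(F) = sigmoid(MLP(AvgPool(F)) + MLP(MaxPool(F))).

    F  : feature map, shape (C, H, W)
    W0 : shape (C//r, C) -- shared hidden layer, followed by ReLU
    W1 : shape (C, C//r) -- shared output layer
    """
    f_avg = F.mean(axis=(1, 2))  # F^c_avg: spatial average pooling, shape (C,)
    f_max = F.max(axis=(1, 2))   # F^c_max: spatial max pooling, shape (C,)
    relu = lambda x: np.maximum(x, 0.0)
    mlp = lambda x: W1 @ relu(W0 @ x)  # same W0, W1 applied to both descriptors
    s = mlp(f_avg) + mlp(f_max)        # element-wise summation of the two branches
    return 1.0 / (1.0 + np.exp(-s))    # sigmoid -> attention weights in (0, 1)

# toy usage: C = 8 channels, reduction ratio r = 2
rng = np.random.default_rng(0)
C, r = 8, 2
F = rng.standard_normal((C, 16, 16))
W0 = rng.standard_normal((C // r, C))
W1 = rng.standard_normal((C, C // r))
Mc = channel_attention(F, W0, W1)
print(Mc.shape)  # (8,) -- one attention weight per channel
```

In a network, $\mathbf{M}_{c}$ would be broadcast back to shape $C\times H\times W$ and multiplied with the input feature map to rescale each channel.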
Note that the channel attention module with only average pooling is the same as the Squeeze-and-Excitation module.
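To make the correspondence concrete, here is a hedged NumPy sketch of the average-pooling-only variant (names are our own; weights are random stand-ins for learned parameters). Dropping the max-pooled branch reduces the module to the Squeeze-and-Excitation pattern of squeeze (global average pooling) followed by excitation (FC, ReLU, FC, sigmoid):

```python
import numpy as np

def se_channel_attention(F, W0, W1):
    """SE-style attention: only the average-pooled descriptor is used."""
    f_avg = F.mean(axis=(1, 2))           # squeeze: global average pooling, (C,)
    s = W1 @ np.maximum(W0 @ f_avg, 0.0)  # excitation: FC -> ReLU -> FC
    return 1.0 / (1.0 + np.exp(-s))       # sigmoid gating weights per channel

# toy usage: C = 8, r = 2
rng = np.random.default_rng(1)
C, r = 8, 2
F = rng.standard_normal((C, 12, 12))
W0 = rng.standard_normal((C // r, C))
W1 = rng.standard_normal((C, C // r))
w = se_channel_attention(F, W0, W1)
```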
Source: CBAM: Convolutional Block Attention Module

| Task | Papers | Share |
|---|---|---|
| Semantic Segmentation | 7 | 7.45% |
| Image Classification | 7 | 7.45% |
| Classification | 4 | 4.26% |
| General Classification | 4 | 4.26% |
| Lesion Segmentation | 3 | 3.19% |
| Image Segmentation | 3 | 3.19% |
| Super-Resolution | 3 | 3.19% |
| Denoising | 2 | 2.13% |
| Anomaly Detection | 2 | 2.13% |
| Component | Type |
|---|---|
| Average Pooling | Pooling Operations |
| Dense Connections | Feedforward Networks |
| Max Pooling | Pooling Operations |
| ReLU | Activation Functions |
| Sigmoid Activation | Activation Functions |