Spatial Attention Module (ThunderNet) Explained

**Spatial Attention Module (SAM)** is a feature extraction module for object detection used in [ThunderNet](https://paperswithcode.com/method/thundernet).

The ThunderNet SAM explicitly re-weights the feature map over the spatial dimensions before RoI warping. The key idea of SAM is to use the knowledge from [RPN](https://paperswithcode.com/method/rpn) to refine the feature distribution of the feature map. RPN is trained to recognize foreground regions under the supervision of ground truths, so its intermediate features can be used to distinguish foreground features from background features. SAM accepts two inputs: the intermediate feature map from RPN, $\mathcal{F}^{RPN}$, and the thin feature map from the [Context Enhancement Module](https://paperswithcode.com/method/context-enhancement-module), $\mathcal{F}^{CEM}$. The output of SAM, $\mathcal{F}^{SAM}$, is defined as:

$$ \mathcal{F}^{SAM} = \mathcal{F}^{CEM} * \text{sigmoid}\left(\theta\left(\mathcal{F}^{RPN}\right)\right) $$

Here $\theta\left(\cdot\right)$ is a dimension transformation that matches the number of channels in the two feature maps. The sigmoid function constrains the values to $\left[0, 1\right]$. Finally, $\mathcal{F}^{CEM}$ is re-weighted by the generated feature map for a better feature distribution. For computational efficiency, a 1×1 [convolution](https://paperswithcode.com/method/convolution) is used as $\theta\left(\cdot\right)$, so the computational cost of SAM is negligible. The figure to the right shows the structure of SAM.
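The re-weighting above can be sketched in a few lines of NumPy. This is a minimal illustration, not the ThunderNet implementation: the function name `spatial_attention`, the tensor shapes, and the random inputs are all assumptions made for the example, and the product in the formula is taken to be element-wise. A 1×1 convolution is written as a per-pixel linear map over channels, and the batch normalization listed among the components is omitted for brevity.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, squashing values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def spatial_attention(f_cem, f_rpn, weight, bias):
    """Hypothetical SAM sketch.

    f_cem:  (C_cem, H, W) thin feature map from CEM
    f_rpn:  (C_rpn, H, W) intermediate feature map from RPN
    weight: (C_cem, C_rpn) kernel of the 1x1 convolution theta
    bias:   (C_cem,) bias of theta
    """
    # A 1x1 convolution mixes channels independently at every pixel,
    # so it can be expressed as a matrix product over the channel axis.
    theta = np.einsum("oc,chw->ohw", weight, f_rpn) + bias[:, None, None]
    attention = sigmoid(theta)
    # Element-wise re-weighting of the CEM features.
    return f_cem * attention


# Toy shapes chosen only for illustration.
rng = np.random.default_rng(0)
f_cem = rng.standard_normal((5, 4, 4))
f_rpn = rng.standard_normal((8, 4, 4))
weight = rng.standard_normal((5, 8)) * 0.1
bias = np.zeros(5)

f_sam = spatial_attention(f_cem, f_rpn, weight, bias)
```

Because the attention values lie between 0 and 1, every entry of `f_sam` has a magnitude no larger than the corresponding entry of `f_cem`: the module can only attenuate features, and it attenuates background positions more strongly when the RPN features indicate so.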

SAM has two functions. The first is to refine the feature distribution by strengthening foreground features and suppressing background features. The second is to stabilize the training of RPN, since SAM enables extra gradient flow from the [R-CNN](https://paperswithcode.com/method/r-cnn) subnet to RPN. As a result, RPN receives additional supervision from the R-CNN subnet, which helps its training.
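To make the extra gradient path concrete, here is a short derivation sketch. It assumes the product in the definition of $\mathcal{F}^{SAM}$ is element-wise (written $\odot$ below), and it introduces two symbols that are not part of the original description: $A = \text{sigmoid}\left(\theta\left(\mathcal{F}^{RPN}\right)\right)$ for the attention map and $\mathcal{L}$ for the loss of the R-CNN subnet. Applying the chain rule, together with the sigmoid derivative $A \odot (1 - A)$, gives:

$$ \frac{\partial \mathcal{L}}{\partial \mathcal{F}^{CEM}} = \frac{\partial \mathcal{L}}{\partial \mathcal{F}^{SAM}} \odot A, \qquad \frac{\partial \mathcal{L}}{\partial \theta\left(\mathcal{F}^{RPN}\right)} = \frac{\partial \mathcal{L}}{\partial \mathcal{F}^{SAM}} \odot \mathcal{F}^{CEM} \odot A \odot \left(1 - A\right) $$

The second expression is generally non-zero, so the R-CNN loss propagates through $\theta$ into the intermediate RPN features. Under these assumptions, this is the path by which RPN receives the additional supervision described above.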

Collection: Feature Extractors

| Task | Papers | Share |
| --- | --- | --- |
| Mixed Reality | 2 | 33.33% |
| Semantic Segmentation | 2 | 33.33% |
| Object Detection | 2 | 33.33% |

| Component | Type |
| --- | --- |
| 1x1 Convolution | Convolutions |
| Batch Normalization | Normalization |
| Sigmoid Activation | Activation Functions |
