Image Model Blocks

Spatial Feature Transform, or SFT, is a layer that generates affine transformation parameters for spatial-wise feature modulation, and was originally proposed within the context of image super-resolution. A Spatial Feature Transform (SFT) layer learns a mapping function $\mathcal{M}$ that outputs a modulation parameter pair $(\mathbf{\gamma}, \mathbf{\beta})$ based on some prior condition $\Psi$. The learned parameter pair adaptively influences the outputs by applying an affine transformation spatially to each intermediate feature maps in an SR network. During testing, only a single forward pass is needed to generate the HR image given the LR input and segmentation probability maps.

More precisely, the prior $\Psi$ is modeled by a pair of affine transformation parameters $(\mathbf{\gamma}, \mathbf{\beta})$ through a mapping function $\mathcal{M}: \Psi \mapsto(\mathbf{\gamma}, \mathbf{\beta})$. Consequently,

$$ \hat{\mathbf{y}}=G_{\mathbf{\theta}}(\mathbf{x} \mid \mathbf{\gamma}, \mathbf{\beta}), \quad(\mathbf{\gamma}, \mathbf{\beta})=\mathcal{M}(\Psi) $$

After obtaining $(\mathbf{\gamma}, \mathbf{\beta})$ from conditions, the transformation is carried out by scaling and shifting feature maps of a specific layer:

$$ \operatorname{SFT}(\mathbf{F} \mid \mathbf{\gamma}, \mathbf{\beta})=\mathbf{\gamma} \odot \mathbf{F}+\mathbf{\beta} $$

where $\mathbf{F}$ denotes the feature maps, whose dimension is the same as $\gamma$ and $\mathbf{\beta}$, and $\odot$ is referred to element-wise multiplication, i.e., Hadamard product. Since the spatial dimensions are preserved, the SFT layer not only performs feature-wise manipulation but also spatial-wise transformation.

Source: Recovering Realistic Texture in Image Super-resolution by Deep Spatial Feature Transform

Papers


Paper Code Results Date Stars

Components


Component Type
🤖 No Components Found You can add them if they exist; e.g. Mask R-CNN uses RoIAlign

Categories