Image Model Blocks

Dimension-wise Fusion

Introduced by Mehta et al. in DiCENet: Dimension-wise Convolutions for Efficient Networks

Dimension-wise Fusion is an image model block that captures global information by combining features globally. It is an efficient alternative to point-wise convolution. A point-wise convolutional layer applies $D$ point-wise kernels $\mathbf{k}_p \in \mathbb{R}^{3D \times 1 \times 1}$, performing $3D^2HW$ operations to combine the dimension-wise representations $\mathbf{Y_{Dim}} \in \mathbb{R}^{3D \times H \times W}$ into an output $\mathbf{Y} \in \mathbb{R}^{D \times H \times W}$. This is computationally expensive. Dimension-wise fusion combines the representations of $\mathbf{Y_{Dim}}$ more efficiently: as illustrated in the figure in the paper, it factorizes the point-wise convolution into two steps: (1) local fusion and (2) global fusion.
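The two-step factorization can be sketched in a minimal NumPy mock-up. Here the local-fusion step is written as a grouped point-wise convolution with $D$ groups, so each output channel mixes only the three dimension-wise channels of its group (roughly $3DHW$ operations instead of $3D^2HW$); the global-fusion step is written as a squeeze-and-excitation-style gate (global average pooling followed by two small fully connected layers), which is an assumed stand-in for the paper's exact global step. Function names and weight shapes are illustrative, not from the paper.

```python
import numpy as np

def local_fusion(y_dim, w_local):
    # Grouped point-wise convolution with groups = D: each output channel
    # combines only the 3 dimension-wise channels of its own group,
    # costing about 3*D*H*W multiply-adds instead of 3*D^2*H*W.
    # y_dim: (3D, H, W), w_local: (D, 3) -> returns (D, H, W)
    three_d, h, w = y_dim.shape
    d = three_d // 3
    groups = y_dim.reshape(d, 3, h, w)
    return np.einsum('dkhw,dk->dhw', groups, w_local)

def global_fusion(y, w1, w2):
    # Assumed SE-style global step: pool over the spatial dimensions,
    # pass through two small FC layers, and gate the channels.
    s = y.mean(axis=(1, 2))               # global average pool, (D,)
    z = np.maximum(w1 @ s, 0.0)           # FC + ReLU, (D//r,)
    g = 1.0 / (1.0 + np.exp(-(w2 @ z)))   # FC + sigmoid gate, (D,)
    return y * g[:, None, None]

D, H, W = 8, 4, 4
rng = np.random.default_rng(0)
y_dim = rng.standard_normal((3 * D, H, W))          # dimension-wise features
y_local = local_fusion(y_dim, rng.standard_normal((D, 3)))
y_out = global_fusion(y_local,
                      rng.standard_normal((D // 2, D)),
                      rng.standard_normal((D, D // 2)))
print(y_out.shape)  # (8, 4, 4)
```

The key saving comes from the local step: because the mixing weights are shared per group rather than dense across all $3D$ input channels, its cost grows linearly rather than quadratically in $D$, and the cheap global gate restores cross-channel interaction.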

