Dimension-wise Convolution

Introduced by Mehta et al. in DiCENet: Dimension-wise Convolutions for Efficient Networks

A Dimension-wise Convolution, or DimConv, is a type of convolution that encodes depth-wise, width-wise, and height-wise information independently. To achieve this, DimConv extends depthwise convolutions to all dimensions of the input tensor $X \in \mathbb{R}^{D\times{H}\times{W}}$, where $W$, $H$, and $D$ correspond to the width, height, and depth of $X$. DimConv has three branches, one per dimension. These branches apply $D$ depth-wise convolutional kernels $k_{D} \in \mathbb{R}^{1\times{n}\times{n}}$ along depth, $W$ width-wise convolutional kernels $k_{W} \in \mathbb{R}^{n\times{n}\times{1}}$ along width, and $H$ height-wise convolutional kernels $k_{H} \in \mathbb{R}^{n\times{1}\times{n}}$ along height to produce outputs $Y_{D}$, $Y_{W}$, and $Y_{H} \in \mathbb{R}^{D\times{H}\times{W}}$ that encode information from all dimensions of the input tensor. The outputs of these independent branches are concatenated along the depth dimension in interleaved order, such that the first spatial planes of $Y_{D}$, $Y_{W}$, and $Y_{H}$ are placed together, followed by the second planes, and so on, to produce the output $Y_{Dim} = \{Y_{D}, Y_{W}, Y_{H}\} \in \mathbb{R}^{3D\times{H}\times{W}}$.
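The three branches and the interleaved concatenation can be sketched in NumPy. This is a minimal illustration, not the authors' implementation: it assumes a single unbatched input, stride 1, zero "same" padding, an odd kernel size $n$, and cross-correlation (the usual deep-learning convention). The function names `depthwise_conv2d` and `dimconv` are hypothetical, and each kernel bank is stored with its singleton axis dropped, so $k_{D}$, $k_{H}$, and $k_{W}$ have shapes `(D, n, n)`, `(H, n, n)`, and `(W, n, n)`.

```python
import numpy as np


def depthwise_conv2d(x, kernels):
    """Apply one n x n kernel per leading-axis slice of x (assumed helper).

    x:       array of shape (C, A, B)
    kernels: array of shape (C, n, n), n odd
    Returns an array of shape (C, A, B) (zero padding, stride 1).
    """
    c, a, b = x.shape
    n = kernels.shape[-1]
    p = n // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))  # pad only the two filtered axes
    out = np.zeros((c, a, b), dtype=float)
    for i in range(n):
        for j in range(n):
            # Each slice along axis 0 is weighted by its own kernel entry.
            out += kernels[:, i, j][:, None, None] * xp[:, i:i + a, j:j + b]
    return out


def dimconv(x, k_d, k_h, k_w):
    """Sketch of DimConv for x of shape (D, H, W); returns shape (3D, H, W)."""
    d, h, w = x.shape
    # Depth branch: one kernel per depth slice, filtering over (H, W).
    y_d = depthwise_conv2d(x, k_d)
    # Height branch: move H to the front, filter over (D, W), move it back.
    y_h = depthwise_conv2d(x.transpose(1, 0, 2), k_h).transpose(1, 0, 2)
    # Width branch: move W to the front, filter over (D, H), move it back.
    y_w = depthwise_conv2d(x.transpose(2, 0, 1), k_w).transpose(1, 2, 0)
    # Interleave along depth: Y_D[0], Y_W[0], Y_H[0], Y_D[1], Y_W[1], ...
    return np.stack([y_d, y_w, y_h], axis=1).reshape(3 * d, h, w)


# Usage with random data (shapes chosen arbitrarily for illustration).
rng = np.random.default_rng(0)
D, H, W, n = 2, 3, 4, 3
x = rng.standard_normal((D, H, W))
y = dimconv(
    x,
    rng.standard_normal((D, n, n)),
    rng.standard_normal((H, n, n)),
    rng.standard_normal((W, n, n)),
)
print(y.shape)  # (6, 3, 4)
```

Transposing the input so that the target dimension leads lets a single depthwise routine serve all three branches. Stacking on a new axis before reshaping is what produces the plane-by-plane interleaving described above, rather than three contiguous blocks.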

Source: DiCENet: Dimension-wise Convolutions for Efficient Networks
