The extremely low computational cost of lightweight CNNs constrains the depth and width of the networks, which limits their representational power. To address this problem, Chen et al. proposed dynamic convolution, an operator design developed in parallel with CondConv. It increases representational power at negligible additional computational cost and without changing the width or depth of the network.
Dynamic convolution uses $K$ parallel convolution kernels of the same size and input/output dimensions instead of one kernel per layer. Like SE blocks, it adopts a squeeze-and-excitation mechanism to generate the attention weights for the different convolution kernels. These kernels are then aggregated dynamically by weighted summation and applied to the input feature map $X$:
\begin{align} s & = \text{softmax} (W_{2} \delta (W_{1}\text{GAP}(X))) \end{align}
\begin{align} \text{DyConv} &= \sum_{k=1}^{K} s_k \text{Conv}_k \end{align}
\begin{align} Y &= \text{DyConv}(X) \end{align}
Here $\text{GAP}$ denotes global average pooling, $W_{1}$ and $W_{2}$ are the weights of two fully connected layers, $\delta$ is the nonlinearity between them (ReLU in SE blocks), and $s_k$ is the attention weight of the $k$-th kernel. The $K$ convolutions are combined by taking the weighted sum of their kernel weights and biases, so only a single convolution is applied to $X$.
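The three equations above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: it assumes $\delta$ is ReLU, omits biases in the two fully connected layers (as the equation for $s$ does), and uses a naive stride-1, unpadded convolution. The names `dynamic_conv`, `conv2d`, `w1`, `w2` and the tensor sizes are hypothetical and chosen only for illustration.

```python
import numpy as np

def softmax(z):
    z = z - z.max()                      # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def conv2d(x, w, b):
    """Naive stride-1, unpadded convolution (cross-correlation).
    x: (C_in, H, W), w: (C_out, C_in, k, k), b: (C_out,)."""
    c_out, _, k, _ = w.shape
    h_out, w_out = x.shape[1] - k + 1, x.shape[2] - k + 1
    y = np.zeros((c_out, h_out, w_out))
    for i in range(h_out):
        for j in range(w_out):
            patch = x[:, i:i + k, j:j + k]                # (C_in, k, k)
            y[:, i, j] = np.tensordot(w, patch, axes=3) + b
    return y

def dynamic_conv(x, kernels, biases, w1, w2):
    """kernels: (K, C_out, C_in, k, k), biases: (K, C_out),
    w1: (hidden, C_in), w2: (K, hidden). All shapes are assumptions."""
    gap = x.mean(axis=(1, 2))                    # GAP(X): (C_in,)
    hidden = np.maximum(w1 @ gap, 0.0)           # delta assumed to be ReLU
    s = softmax(w2 @ hidden)                     # attention over K kernels
    w_agg = np.tensordot(s, kernels, axes=1)     # sum_k s_k * W_k
    b_agg = s @ biases                           # sum_k s_k * b_k
    return conv2d(x, w_agg, b_agg), s            # one convolution on X

# Illustrative sizes only.
rng = np.random.default_rng(0)
K, c_in, c_out, k, hidden_dim = 4, 8, 16, 3, 2
x = rng.normal(size=(c_in, 10, 10))
kernels = rng.normal(size=(K, c_out, c_in, k, k))
biases = rng.normal(size=(K, c_out))
w1 = rng.normal(size=(hidden_dim, c_in))
w2 = rng.normal(size=(K, hidden_dim))

y, s = dynamic_conv(x, kernels, biases, w1, w2)

# Reference: run all K convolutions, then take the weighted sum of outputs.
y_ref = sum(s[i] * conv2d(x, kernels[i], biases[i]) for i in range(K))
```

Because convolution is linear in its weights and bias, `y` and `y_ref` agree up to floating-point error. Aggregating the kernels first therefore gives the same result as running all $K$ convolutions, at roughly the cost of one.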
Compared with the convolution itself, the computational cost of the squeeze-and-excitation branch and the weighted summation is very low. Dynamic convolution thus provides an efficient way to improve representational power and can readily be used as a replacement for any convolution.
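A rough multiply-add count illustrates the cost claim. The sketch below is a back-of-the-envelope estimate under stated assumptions, not a figure from the paper: the layer sizes are made up, the attention branch is assumed to use a hidden width of `c_in // 4`, and pooling is counted as one operation per input element. The function name `dynamic_conv_overhead` is hypothetical.

```python
def dynamic_conv_overhead(h, w, c_in, c_out, k, num_kernels, hidden):
    """Return (conv_cost, overhead) in approximate multiply-adds."""
    # Cost of the single convolution (stride 1, 'same' output size assumed).
    conv_cost = h * w * c_in * c_out * k * k
    # Attention branch: global average pooling plus two FC layers.
    gap_cost = h * w * c_in
    fc_cost = c_in * hidden + hidden * num_kernels
    # Weighted summation of the K kernels and their biases.
    agg_cost = num_kernels * (c_in * c_out * k * k + c_out)
    return conv_cost, gap_cost + fc_cost + agg_cost

# Illustrative layer: 56x56 feature map, 64 -> 64 channels, 3x3 kernel, K = 4.
conv_cost, overhead = dynamic_conv_overhead(
    h=56, w=56, c_in=64, c_out=64, k=3, num_kernels=4, hidden=64 // 4
)
ratio = overhead / conv_cost
print(f"convolution: {conv_cost} multiply-adds")
print(f"attention + aggregation: {overhead} multiply-adds ({ratio:.2%} extra)")
```

For these assumed sizes the overhead stays well under 1% of the convolution cost. The aggregation term does not depend on the spatial size, so its relative share grows for layers with small feature maps.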
Source: Dynamic Convolution: Attention over Convolution Kernels

Task  Papers  Share 

Semantic Segmentation  11  9.02% 
Object Detection  8  6.56% 
Sound Event Detection  7  5.74% 
Instance Segmentation  5  4.10% 
Panoptic Segmentation  5  4.10% 
Image Classification  5  4.10% 
Depth Estimation  4  3.28% 
Decoder  3  2.46% 
Autonomous Driving  3  2.46% 