Deformable Convolutional Networks

Introduced by Dai et al. in Deformable Convolutional Networks

Deformable ConvNets do not learn an affine transformation. Instead, they view convolution as two steps: first sampling features on a regular grid $ \mathcal{R} $ from the input feature map $X$, then aggregating the sampled features by a weighted sum using the convolution kernel $w$. For each output location $p_{0}$, the process can be written as: \begin{align} Y(p_{0}) &= \sum_{p_i \in \mathcal{R}} w(p_{i}) X(p_{0} + p_{i}) \end{align} where, for a $3 \times 3$ kernel, \begin{align} \mathcal{R} &= \{(-1,-1), (-1, 0), \dots, (1, 1)\} \end{align} Deformable convolution augments the sampling step with a set of learnable offsets $\Delta p_{i}$, one per grid point, which can be generated by a lightweight CNN. With the offsets $\Delta p_{i}$, deformable convolution can be formulated as: \begin{align} Y(p_{0}) &= \sum_{p_i \in \mathcal{R}} w(p_{i}) X(p_{0} + p_{i} + \Delta p_{i}). \end{align} This makes the sampling adaptive. However, $\Delta p_{i}$ is typically fractional, so the sampling location $p_{0} + p_{i} + \Delta p_{i}$ generally does not fall on the integer grid of the feature map. To address this, the value of $X$ at each fractional location is computed by bilinear interpolation. The paper also introduces deformable RoI pooling, which further improves object detection.
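The sampling-with-offsets formulation above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: it assumes a single-channel input, a 3×3 kernel, stride 1, and zero padding, and the function names (`bilinear_sample`, `deform_conv2d_single`) and the offset layout `(H, W, 9, 2)` are hypothetical choices made for readability. In a real network the offsets would be predicted by a small convolutional branch; here they are simply passed in.

```python
import numpy as np

# Regular 3x3 sampling grid R, as (row, col) displacements from the centre p0.
GRID = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def bilinear_sample(x, py, px):
    """Sample the 2-D array x at a fractional location (py, px).

    Locations outside the feature map contribute zero (zero padding).
    """
    h, w = x.shape
    y0, x0 = int(np.floor(py)), int(np.floor(px))
    value = 0.0
    # Only the four integer neighbours of (py, px) have non-zero weight.
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                wy = max(0.0, 1.0 - abs(py - yy))
                wx = max(0.0, 1.0 - abs(px - xx))
                value += wy * wx * x[yy, xx]
    return value


def deform_conv2d_single(x, weight, offsets):
    """Deformable 3x3 convolution on one channel (illustrative sketch).

    x:       (H, W) input feature map X
    weight:  (3, 3) kernel w, indexed by grid point p_i
    offsets: (H, W, 9, 2) offsets Delta p_i as (dy, dx) for every output
             location p0 and every grid point (assumed layout)
    Returns Y with shape (H, W).
    """
    h, w = x.shape
    y = np.zeros((h, w), dtype=float)
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for i, (dy, dx) in enumerate(GRID):
                oy, ox = offsets[r, c, i]
                # Y(p0) += w(p_i) * X(p0 + p_i + Delta p_i)
                acc += weight[dy + 1, dx + 1] * bilinear_sample(
                    x, r + dy + oy, c + dx + ox
                )
            y[r, c] = acc
    return y
```

With all offsets set to zero, the sketch reduces to an ordinary zero-padded 3×3 convolution (in the cross-correlation form used by the equations above), which is a convenient sanity check. The explicit Python loops are chosen for clarity and would be far too slow for real feature maps.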

Deformable ConvNets adaptively select important regions and enlarge the effective receptive field of convolutional neural networks, which is important in object detection and semantic segmentation tasks.

Source: Deformable Convolutional Networks

Tasks

Task	Papers	Share
Object Detection	2	33.33%
Semantic Segmentation	2	33.33%
Instance Segmentation	1	16.67%
Vessel Detection	1	16.67%

Categories

Attention Mechanisms