Image Data Augmentation


Introduced by Chen et al. in GridMask Data Augmentation

GridMask is a data augmentation method that removes some pixels of an input image. Unlike other methods, the region it removes is neither a single continuous region nor a set of randomly chosen pixels as in dropout. Instead, the algorithm removes a region made up of disconnected pixel sets, as shown in the Figure.

We express the setting as

$$ \tilde{\mathbf{x}}=\mathbf{x} \times M $$

where $\mathbf{x} \in \mathbb{R}^{H \times W \times C}$ represents the input image, $M \in \{0,1\}^{H \times W}$ is the binary mask that stores the pixels to be removed, and $\tilde{\mathbf{x}} \in \mathbb{R}^{H \times W \times C}$ is the result produced by the algorithm. The product is taken element-wise, with the same mask applied to every channel. For the binary mask $M$, if $M_{i, j}=1$ we keep pixel $(i, j)$ in the input image; otherwise we remove it. GridMask is applied after the image normalization operation.
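The masking step can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation; the function name `apply_mask` is hypothetical, and "removing" a pixel is taken to mean setting it to zero in every channel.

```python
import numpy as np

def apply_mask(x: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Multiply an H x W x C image by an H x W binary mask, element-wise."""
    if x.ndim != 3 or x.shape[:2] != mask.shape:
        raise ValueError("expected an H x W x C image and an H x W mask")
    # Add a trailing axis so the mask broadcasts across the channel dimension.
    return x * mask[:, :, None].astype(x.dtype)

# Usage: zero out the top-left pixel of a tiny all-ones image.
image = np.ones((2, 2, 3), dtype=np.float32)
mask = np.array([[0, 1],
                 [1, 1]], dtype=np.uint8)
masked = apply_mask(image, mask)
```

Since the mask is applied after normalization, a zeroed pixel corresponds to the per-channel mean of the normalized data rather than to black in the raw image.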

The shape of $M$ looks like a grid, as shown in the Figure. Four numbers $\left(r, d, \delta_{x}, \delta_{y}\right)$ are used to represent a unique $M$. Every mask is formed by tiling the units. $r$ is the ratio of the shorter gray edge in a unit, where gray marks the kept region. $d$ is the length of one unit. $\delta_{x}$ and $\delta_{y}$ are the distances between the first intact unit and the boundary of the image.
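One way to build such a mask from $(r, d, \delta_{x}, \delta_{y})$ is sketched below. The layout inside a unit is an assumption made for illustration: along each axis the first $r \cdot d$ pixels of a unit are kept, and the square where the remaining pixels of both axes overlap is removed. The function name `grid_mask` and its signature are hypothetical, and the random sampling of the four parameters is left out.

```python
import numpy as np

def grid_mask(height, width, r, d, delta_x=0, delta_y=0):
    """Return an H x W mask of 0s and 1s tiled from d x d units.

    Assumed layout: within each unit, the square covering the last
    d - round(r * d) pixels along both axes is removed (0); the rest
    of the unit is kept (1).
    """
    keep_len = int(round(r * d))
    # Position of every row and column inside its unit, shifted so the
    # first intact unit starts at (delta_y, delta_x).
    row_pos = (np.arange(height) - delta_y) % d
    col_pos = (np.arange(width) - delta_x) % d
    # A pixel is removed only when both its row and its column fall in
    # the removed part of the unit.
    removed = (row_pos >= keep_len)[:, None] & (col_pos >= keep_len)[None, :]
    return (~removed).astype(np.uint8)

# Usage: an 8 x 8 mask made of four 4 x 4 units, each with a 2 x 2 hole.
mask = grid_mask(8, 8, r=0.5, d=4)
```

Under this layout, when $d$ divides both image dimensions the fraction of kept pixels is $1-(1-r)^{2}$, so a larger $r$ keeps more of the image.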

Source: GridMask Data Augmentation




| Task | Papers | Share |
|---|---|---|
| Semantic Segmentation | 2 | 33.33% |
| Instance Segmentation | 1 | 16.67% |
| Test | 1 | 16.67% |
| Object Detection | 1 | 16.67% |
| Reinforcement Learning (RL) | 1 | 16.67% |

