AutoDropout automates the design of dropout patterns using a Transformer-based controller. The controller learns to generate a dropout pattern at every channel and layer of a target network, such as a ConvNet or a Transformer. The target network is then trained with the generated dropout pattern, and its validation performance serves as the signal from which the controller learns. In a ConvNet, each pattern is applied to a convolutional output channel, a common building block of image recognition models.
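The feedback loop described above can be sketched as a tiny policy-gradient search. This is only an illustration under stated assumptions: the text does not say which update rule the controller uses, so a REINFORCE-style update with a moving-average baseline is assumed here, the "controller" is reduced to three logits over three candidate patterns, and `toy_validation_score` is a hypothetical stand-in for training the target network and measuring its validation performance.

```python
import math
import random


def softmax(logits):
    """Convert logits to probabilities in a numerically stable way."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_validation_score(choice):
    # Hypothetical stand-in: in the real setting this would train the target
    # network with the chosen dropout pattern and return validation accuracy.
    return [0.2, 0.9, 0.5][choice]


def search(steps=2000, lr=0.1, seed=0):
    """Assumed REINFORCE-style loop over three candidate patterns."""
    rng = random.Random(seed)
    logits = [0.0, 0.0, 0.0]   # the whole "controller" in this sketch
    baseline = 0.0             # moving average of rewards, reduces variance
    for _ in range(steps):
        probs = softmax(logits)
        choice = rng.choices(range(3), weights=probs)[0]
        reward = toy_validation_score(choice)
        advantage = reward - baseline
        baseline = 0.9 * baseline + 0.1 * reward
        # Gradient of log softmax probability w.r.t. logit k: one_hot - probs.
        for k in range(3):
            grad = (1.0 if k == choice else 0.0) - probs[k]
            logits[k] += lr * advantage * grad
    return softmax(logits)


if __name__ == "__main__":
    print(search())
```

In this toy setting the probability mass should drift toward the candidate with the highest score, which mirrors how validation performance steers the controller toward better patterns.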
The controller network generates tokens that describe the configuration of each dropout pattern, one token at a time, like words in a language model. For every layer in a ConvNet, a group of 8 tokens is generated sequentially to define one dropout pattern. In the figure above, size, stride, and repeat indicate the size and the tiling of the pattern; rotate, shear_x, and shear_y specify the geometric transformations of the pattern; share_c is a binary flag deciding whether the pattern is applied to all $C$ channels; and residual is a binary flag deciding whether the pattern is also applied to the residual branch. If $L$ dropout patterns are needed, the controller generates $8L$ decisions.
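To make the token layout concrete, the sketch below emits the 8 tokens per layer in the order named above and builds a simple tiled mask from size, stride, and repeat. It rests on several assumptions: a uniform random sampler stands in for the learned controller, the value ranges in `CHOICES` are hypothetical, and `tile_mask` uses one simplified reading of the tiling tokens (square blocks of side `size`, separated by gaps of `stride`, repeated `repeat` times per axis) while ignoring rotate, shear_x, and shear_y.

```python
import random

# Token names in the order given in the text.
TOKEN_NAMES = ("size", "stride", "repeat", "rotate",
               "shear_x", "shear_y", "share_c", "residual")

# Hypothetical value ranges, chosen only for illustration.
CHOICES = {
    "size": (2, 4, 8),
    "stride": (0, 2, 4),
    "repeat": (1, 2, 4),
    "rotate": (0, 15, 30, 45),
    "shear_x": (0.0, 0.05, 0.1),
    "shear_y": (0.0, 0.05, 0.1),
    "share_c": (0, 1),
    "residual": (0, 1),
}


def sample_patterns(num_layers, seed=0):
    """Return 8 * num_layers (name, value) decisions, generated in order."""
    rng = random.Random(seed)  # random sampler standing in for the controller
    decisions = []
    for _ in range(num_layers):
        for name in TOKEN_NAMES:
            decisions.append((name, rng.choice(CHOICES[name])))
    return decisions


def tile_mask(height, width, size, stride, repeat):
    """Build a 0/1 mask where 0 marks dropped positions (assumed tiling)."""
    mask = [[1] * width for _ in range(height)]
    step = size + stride  # distance between top-left corners of blocks
    for i in range(repeat):
        for j in range(repeat):
            top, left = i * step, j * step
            for y in range(top, min(top + size, height)):
                for x in range(left, min(left + size, width)):
                    mask[y][x] = 0
    return mask


if __name__ == "__main__":
    decisions = sample_patterns(num_layers=3)
    print(len(decisions))  # 8 decisions per layer
    for row in tile_mask(8, 8, size=2, stride=2, repeat=2):
        print("".join(str(v) for v in row))
```

With $L = 3$ layers the sampler returns $8 \times 3 = 24$ decisions, matching the $8L$ count in the text.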
Source: AutoDropout: Learning Dropout Patterns to Regularize Deep Networks

Task | Papers | Share |
---|---|---|
Image Classification | 1 | 33.33% |
Language Modelling | 1 | 33.33% |
Machine Translation | 1 | 33.33% |