Activation Regularization

Introduced by Merity et al. in Revisiting Activation Regularization for Language RNNs

Activation Regularization (AR), or $L_{2}$ activation regularization, is regularization performed on activations as opposed to weights. It is usually used in conjunction with RNNs. It is defined as:

$$\alpha L_{2}\left(m \circ h_{t}\right)$$

where $m$ is the dropout mask used by later parts of the model, $L_{2}$ is the $L_{2}$ norm, $h_{t}$ is the output of the RNN at timestep $t$, and $\alpha$ is a scaling coefficient.

When applied to the output of a dense layer, AR penalizes activations that stray far from zero, encouraging the network to keep its activations small.
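To make the penalty concrete, here is a minimal PyTorch sketch of an AR term computed on the dropout-masked output of an LSTM. It assumes one common way of realizing the $L_{2}$ penalty in practice, the mean of squared activations; the coefficient `alpha`, the dropout rate, and the tensor dimensions are illustrative choices, not values from the paper.

```python
import torch
import torch.nn as nn

# Minimal sketch of Activation Regularization (AR) on an LSTM output.
# `alpha`, the dropout rate, and the toy dimensions are illustrative
# choices, not values prescribed by the paper.
alpha = 2.0                                   # scaling coefficient alpha
rnn = nn.LSTM(input_size=32, hidden_size=64, batch_first=True)
drop = nn.Dropout(p=0.4)                      # supplies the dropout mask m

x = torch.randn(8, 20, 32)                    # (batch, time, features)
h, _ = rnn(x)                                 # h_t for every timestep t
h_masked = drop(h)                            # m ∘ h_t

# AR term: alpha times an L2 penalty on the masked activations, here the
# mean of squared activations. During training this is added to the task
# loss (e.g. cross-entropy) before calling backward().
ar_loss = alpha * h_masked.pow(2).mean()
print(ar_loss)
```

In a full model, `h_masked` would also be the tensor that later layers consume, so the regularizer sees exactly the activations the rest of the network uses, and units dropped for a given example are not penalized on that step.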

Source: Revisiting Activation Regularization for Language RNNs

Tasks

| Task | Papers | Share |
| --- | --- | --- |
| Language Modelling | 20 | 13.61% |
| Language Modeling | 18 | 12.24% |
| Text Classification | 15 | 10.20% |
| General Classification | 14 | 9.52% |
| Sentiment Analysis | 9 | 6.12% |
| Classification | 8 | 5.44% |
| Language Identification | 4 | 2.72% |
| Translation | 4 | 2.72% |
| Decision Making | 3 | 2.04% |

Categories

Regularization