Regularization

Activation Regularization

Introduced by Merity et al. in Revisiting Activation Regularization for Language RNNs

Activation Regularization (AR), or $L_{2}$ activation regularization, is regularization performed on activations rather than weights. It is typically used in conjunction with RNNs. It is defined as:

$$\alpha \, L_{2}\left(m \circ h_{t}\right)$$

where $m$ is the dropout mask applied by later parts of the model, $L_{2}$ is the $L_{2}$ norm, $h_{t}$ is the output of the RNN at timestep $t$, and $\alpha$ is a scaling coefficient.

When applied to the output of a dense layer, AR penalizes activations that are far from zero, encouraging them to remain small.
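As a concrete illustration, here is a minimal PyTorch sketch of the AR term added to a training loss. The shapes, the `alpha` value, and the dropout rate are illustrative assumptions, not values from the paper; note that common implementations (e.g. AWD-LSTM) use the mean of squared activations as the $L_{2}$ penalty.

```python
import torch
import torch.nn as nn

# Illustrative hyperparameters (not from the paper).
alpha = 2.0  # AR scaling coefficient

lstm = nn.LSTM(input_size=64, hidden_size=128, batch_first=True)
dropout = nn.Dropout(p=0.5)

x = torch.randn(32, 20, 64)   # (batch, seq_len, features)
h, _ = lstm(x)                # h_t: raw RNN outputs at each timestep
h_dropped = dropout(h)        # m ∘ h_t: the same dropped activations
                              # that downstream layers consume

# AR term: alpha times the (mean) squared L2 penalty on the
# dropped-out activations.
ar_loss = alpha * h_dropped.pow(2).mean()

# The term is simply added to the main objective:
# total_loss = task_loss + ar_loss
```

Because the penalty is computed on $m \circ h_{t}$ rather than $h_{t}$, only the activations actually used by later layers are regularized.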

Source: Revisiting Activation Regularization for Language RNNs
