# Activation Regularization

Introduced by Merity et al. in *Revisiting Activation Regularization for Language RNNs*.

Activation Regularization (AR), or $L_{2}$ activation regularization, is regularization performed on activations as opposed to weights. It is usually used in conjunction with RNNs. It is defined as:

$$\alpha{L}_{2}\left(m\circ{h_{t}}\right)$$

where $m$ is a dropout mask used by later parts of the model, $L_{2}$ is the $L_{2}$ norm, $h_{t}$ is the output of the RNN at timestep $t$, and $\alpha$ is a scaling coefficient.

When applied to the output of a dense layer, AR penalizes activations that deviate substantially from 0, encouraging them to remain small.
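The penalty above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name and the default value of `alpha` are assumptions, and the dropout mask is taken as given (it would normally come from the dropout applied after the RNN layer).

```python
import numpy as np

def activation_regularization(h_t, dropout_mask, alpha=1.0):
    """AR penalty: alpha * L2-norm of the masked activations,
    i.e. alpha * ||m ∘ h_t||_2, added to the training loss.

    h_t          -- RNN output at timestep t (1-D array of activations)
    dropout_mask -- the mask m reused from the model's dropout layer
    alpha        -- scaling coefficient (default here is illustrative)
    """
    return alpha * np.linalg.norm(dropout_mask * h_t)

# Example: two active units with activations 3 and 4; the masked
# activation vector has L2 norm 5, so the penalty is alpha * 5.
h = np.array([3.0, 4.0, -7.0])
m = np.array([1.0, 1.0, 0.0])  # third unit dropped out
penalty = activation_regularization(h, m, alpha=1.0)  # → 5.0
```

In training, this scalar is simply added to the task loss, so gradients push the (non-dropped) activations toward zero.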
