ASGD Weight-Dropped LSTM (AWD-LSTM) is a recurrent neural network that applies DropConnect to the hidden-to-hidden weight matrices for regularization and is optimized with NT-ASGD (non-monotonically triggered averaged SGD), which switches from plain SGD to returning an average of the weights from recent iterations once validation performance stops improving. Additional regularization techniques include variable-length backpropagation-through-time sequences, variational dropout, embedding dropout, weight tying, independent embedding and hidden sizes, activation regularization, and temporal activation regularization.
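The two core ideas can be sketched in plain Python. This is a minimal illustration, not the reference implementation: the function names are hypothetical, the weight matrix is a plain list of lists rather than an LSTM's hidden-to-hidden tensor, and the rescaling by `1/(1-p)` follows the inverted-dropout convention.

```python
import random

def dropconnect(weight_rows, p=0.5, seed=None):
    """DropConnect: zero individual *weights* (not activations) with
    probability p. In AWD-LSTM this mask is applied to the LSTM's
    hidden-to-hidden matrices before each forward pass.
    weight_rows: list of lists of floats (a toy weight matrix)."""
    rng = random.Random(seed)
    return [[0.0 if rng.random() < p else w / (1.0 - p) for w in row]
            for row in weight_rows]

def nt_asgd_triggered(val_losses, current_loss, n=5):
    """Non-monotonic trigger for switching from SGD to averaged SGD:
    fire when the current validation loss is worse than the best loss
    recorded more than n checks ago, i.e. there has been no new best
    within the last n validation checks."""
    return len(val_losses) > n and current_loss > min(val_losses[:-n])
```

Once `nt_asgd_triggered` fires, the optimizer keeps training but reports the running average of the weight iterates from the trigger point onward, rather than the final iterate.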
Source: Regularizing and Optimizing LSTM Language Models
Task | Papers | Share |
---|---|---|
Language Modelling | 19 | 17.12% |
General Classification | 14 | 12.61% |
Text Classification | 13 | 11.71% |
Classification | 8 | 7.21% |
Sentiment Analysis | 8 | 7.21% |
Language Identification | 4 | 3.60% |
Translation | 4 | 3.60% |
Hate Speech Detection | 3 | 2.70% |
Sentence | 3 | 2.70% |