ASGD Weight-Dropped LSTM, or AWD-LSTM, is a type of recurrent neural network that employs DropConnect on the hidden-to-hidden weights for regularization and NT-ASGD (non-monotonically triggered averaged SGD) for optimization. NT-ASGD returns an average of the weights from the last iterations rather than the final iterate alone. Additional regularization techniques include variable-length backpropagation sequences, variational dropout, embedding dropout, weight tying, independent embedding and hidden sizes, activation regularization, and temporal activation regularization.
Source: Regularizing and Optimizing LSTM Language Models

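The NT-ASGD idea mentioned in the description can be illustrated with a small sketch. It assumes a validation metric (lower is better) is recorded at regular checks, and that averaging starts once the current value is worse than the best value seen more than `n` checks earlier. The function names `nt_asgd_trigger` and `average_iterates`, and the parameter `n`, are hypothetical names chosen for this illustration; this is a simplified pure-Python sketch, not the authors' implementation.

```python
from typing import List, Optional, Sequence


def nt_asgd_trigger(val_metrics: Sequence[float], n: int = 5) -> Optional[int]:
    """Return the index of the validation check at which averaging would
    start, or None if the trigger never fires (hypothetical helper)."""
    logs: List[float] = []
    for t, v in enumerate(val_metrics):
        # Non-monotonic criterion: fire when the current metric is worse
        # than the best value recorded more than n checks ago.
        if t > n and v > min(logs[: t - n]):
            return t
        logs.append(v)
    return None


def average_iterates(iterates: Sequence[Sequence[float]], start: int) -> List[float]:
    """Average weight vectors from index `start` to the end (hypothetical helper)."""
    tail = iterates[start:]
    return [sum(column) / len(tail) for column in zip(*tail)]


# Toy validation curve: improves, then plateaus and drifts upward.
val_curve = [100, 90, 85, 83, 82, 81.5, 81.4, 81.6, 81.7]
trigger = nt_asgd_trigger(val_curve, n=2)

# Toy weight history: average only the iterates from index 1 onward.
weights = [[0.0, 2.0], [2.0, 4.0], [4.0, 6.0]]
averaged = average_iterates(weights, start=1)
```

With this toy curve the trigger fires at the first check whose value exceeds the best value from more than two checks before it. A steadily improving curve never fires the trigger, in which case training would simply continue with plain SGD.
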
Task | Papers | Share |
---|---|---|
Language Modelling | 18 | 17.14% |
General Classification | 14 | 13.33% |
Text Classification | 12 | 11.43% |
Sentiment Analysis | 8 | 7.62% |
Classification | 7 | 6.67% |
Language Identification | 4 | 3.81% |
Test | 4 | 3.81% |
Hate Speech Detection | 3 | 2.86% |
Machine Translation | 3 | 2.86% |