ASGD Weight-Dropped LSTM

Introduced by Merity et al. in Regularizing and Optimizing LSTM Language Models

ASGD Weight-Dropped LSTM, or AWD-LSTM, is a recurrent neural network that applies DropConnect to its hidden-to-hidden weight matrices for regularization and is trained with NT-ASGD (non-monotonically triggered averaged SGD), which switches from plain SGD to returning an average of the most recent weight iterates once the validation metric stops improving. Additional regularization techniques include variable-length backpropagation sequences, variational dropout, embedding dropout, weight tying, independent embedding and hidden sizes, activation regularization, and temporal activation regularization.
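
The two core ideas can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the names drop_connect, NTASGD, patience, and check_validation are hypothetical, weights are plain Python lists, and starting the average at the iterate where the trigger fires is an assumption made here for simplicity.

```python
import random


def drop_connect(weights, p, rng):
    """DropConnect: zero each weight independently with probability p.

    Surviving weights are rescaled by 1 / (1 - p) so the expected value of
    every entry is unchanged. In AWD-LSTM the mask would be applied to the
    hidden-to-hidden matrices; here `weights` is any list of rows.
    """
    keep = 1.0 - p
    return [[w / keep if rng.random() >= p else 0.0 for w in row]
            for row in weights]


class NTASGD:
    """Plain SGD until a validation-based trigger fires, then averaged SGD."""

    def __init__(self, params, lr, patience):
        self.params = list(params)
        self.lr = lr
        self.patience = patience  # hypothetical name for the non-monotone interval
        self.val_history = []
        self.triggered = False
        self.avg = None
        self.n_avg = 0

    def step(self, grads):
        # Ordinary SGD update.
        self.params = [w - self.lr * g for w, g in zip(self.params, grads)]
        if self.triggered:
            # Running mean of the iterates seen since the trigger.
            self.n_avg += 1
            self.avg = [a + (w - a) / self.n_avg
                        for a, w in zip(self.avg, self.params)]

    def check_validation(self, val_loss):
        h = self.val_history
        # Trigger when the new loss is worse than the best loss recorded
        # more than `patience` checks ago.
        older = h[:len(h) - self.patience]
        if not self.triggered and older and val_loss > min(older):
            self.triggered = True
            self.avg = list(self.params)  # assumption: average starts here
            self.n_avg = 1
        h.append(val_loss)

    def final_params(self):
        # Averaged weights if the trigger fired, otherwise the last iterate.
        return self.avg if self.triggered else self.params
```

In a training loop under these assumptions, step would be called once per minibatch and check_validation once per evaluation pass, and final_params would supply the weights used at test time.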

Source: Regularizing and Optimizing LSTM Language Models


Task                     Papers  Share
Language Modelling       18      17.14%
General Classification   14      13.33%
Text Classification      12      11.43%
Sentiment Analysis       8       7.62%
Classification           7       6.67%
Language Identification  4       3.81%
Test                     4       3.81%
Hate Speech Detection    3       2.86%
Machine Translation      3       2.86%