ASGD Weight-Dropped LSTM

Introduced by Merity et al. in Regularizing and Optimizing LSTM Language Models

ASGD Weight-Dropped LSTM, or AWD-LSTM, is a type of recurrent neural network that employs DropConnect on the hidden-to-hidden weights for regularization and NT-ASGD (non-monotonically triggered averaged SGD) for optimization. NT-ASGD returns an average of the weights over the last iterations rather than the final weights alone. Additional regularization techniques include variable-length backpropagation sequences, variational dropout, embedding dropout, weight tying, independent embedding and hidden sizes, activation regularization, and temporal activation regularization.
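To make the "non-monotonically triggered" part of NT-ASGD concrete, here is a minimal pure-Python sketch of the trigger test and the weight averaging that follows it. It is not the authors' implementation. The names `should_trigger_averaging`, `find_trigger_step`, and `RunningAverage`, along with the default non-monotone interval `n=5`, are assumptions chosen for illustration. The idea: plain SGD runs until a new validation loss is worse than the best loss recorded more than `n` checks ago, and from that point on the weights are averaged.

```python
from typing import List, Optional


def should_trigger_averaging(history: List[float], new_loss: float, n: int = 5) -> bool:
    """Return True when averaging should start.

    `history` holds the validation losses from earlier checks. The trigger
    fires once more than `n` checks exist and the new loss is worse than the
    best loss among all checks except the most recent `n`.
    """
    t = len(history)
    return t > n and new_loss > min(history[: t - n])


def find_trigger_step(losses: List[float], n: int = 5) -> Optional[int]:
    """Replay a sequence of validation losses and return the index of the
    check at which averaging would begin, or None if it never triggers."""
    history: List[float] = []
    for step, loss in enumerate(losses):
        if should_trigger_averaging(history, loss, n):
            return step
        history.append(loss)
    return None


class RunningAverage:
    """Incremental mean of weight vectors seen after the trigger
    (weights are flat lists of floats in this toy sketch)."""

    def __init__(self, dim: int) -> None:
        self.count = 0
        self.mean = [0.0] * dim

    def update(self, weights: List[float]) -> None:
        self.count += 1
        self.mean = [m + (w - m) / self.count for m, w in zip(self.mean, weights)]


if __name__ == "__main__":
    # Hypothetical validation losses: improvement stalls after the third check.
    losses = [5.0, 4.0, 3.0, 3.1, 3.2, 3.05, 3.3]
    print(find_trigger_step(losses, n=2))

    avg = RunningAverage(dim=2)
    avg.update([1.0, 2.0])
    avg.update([3.0, 4.0])
    print(avg.mean)
```

In a real training loop the averaged weights, not the last iterate, would be the ones returned at the end; the non-monotone interval keeps a single noisy validation check from starting the averaging too early.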

Source: Regularizing and Optimizing LSTM Language Models

