Fraternal Dropout

ICLR 2018. Konrad Zolna, Devansh Arpit, Dendi Suhubdy, Yoshua Bengio

Recurrent neural networks (RNNs) are an important class of neural network architectures, useful for language modeling and sequential prediction. However, optimizing RNNs is known to be harder than optimizing feed-forward neural networks...
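The paper's core idea is to train two copies of the model that share parameters but sample different dropout masks, and to add a penalty on the squared difference of their predictions, encouraging representations that are invariant to the dropout mask. A minimal NumPy sketch of that penalty is below; the shapes, the hidden states `h`, the output weights `W`, and the coefficient `kappa` are all hypothetical stand-ins, not the paper's actual model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical hidden states (batch x hidden) and output weights (hidden x vocab).
h = rng.standard_normal((4, 8))
W = rng.standard_normal((8, 3))

# Two independent inverted-dropout masks with keep probability p.
p = 0.5
mask1 = (rng.random(h.shape) < p) / p
mask2 = (rng.random(h.shape) < p) / p

# Predictions from the two shared-parameter copies, each with its own mask.
z1 = (h * mask1) @ W
z2 = (h * mask2) @ W

# Fraternal dropout penalty: squared distance between the two predictions,
# added (scaled by kappa) to the usual training loss of each copy.
kappa = 0.1
fraternal_penalty = kappa * np.mean((z1 - z2) ** 2)
```

In training, this penalty is added to the standard objective, so the network is rewarded for making the same prediction regardless of which units were dropped.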


Results from the Paper


| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Language Modelling | Penn Treebank (Word Level) | AWD-LSTM 3-layer with Fraternal dropout | Validation perplexity | 58.9 | #20 |
| | | | Test perplexity | 56.8 | #25 |
| | | | Params | 24M | #7 |
| Language Modelling | WikiText-2 | AWD-LSTM 3-layer with Fraternal dropout | Validation perplexity | 66.8 | #16 |
| | | | Test perplexity | 64.1 | #17 |
| | | | Number of params | 34M | #7 |

Methods used in the Paper