TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Language Modelling	Penn Treebank (Word Level)	Gal & Ghahramani (2016) - Variational LSTM (large)	Validation perplexity	77.9	# 28
Language Modelling	Penn Treebank (Word Level)	Gal & Ghahramani (2016) - Variational LSTM (large)	Test perplexity	75.2	# 35
Language Modelling	Penn Treebank (Word Level)	Gal & Ghahramani (2016) - Variational LSTM (medium)	Validation perplexity	81.9	# 29
Language Modelling	Penn Treebank (Word Level)	Gal & Ghahramani (2016) - Variational LSTM (medium)	Test perplexity	79.7	# 38

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/a-theoretically-grounded-application-of/language-modelling-on-penn-treebank-word)](https://paperswithcode.com/sota/language-modelling-on-penn-treebank-word?p=a-theoretically-grounded-application-of)`

A Theoretically Grounded Application of Dropout in Recurrent Neural Networks

NeurIPS 2016 · Yarin Gal, Zoubin Ghahramani

Recurrent neural networks (RNNs) stand at the forefront of many recent developments in deep learning. Yet a major difficulty with these models is their tendency to overfit, with dropout shown to fail when applied to recurrent layers. Recent results at the intersection of Bayesian modelling and deep learning offer a Bayesian interpretation of common deep learning techniques such as dropout. This grounding of dropout in approximate Bayesian inference suggests an extension of the theoretical results, offering insights into the use of dropout with RNN models. We apply this new variational inference based dropout technique in LSTM and GRU models, assessing it on language modelling and sentiment analysis tasks. The new approach outperforms existing techniques, and to the best of our knowledge improves on the single model state-of-the-art in language modelling with the Penn Treebank (73.4 test perplexity). This extends our arsenal of variational tools in deep learning.
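The abstract does not spell out the mechanism, but the core idea in the paper is that one dropout mask is sampled per sequence and reused at every time step, rather than resampled at each step. The following is a minimal sketch of that mask sharing only, in plain Python on toy data. The function names, the inverted-dropout scaling by 1/(1-p), and the restriction to input vectors are assumptions made for illustration; the paper also applies shared masks to the recurrent connections, and this is not the authors' implementation.

```python
import random

def sample_mask(size, drop_prob, rng):
    """Sample one mask: 0.0 for dropped units, 1/(1-p) for kept units (assumed scaling)."""
    keep = 1.0 - drop_prob
    return [(1.0 / keep) if rng.random() < keep else 0.0 for _ in range(size)]

def apply_mask(vec, mask):
    """Element-wise product of a feature vector and a mask."""
    return [v * m for v, m in zip(vec, mask)]

def per_step_dropout(sequence, drop_prob, rng):
    """Naive variant: a fresh mask is drawn at every time step."""
    return [apply_mask(x, sample_mask(len(x), drop_prob, rng)) for x in sequence]

def variational_dropout(sequence, drop_prob, rng):
    """Variational variant: one mask per sequence, reused at every time step."""
    mask = sample_mask(len(sequence[0]), drop_prob, rng)
    return [apply_mask(x, mask) for x in sequence], mask

rng = random.Random(0)
sequence = [[1.0] * 8 for _ in range(5)]  # toy data: 5 time steps, 8 features
dropped, mask = variational_dropout(sequence, drop_prob=0.25, rng=rng)
```

Because the toy inputs are all ones, every row of `dropped` equals `mask`, which makes it easy to see that the same units are dropped at each step.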


Code


HKUST-KnowComp/R-Net

581

martin-gorner/tensorflow-rnn-shakes…

538

yaringal/BayesianRNN

375

jiahuei/COMIC-Towards-A-Compact-Ima…

jiahuei/COMIC-Compact-Image-Caption…

See all 15 implementations

Tasks


Bayesian Inference

Language Modelling

Sentiment Analysis

Variational Inference

Datasets

Penn Treebank

Results from the Paper


Ranked #35 on Language Modelling on Penn Treebank (Word Level)


Task	Dataset	Model	Metric Name	Metric Value	Global Rank
Language Modelling	Penn Treebank (Word Level)	Gal & Ghahramani (2016) - Variational LSTM (large)	Validation perplexity	77.9	# 28
Language Modelling	Penn Treebank (Word Level)	Gal & Ghahramani (2016) - Variational LSTM (large)	Test perplexity	75.2	# 35
Language Modelling	Penn Treebank (Word Level)	Gal & Ghahramani (2016) - Variational LSTM (medium)	Validation perplexity	81.9	# 29
Language Modelling	Penn Treebank (Word Level)	Gal & Ghahramani (2016) - Variational LSTM (medium)	Test perplexity	79.7	# 38
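For readers unfamiliar with the metric in the table above, word-level perplexity is conventionally the exponential of the average negative log-likelihood per word, so lower is better. The sketch below shows that standard definition; the helper names are hypothetical, and nothing here reproduces the paper's evaluation code.

```python
import math

def perplexity(log_probs):
    """Perplexity from the natural-log probabilities assigned to each target word."""
    return math.exp(-sum(log_probs) / len(log_probs))

def nats_per_word(ppl):
    """Average negative log-likelihood (in nats) implied by a perplexity value."""
    return math.log(ppl)

# A model that assigns probability 0.25 to every target word has perplexity 4.
uniform_ppl = perplexity([math.log(0.25)] * 4)

# Average per-word loss implied by the reported 75.2 test perplexity.
large_lstm_loss = nats_per_word(75.2)
```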

Methods


Dropout • Embedding Dropout • GRU • LSTM • Sigmoid Activation • Tanh Activation • Variational Dropout
