TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Language Modelling	WikiText-2	Melis et al. (2017) - 1-layer LSTM (tied)	Validation perplexity	69.3	# 24
Language Modelling	WikiText-2	Melis et al. (2017) - 1-layer LSTM (tied)	Test perplexity	65.9	# 32
Language Modelling	WikiText-2	Melis et al. (2017) - 1-layer LSTM (tied)	Number of params	24M	# 27

On the State of the Art of Evaluation in Neural Language Models

ICLR 2018 · Gábor Melis, Chris Dyer, Phil Blunsom

Ongoing innovations in recurrent neural network architectures have provided a steady influx of apparently state-of-the-art results on language modelling benchmarks. However, these have been evaluated using differing code bases and limited computational resources, which represent uncontrolled sources of experimental variation. We reevaluate several popular architectures and regularisation methods with large-scale automatic black-box hyperparameter tuning and arrive at the somewhat surprising conclusion that standard LSTM architectures, when properly regularised, outperform more recent models. We establish a new state of the art on the Penn Treebank and Wikitext-2 corpora, as well as strong baselines on the Hutter Prize dataset.

Code

deepmind/lamb

137

Tasks

Language Modelling

Datasets

Penn Treebank

WikiText-2

Results from the Paper

Ranked #32 on Language Modelling on WikiText-2

