Advancing State of the Art in Language Modeling

28 Nov 2023 · David Herel, Tomas Mikolov

Generalization is arguably the most important goal of statistical language modeling research. Publicly available benchmarks and papers published with open-source code have been critical to advancing the field. However, it is often very difficult, and sometimes impossible, to fully reproduce the results reported in publications. In this paper, we propose a simple framework that should help advance the state of the art in language modeling in terms of generalization. We propose publishing not just the code, but also the model's probabilities on the dev and test sets, so that a new model can easily be added to an ensemble. This has a crucial advantage: it becomes much easier to determine whether a newly proposed model is actually complementary to the current baseline. Instead of inventing new names for old tricks, the scientific community can therefore advance faster. Finally, this approach promotes diversity of ideas: one does not need to build a new state-of-the-art model single-handedly to attract attention; it is sufficient to develop a model that learns patterns the other models do not. Thus, even a suboptimal model can prove valuable. Remarkably, our approach has yielded new state-of-the-art results on several language modeling benchmarks, improving on prior results by up to 10%.
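
The proposal is concrete enough to sketch in a few lines. Below is a minimal illustration, assuming each published file contains one probability per line (the probability a model assigned to the correct token at each position of the shared dev and test sets); the file names, model names, and exact file format are hypothetical, not from the paper. Interpolation weights for the ensemble are fit on the dev set with standard EM updates for mixture weights, and the resulting mixture is then scored on the test set:

    # Minimal sketch (hypothetical file names and format): each "*.probs" file
    # is assumed to hold one probability per line -- the probability a model
    # assigned to the correct token at each position of the evaluation set.
    import numpy as np

    def load_probs(path):
        return np.loadtxt(path)          # shape (N,), one value per token

    def perplexity(p):
        return float(np.exp(-np.mean(np.log(p))))

    def tune_weights(dev_probs, n_iter=200):
        """Fit linear-interpolation weights on dev likelihood via EM updates."""
        K, _ = dev_probs.shape
        lam = np.full(K, 1.0 / K)        # start from a uniform mixture
        for _ in range(n_iter):
            mix = lam @ dev_probs                    # ensemble prob per token
            resp = lam[:, None] * dev_probs / mix    # each model's responsibility
            lam = resp.mean(axis=1)                  # re-estimate; stays normalized
        return lam

    models = ["awd_lstm", "transformer_xl", "kn5"]   # hypothetical model names
    dev = np.stack([load_probs(f"{m}.dev.probs") for m in models])
    test = np.stack([load_probs(f"{m}.test.probs") for m in models])

    lam = tune_weights(dev)
    print("interpolation weights:", lam)
    print("dev perplexity:", perplexity(lam @ dev))
    print("test perplexity:", perplexity(lam @ test))

A candidate model only helps the ensemble if it receives non-negligible weight, so this procedure also gives a direct test of the complementarity described in the abstract: a model can lower the ensemble's perplexity without being the best individual model.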


Results from the Paper


Task                 Dataset                     Model            Metric                 Value   Global Rank
Language Modelling   Penn Treebank (Word Level)  Ensemble of All  Validation perplexity  48.92   #9
Language Modelling   Penn Treebank (Word Level)  Ensemble of All  Test perplexity        47.31   #10
Language Modelling   WikiText-103                Ensemble of All  Validation perplexity  13.11   #1
Language Modelling   WikiText-103                Ensemble of All  Test perplexity        13.29   #7
Language Modelling   WikiText-2                  Ensemble of All  Validation perplexity  55.4    #14
Language Modelling   WikiText-2                  Ensemble of All  Test perplexity        53.73   #21
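
For reference, the perplexity figures above follow the standard corpus-level definition, computed over the $N$ tokens of the evaluation set from the same per-token probabilities the paper proposes to publish:

$$\mathrm{PPL} = \exp\!\left(-\frac{1}{N}\sum_{i=1}^{N}\log p(w_i \mid w_{<i})\right)$$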
