Language Modelling

1210 papers with code • 15 benchmarks • 117 datasets

Language modeling is the task of predicting the next word or character in a document.

(Image credit: Exploring the Limits of Language Modeling)
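The task above is usually framed as estimating P(next token | preceding tokens) and training the model to assign high probability to observed text. A minimal sketch with a toy count-based bigram estimator (the corpus and function names are illustrative; the papers below use neural estimators such as LSTMs and Transformers instead of counts):

```python
# Minimal sketch of the language-modelling task: estimate P(next word | history)
# with a count-based bigram model. Toy example only.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# Count bigram occurrences: counts[w][w_next]
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_word_distribution(word):
    """Return P(next word | word) as a dict, estimated from counts."""
    total = sum(counts[word].values())
    return {w: c / total for w, c in counts[word].items()}

print(next_word_distribution("the"))   # {'cat': 0.25, 'mat': 0.25, 'dog': 0.25, 'rug': 0.25}
```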

Greatest papers with code

ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators

tensorflow/models ICLR 2020

Then, instead of training a model that predicts the original identities of the corrupted tokens, we train a discriminative model that predicts whether each token in the corrupted input was replaced by a generator sample or not.

Language Modelling Natural Language Understanding +2
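The ELECTRA excerpt above describes replaced-token detection: a per-token binary classification over the corrupted input. A minimal sketch, assuming a stand-in discriminator (the example sentence, scores, and function names are made up for illustration, not taken from the paper's code):

```python
# Sketch of ELECTRA's replaced-token-detection objective: the discriminator
# scores every token of the corrupted input and is trained with a per-token
# binary cross-entropy against "was this token replaced?" labels.
import math

original  = ["the", "chef", "cooked", "the", "meal"]
corrupted = ["the", "chef", "ate",    "the", "meal"]   # generator replaced "cooked"

# Label 1.0 where the generator replaced the original token, 0.0 elsewhere.
labels = [float(o != c) for o, c in zip(original, corrupted)]

def discriminator_prob(token, position):
    """Hypothetical discriminator: P(token at this position was replaced)."""
    return 0.9 if token == "ate" else 0.1   # placeholder scores

# Per-token binary cross-entropy, averaged over all input positions.
loss = -sum(
    y * math.log(p) + (1 - y) * math.log(1 - p)
    for y, p in ((labels[i], discriminator_prob(t, i)) for i, t in enumerate(corrupted))
) / len(corrupted)
print(round(loss, 3))
```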

Neural Architecture Search with Reinforcement Learning

tensorflow/models 5 Nov 2016

Our cell achieves a test set perplexity of 62.4 on the Penn Treebank, which is 3.6 perplexity better than the previous state-of-the-art model.

Image Classification Language Modelling +2
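Perplexity, the metric quoted for Penn Treebank above, is the exponential of the average per-token negative log-likelihood; lower is better. A small illustrative computation (the token probabilities are made up):

```python
# Perplexity = exp(average negative log-likelihood per token).
import math

token_probs = [0.2, 0.05, 0.5, 0.01]   # model's probability for each test token
avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
perplexity = math.exp(avg_nll)
print(round(perplexity, 1))   # ~11.9 on these toy probabilities; 62.4 in the paper's result
```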

Exploring the Limits of Language Modeling

tensorflow/models 7 Feb 2016

In this work we explore recent advances in Recurrent Neural Networks for large scale Language Modeling, a task central to language understanding.

Language Modelling

One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling

tensorflow/models 11 Dec 2013

We propose a new benchmark corpus to be used for measuring progress in statistical language modeling.

Language Modelling

Semi-supervised Sequence Learning

tensorflow/models NeurIPS 2015

In our experiments, we find that long short-term memory (LSTM) recurrent networks are more stable and generalize better after being pretrained with the two approaches.

Language Modelling Text Classification

Talking-Heads Attention

tensorflow/models 5 Mar 2020

We introduce "talking-heads attention" - a variation on multi-head attention which includes linear projections across the attention-heads dimension, immediately before and after the softmax operation. While inserting only a small number of additional parameters and a moderate amount of additional computation, talking-heads attention leads to better perplexities on masked language modeling tasks, as well as better quality when transfer-learning to language comprehension and question answering tasks.

Language Modelling Question Answering +1
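A minimal NumPy sketch of the mechanism described in the talking-heads excerpt: attention logits are linearly mixed across the heads dimension just before the softmax, and the resulting weights are mixed again just after it. Shapes and the random projection matrices below are illustrative, not the paper's implementation:

```python
import numpy as np

heads, seq, dim = 4, 6, 8
rng = np.random.default_rng(0)

q = rng.standard_normal((heads, seq, dim))
k = rng.standard_normal((heads, seq, dim))
v = rng.standard_normal((heads, seq, dim))

logits = q @ k.transpose(0, 2, 1) / np.sqrt(dim)            # (heads, seq, seq)

# "Talking heads": mix information across heads before and after the softmax.
pre_softmax_mix  = rng.standard_normal((heads, heads)) / np.sqrt(heads)
post_softmax_mix = rng.standard_normal((heads, heads)) / np.sqrt(heads)

logits  = np.einsum("hij,hg->gij", logits, pre_softmax_mix)
weights = np.exp(logits - logits.max(-1, keepdims=True))
weights = weights / weights.sum(-1, keepdims=True)           # softmax over keys
weights = np.einsum("hij,hg->gij", weights, post_softmax_mix)

output = weights @ v                                          # (heads, seq, dim)
print(output.shape)
```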

Is Supervised Syntactic Parsing Beneficial for Language Understanding? An Empirical Investigation

huggingface/transformers 15 Aug 2020

Traditional NLP has long held (supervised) syntactic parsing necessary for successful higher-level semantic language understanding (LU).

Language Modelling Natural Language Understanding

MPNet: Masked and Permuted Pre-training for Language Understanding

huggingface/transformers NeurIPS 2020

Since BERT neglects dependency among predicted tokens, XLNet introduces permuted language modeling (PLM) for pre-training to address this problem.

Language Modelling
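Permuted language modelling, mentioned in the MPNet excerpt above, factorises the sequence probability along a random order of positions, so each predicted token can condition on previously predicted ones (unlike BERT's independent masked predictions). A toy illustration of that ordering (the sentence and variable names are made up):

```python
# Sketch of permuted language modelling: sample a random order over positions
# and predict each token conditioned on the tokens already revealed in that order.
import random

tokens = ["new", "york", "is", "a", "city"]
order = list(range(len(tokens)))
random.seed(0)
random.shuffle(order)                      # a random prediction order over the 5 positions

for step, pos in enumerate(order):
    context = [tokens[p] for p in order[:step]]      # tokens revealed so far
    print(f"predict {tokens[pos]!r} at position {pos} given {context}")
```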

Longformer: The Long-Document Transformer

huggingface/transformers 10 Apr 2020

To address this limitation, we introduce the Longformer with an attention mechanism that scales linearly with sequence length, making it easy to process documents of thousands of tokens or longer.

Language Modelling Question Answering
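The linear scaling mentioned in the Longformer excerpt comes from restricting most positions to a fixed-size local attention window (plus a handful of global tokens in the paper). A small sketch of such a windowed attention mask, using an illustrative helper rather than the paper's implementation:

```python
# Each position attends only to a fixed-size local window, so the number of
# attended pairs is O(n * window) rather than O(n^2).
import numpy as np

def sliding_window_mask(seq_len, window):
    """Boolean mask where position i may attend to j iff |i - j| <= window."""
    idx = np.arange(seq_len)
    return np.abs(idx[:, None] - idx[None, :]) <= window

mask = sliding_window_mask(seq_len=8, window=2)
print(mask.astype(int))
print("attended pairs:", mask.sum(), "of", mask.size)   # grows ~linearly in seq_len
```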

Reformer: The Efficient Transformer

huggingface/transformers ICLR 2020

Large Transformer models routinely achieve state-of-the-art results on a number of tasks but training these models can be prohibitively costly, especially on long sequences.

Image Generation Language Modelling +1