About

Language modeling is the task of predicting the next word or character in a document.

(Image credit: Exploring the Limits of Language Modeling)
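A minimal illustration of the task: a language model assigns a probability distribution over the vocabulary for the next token given the preceding context. The tiny corpus and bigram estimate below are purely illustrative assumptions, not from any of the papers listed here.

```python
from collections import Counter, defaultdict

# Toy corpus; a real language model is trained on far more text.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()

# Count bigrams to estimate P(next_word | previous_word).
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def next_word_distribution(prev):
    counts = bigrams[prev]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

print(next_word_distribution("the"))  # {'cat': 0.25, 'mat': 0.25, 'dog': 0.25, 'rug': 0.25}
print(next_word_distribution("sat"))  # {'on': 1.0}
```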

Benchmarks


Libraries

Subtasks

Datasets

Greatest papers with code

ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators

ICLR 2020 tensorflow/models

Instead of masking the input, our approach corrupts it by replacing some tokens with plausible alternatives sampled from a small generator network. Then, instead of training a model that predicts the original identities of the corrupted tokens, we train a discriminative model that predicts whether each token in the corrupted input was replaced by a generator sample or not.

LANGUAGE MODELLING NATURAL LANGUAGE UNDERSTANDING QUESTION ANSWERING
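A minimal PyTorch-style sketch of the replaced-token-detection objective described above: a binary classifier over every token position, trained to distinguish original tokens from generator replacements. Tensor names and shapes are illustrative assumptions, not the paper's code.

```python
import torch
import torch.nn.functional as F

# Hypothetical shapes: batch of 2 sequences, 6 tokens, hidden size 8.
hidden_states = torch.randn(2, 6, 8)  # discriminator encoder output (assumed)
is_replaced = torch.tensor([[0, 1, 0, 0, 1, 0],
                            [1, 0, 0, 1, 0, 0]], dtype=torch.float)  # 1 = generator sample

# A single linear head scores each position as "replaced" vs. "original".
head = torch.nn.Linear(8, 1)
logits = head(hidden_states).squeeze(-1)  # (batch, seq_len)

# Binary cross-entropy over every token, not just the masked ones --
# this is what makes the objective sample-efficient relative to MLM.
loss = F.binary_cross_entropy_with_logits(logits, is_replaced)
loss.backward()
```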

Semi-supervised Sequence Learning

NeurIPS 2015 tensorflow/models

In our experiments, we find that long short-term memory (LSTM) recurrent networks, after being pretrained with the two approaches (next-step prediction and sequence autoencoding), are more stable and generalize better.

LANGUAGE MODELLING TEXT CLASSIFICATION
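A rough sketch of the recipe, using the language-model variant of the two pretraining approaches: pretrain an LSTM to predict the next token on unlabeled text, then reuse its weights to initialize a supervised text classifier. Layer sizes, the classifier head, and the random data are illustrative assumptions, not the paper's TensorFlow code.

```python
import torch
import torch.nn as nn

vocab_size, embed_dim, hidden_dim, num_classes = 10000, 128, 256, 2

# Stage 1: LSTM language model (pretraining objective: predict the next token).
embedding = nn.Embedding(vocab_size, embed_dim)
lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
lm_head = nn.Linear(hidden_dim, vocab_size)

tokens = torch.randint(0, vocab_size, (4, 20))  # hypothetical unlabeled batch
hidden, _ = lstm(embedding(tokens[:, :-1]))
lm_loss = nn.functional.cross_entropy(
    lm_head(hidden).reshape(-1, vocab_size), tokens[:, 1:].reshape(-1))

# Stage 2: reuse the pretrained embedding and LSTM, add a classification head.
clf_head = nn.Linear(hidden_dim, num_classes)
labels = torch.randint(0, num_classes, (4,))  # hypothetical labels
hidden, _ = lstm(embedding(tokens))
clf_loss = nn.functional.cross_entropy(clf_head(hidden[:, -1]), labels)
```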

Neural Architecture Search with Reinforcement Learning

5 Nov 2016 tensorflow/models

Our cell achieves a test set perplexity of 62.4 on the Penn Treebank, which is 3.6 perplexity better than the previous state-of-the-art model.

IMAGE CLASSIFICATION LANGUAGE MODELLING NATURAL LANGUAGE UNDERSTANDING NEURAL ARCHITECTURE SEARCH
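Perplexity, the metric quoted above, is the exponential of the average per-token negative log-likelihood; lower is better. A small sketch with made-up probabilities:

```python
import math

# Hypothetical model probabilities assigned to each token of a held-out sequence.
token_probs = [0.10, 0.05, 0.30, 0.02, 0.20]

avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
perplexity = math.exp(avg_nll)
print(round(perplexity, 1))  # ~11.1 for these made-up numbers
```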

Exploring the Limits of Language Modeling

7 Feb 2016 tensorflow/models

In this work we explore recent advances in Recurrent Neural Networks for large scale Language Modeling, a task central to language understanding.

LANGUAGE MODELLING

One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling

11 Dec 2013 tensorflow/models

We propose a new benchmark corpus to be used for measuring progress in statistical language modeling.

LANGUAGE MODELLING
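A usage sketch for experimenting with the corpus, assuming it is mirrored on the Hugging Face Hub under the dataset id `lm1b` (an assumption worth verifying for your `datasets` version):

```python
# Assumes the `datasets` library is installed and that the benchmark is
# available on the Hugging Face Hub under the id "lm1b" -- verify before use.
from datasets import load_dataset

lm1b = load_dataset("lm1b", split="train", streaming=True)  # ~1B tokens, so stream it
for example in lm1b.take(3):
    print(example["text"])
```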

Talking-Heads Attention

5 Mar 2020 tensorflow/models

We introduce "talking-heads attention" - a variation on multi-head attention which includes linearprojections across the attention-heads dimension, immediately before and after the softmax operation. While inserting only a small number of additional parameters and a moderate amount of additionalcomputation, talking-heads attention leads to better perplexities on masked language modeling tasks, aswell as better quality when transfer-learning to language comprehension and question answering tasks.

LANGUAGE MODELLING QUESTION ANSWERING TRANSFER LEARNING
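A compact sketch of the mechanism described above: standard scaled dot-product attention plus learned linear mixes across the heads dimension immediately before and after the softmax. Shapes, names, and the equal head counts are illustrative assumptions, not the paper's code.

```python
import torch

batch, heads, seq, d_head = 2, 4, 16, 8

q = torch.randn(batch, heads, seq, d_head)
k = torch.randn(batch, heads, seq, d_head)
v = torch.randn(batch, heads, seq, d_head)

# The extra ingredient: projections that mix information across heads.
pre_softmax_mix = torch.nn.Linear(heads, heads, bias=False)
post_softmax_mix = torch.nn.Linear(heads, heads, bias=False)

logits = torch.einsum("bhqd,bhkd->bhqk", q, k) / d_head ** 0.5  # (b, h, q, k)

# Mix across the heads dimension immediately before the softmax...
logits = pre_softmax_mix(logits.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)
weights = logits.softmax(dim=-1)
# ...and immediately after it.
weights = post_softmax_mix(weights.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)

output = torch.einsum("bhqk,bhkd->bhqd", weights, v)  # (b, h, q, d_head)
```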

Is Supervised Syntactic Parsing Beneficial for Language Understanding? An Empirical Investigation

15 Aug 2020 huggingface/transformers

Traditional NLP has long held (supervised) syntactic parsing necessary for successful higher-level language understanding.

LANGUAGE MODELLING

MPNet: Masked and Permuted Pre-training for Language Understanding

NeurIPS 2020 huggingface/transformers

Since BERT neglects dependency among predicted tokens, XLNet introduces permuted language modeling (PLM) for pre-training to address this problem.

LANGUAGE MODELLING
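A toy sketch of the permuted-factorization idea behind PLM: sample a random order over positions and predict each token conditioned only on the tokens earlier in that order, so every token eventually conditions on every other. Purely illustrative; it omits MPNet's auxiliary position information and XLNet's two-stream attention.

```python
import random

tokens = ["the", "cat", "sat", "on", "the", "mat"]

# Sample a random factorization order over positions.
order = list(range(len(tokens)))
random.shuffle(order)

# Each position is predicted from the tokens that precede it in the sampled order.
for step, pos in enumerate(order):
    context_positions = sorted(order[:step])
    context = [tokens[i] for i in context_positions]
    print(f"predict position {pos} ({tokens[pos]!r}) given {context}")
```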

Longformer: The Long-Document Transformer

10 Apr 2020 huggingface/transformers

Transformer-based models are unable to process long sequences due to their self-attention operation, which scales quadratically with the sequence length. To address this limitation, we introduce the Longformer with an attention mechanism that scales linearly with sequence length, making it easy to process documents of thousands of tokens or longer.

LANGUAGE MODELLING
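A usage sketch with the huggingface/transformers checkpoint; the model and tokenizer classes are the library's real API, while the input text and the choice of which token gets global attention are just illustrative.

```python
import torch
from transformers import LongformerModel, LongformerTokenizer

tokenizer = LongformerTokenizer.from_pretrained("allenai/longformer-base-4096")
model = LongformerModel.from_pretrained("allenai/longformer-base-4096")

# A long document; Longformer handles up to 4096 tokens with sliding-window attention.
inputs = tokenizer("a very long document " * 500, return_tensors="pt", truncation=True)

# Optionally give selected tokens (here just the first) global attention.
global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1

outputs = model(**inputs, global_attention_mask=global_attention_mask)
print(outputs.last_hidden_state.shape)  # (1, seq_len, 768)
```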

Reformer: The Efficient Transformer

ICLR 2020 huggingface/transformers

Large Transformer models routinely achieve state-of-the-art results on a number of tasks but training these models can be prohibitively costly, especially on long sequences.

LANGUAGE MODELLING OPEN-DOMAIN QUESTION ANSWERING