Language Modelling

282 papers with code · Natural Language Processing

Language modeling is the task of predicting the next word or character in a document.
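
Concretely, a language model assigns a probability to each candidate next token given the preceding context. As a toy illustration (not a method from any paper listed here), even a bigram count model does this; the corpus below is invented for the example.

    from collections import Counter, defaultdict

    # Toy corpus; any tokenized text would do.
    corpus = "the cat sat on the mat . the cat slept on the sofa .".split()

    # Count how often each word follows each preceding word.
    bigrams = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        bigrams[prev][nxt] += 1

    def predict_next(word):
        """Return the most likely next word and its estimated probability."""
        counts = bigrams[word]
        best, freq = counts.most_common(1)[0]
        return best, freq / sum(counts.values())

    print(predict_next("the"))  # ('cat', 0.5) for this toy corpus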

* indicates models using dynamic evaluation, where the model adapts at test time to tokens it has already seen in order to improve its predictions on the tokens that follow (Mikolov et al., 2010; Krause et al., 2017).
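
For readers unfamiliar with the idea, the sketch below shows the general shape of dynamic evaluation, assuming a PyTorch-style model: after scoring each test segment, the model takes a small gradient step on the tokens it has just seen, so later segments benefit from the adaptation. The model, segment iterator, and learning rate are placeholders, not the exact procedure of either cited paper.

    import torch
    import torch.nn.functional as F

    def dynamic_eval(model, segments, lr=1e-4):
        """Evaluate `model` while adapting it to the test tokens it has already seen."""
        opt = torch.optim.SGD(model.parameters(), lr=lr)
        total_nll, total_tokens = 0.0, 0
        for inputs, targets in segments:        # consecutive chunks of the test stream
            logits = model(inputs)              # (num_tokens, vocab_size) next-token logits
            loss = F.cross_entropy(logits, targets)
            total_nll += loss.item() * targets.numel()
            total_tokens += targets.numel()
            opt.zero_grad()
            loss.backward()                     # adapt to the tokens just observed...
            opt.step()                          # ...before scoring the next segment
        return total_nll / total_tokens         # average NLL; exp() of this is perplexity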

Latest papers without code

Trellis Networks for Sequence Modeling

ICLR 2019 · Shaojie Bai et al.

On the other hand, we show that truncated recurrent networks are equivalent to trellis networks with special sparsity structure in their weight matrices.

LANGUAGE MODELLING

01 May 2019

Improved Language Modeling by Decoding the Past

ICLR 2019 · Siddhartha Brahma

With negligible overhead in the number of parameters and training time, our Past Decode Regularization (PDR) method achieves a word-level perplexity of 55.6 on the Penn Treebank and 63.5 on the WikiText-2 datasets using a single softmax.

LANGUAGE MODELLING

01 May 2019
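
As a reminder of what the word-level perplexity figures quoted above measure: perplexity is the exponential of the average negative log-likelihood per word, so lower is better. A minimal worked example with made-up log-probabilities:

    import math

    log_probs = [-3.2, -4.5, -2.8, -5.1]    # model log-probabilities of four test words
    nll = -sum(log_probs) / len(log_probs)  # average negative log-likelihood per word
    print(round(math.exp(nll), 1))          # perplexity, about 49.4 here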

Transformer-XL: Language Modeling with Longer-Term Dependency

ICLR 2019 · Zihang Dai* et al.

Moreover, Transformer-XL is up to 1,800+ times faster than vanilla Transformer during evaluation.

LANGUAGE MODELLING

01 May 2019
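
The evaluation speedup quoted above comes from reusing previously computed hidden states rather than re-encoding a full context window for every predicted token. The toy cost model below is only meant to show the scale of that gap; the constants are assumptions, not measurements from the paper.

    # Toy cost model: work counted as "tokens encoded", nothing more.
    window = 512                    # context length re-encoded by a vanilla evaluator
    num_tokens = 100_000            # length of the evaluation stream
    vanilla = num_tokens * window   # every prediction re-encodes its whole window
    cached = num_tokens             # with state reuse, each token is encoded once
    print(vanilla // cached)        # 512x fewer token encodings in this toy model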

Looking for ELMo's friends: Sentence-Level Pretraining Beyond Language Modeling

ICLR 2019 · Samuel R. Bowman et al.

Work on the problem of contextualized word representation—the development of reusable neural network components for sentence understanding—has recently seen a surge of progress centered on the unsupervised pretraining task of language modeling with methods like ELMo (Peters et al., 2018).

LANGUAGE MODELLING

01 May 2019

Partially Mutual Exclusive Softmax for Positive and Unlabeled data

ICLR 2019 · Ugo Tanielian et al.

This is often the case for applications such as language modeling, next-event prediction, and matrix factorization, where many of the potential outcomes are not mutually exclusive but are more likely to be conditionally independent given the state.

LANGUAGE MODELLING

01 May 2019
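
To make the distinction above concrete: a standard softmax treats all outcomes as mutually exclusive (their probabilities must sum to 1), whereas modelling each outcome with its own sigmoid lets several outcomes be likely at once. The scores below are invented for illustration; this is not the softmax variant proposed in the paper.

    import math

    scores = [2.0, 1.9, -1.0]                           # unnormalized scores for three outcomes

    exps = [math.exp(s) for s in scores]                # mutually exclusive: softmax
    softmax = [e / sum(exps) for e in exps]

    sigmoid = [1 / (1 + math.exp(-s)) for s in scores]  # not mutually exclusive: per-outcome sigmoid

    print([round(p, 2) for p in softmax])   # [0.51, 0.46, 0.03] - forced to compete
    print([round(p, 2) for p in sigmoid])   # [0.88, 0.87, 0.27] - both can be likely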

RedSync : Reducing Synchronization Traffic for Distributed Deep Learning

ICLR 2019 · Jiarui Fang et al.

Data parallelism has become a dominant method to scale Deep Neural Network (DNN) training across multiple nodes.

IMAGE CLASSIFICATION LANGUAGE MODELLING

01 May 2019
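
In this data-parallel setting, each node computes gradients locally and the gradients are synchronized (for example via all-reduce) before each weight update; that exchange is the synchronization traffic the paper targets. One generic way to shrink it, sketched below, is to transmit only the largest-magnitude gradient entries; this illustrates the idea and is not RedSync's specific algorithm.

    import numpy as np

    def sparsify(grad, ratio=0.01):
        """Keep only the largest-magnitude entries of a flat gradient vector."""
        k = max(1, int(grad.size * ratio))
        idx = np.argpartition(np.abs(grad), -k)[-k:]  # indices of the top-k entries
        return idx, grad[idx]                         # only these index/value pairs get sent

    grad = np.random.randn(1_000_000).astype(np.float32)
    idx, vals = sparsify(grad)
    print(vals.size / grad.size)                      # fraction transmitted: 0.01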

Language Modeling with Graph Temporal Convolutional Networks

ICLR 2019 · Hongyin Luo et al.

Recently, there have been some attempts to use non-recurrent neural models for language modeling.

LANGUAGE MODELLING

01 May 2019
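
A simple member of that non-recurrent family is a causal (left-padded) temporal convolution, sketched below with PyTorch; it is a generic example of the approach, not the graph temporal convolutional network proposed in the paper.

    import torch
    import torch.nn as nn

    class CausalConvLM(nn.Module):
        """Predict next-token logits from past context only, via a causal 1-D convolution."""
        def __init__(self, vocab_size, dim=128, kernel_size=3):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, dim)
            self.pad = kernel_size - 1                    # left-pad so no future token leaks in
            self.conv = nn.Conv1d(dim, dim, kernel_size)
            self.out = nn.Linear(dim, vocab_size)

        def forward(self, tokens):                        # tokens: (batch, seq_len)
            x = self.embed(tokens).transpose(1, 2)        # (batch, dim, seq_len)
            x = nn.functional.pad(x, (self.pad, 0))       # pad on the left only
            h = torch.relu(self.conv(x)).transpose(1, 2)  # (batch, seq_len, dim)
            return self.out(h)                            # next-token logits per position

    logits = CausalConvLM(vocab_size=1000)(torch.randint(0, 1000, (2, 16)))
    print(logits.shape)                                   # torch.Size([2, 16, 1000])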

Precision Highway for Ultra Low-precision Quantization

ICLR 2019 · Eunhyeok Park et al.

Quantization of a neural network has an inherent problem called accumulated quantization error, which is the key obstacle towards ultra-low precision, e.g., 2- or 3-bit precision.

LANGUAGE MODELLING QUANTIZATION

01 May 2019
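
For context, the sketch below shows plain uniform quantization and how the per-tensor error grows as the bit-width drops toward 2-3 bits; when such errors propagate layer by layer they accumulate, which is the problem the abstract names. The tensor and bit-widths are illustrative, and this is not the paper's precision-highway technique.

    import numpy as np

    def quantize(x, bits):
        """Uniformly quantize x to 2**bits levels spanning its own value range."""
        levels = 2 ** bits - 1
        lo, hi = float(x.min()), float(x.max())
        step = (hi - lo) / levels
        return np.round((x - lo) / step) * step + lo

    x = np.random.randn(10_000).astype(np.float32)
    for bits in (8, 3, 2):
        err = float(np.abs(quantize(x, bits) - x).mean())
        print(bits, round(err, 4))   # mean error grows sharply at 2-3 bits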

Do Language Models Have Common Sense?

ICLR 2019 · Trieu H. Trinh et al.

It has been argued that current machine learning models do not have commonsense, and therefore must be hard-coded with prior knowledge (Marcus, 2018).

COMMON SENSE REASONING LANGUAGE MODELLING

01 May 2019

Backpropamine: training self-modifying neural networks with differentiable neuromodulated plasticity

ICLR 2019 · Thomas Miconi et al.

We show that neuromodulated plasticity improves the performance of neural networks on both reinforcement learning and supervised learning tasks.

LANGUAGE MODELLING

01 May 2019