Language Modelling

238 papers with code · Natural Language Processing

Language modeling is the task of predicting the next word or character in a document.

* indicates models using dynamic evaluation, where, at test time, models may adapt to seen tokens in order to improve performance on following tokens (Mikolov et al., 2010; Krause et al., 2017).
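
As a rough illustration, the sketch below shows one common form of dynamic evaluation in PyTorch: the model scores a segment and is then updated on that same segment, so later segments benefit from the tokens just seen. The `model` interface (token ids in, next-token logits out) and the plain SGD update are illustrative assumptions, not any specific paper's recipe.

```python
import torch
import torch.nn.functional as F

def dynamic_eval(model, token_ids, seg_len=64, lr=1e-4):
    """Evaluate `token_ids` (a 1-D LongTensor) segment by segment, adapting
    the model on each segment after it has been scored.
    The model interface and SGD update rule are assumptions for illustration."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    total_loss, n_tokens = 0.0, 0
    for start in range(0, token_ids.size(0) - 1, seg_len):
        end = min(start + seg_len, token_ids.size(0) - 1)
        inp = token_ids[start:end].unsqueeze(0)          # (1, T)
        tgt = token_ids[start + 1:end + 1].unsqueeze(0)  # (1, T), shifted by one
        logits = model(inp)                              # (1, T, vocab)
        loss = F.cross_entropy(logits.view(-1, logits.size(-1)), tgt.view(-1))
        total_loss += loss.item() * tgt.numel()
        n_tokens += tgt.numel()
        opt.zero_grad()
        loss.backward()  # adapt to the tokens just seen ...
        opt.step()       # ... so the next segment is scored by an updated model
    return total_loss / n_tokens  # average test NLL with adaptation
```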

State-of-the-art leaderboards

Latest papers with code

Language Models are Unsupervised Multitask Learners

Preprint 2019 openai/gpt-2

Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on task-specific datasets. We demonstrate that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText.

COMMON SENSE REASONING DOCUMENT SUMMARIZATION LANGUAGE MODELLING MACHINE TRANSLATION QUESTION ANSWERING READING COMPREHENSION

Pay Less Attention with Lightweight and Dynamic Convolutions

ICLR 2019 pytorch/fairseq

Self-attention is a useful mechanism for building generative models of language and images. We instead predict separate convolution kernels based solely on the current time-step in order to determine the importance of context elements.

ABSTRACTIVE TEXT SUMMARIZATION LANGUAGE MODELLING MACHINE TRANSLATION
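
A minimal sketch of the dynamic-convolution idea described above, assuming a per-time-step linear projection that predicts a softmax-normalized kernel; the layer, names, and shapes here are illustrative and do not mirror the fairseq implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicConv(nn.Module):
    """Illustrative dynamic convolution: the kernel is predicted from the
    current time-step alone and softmax-normalized over its width."""
    def __init__(self, dim, kernel_size=3):
        super().__init__()
        self.kernel_size = kernel_size
        self.kernel_proj = nn.Linear(dim, kernel_size)  # per-step kernel weights

    def forward(self, x):                                # x: (batch, time, dim)
        k = self.kernel_size
        weights = F.softmax(self.kernel_proj(x), dim=-1)  # (batch, time, k)
        # Causally pad on the left, then unfold into sliding windows of length k.
        padded = F.pad(x, (0, 0, k - 1, 0))               # pad the time axis
        windows = padded.unfold(1, k, 1)                  # (batch, time, dim, k)
        # Weight each window position by the kernel predicted at that time-step.
        return torch.einsum('btdk,btk->btd', windows, weights)

# Usage on a toy batch:
layer = DynamicConv(dim=8)
out = layer(torch.randn(2, 5, 8))                         # (2, 5, 8)
```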

Improving Neural Network Quantization without Retraining using Outlier Channel Splitting

28 Jan 2019 NervanaSystems/distiller

Most of the existing literature focuses on training quantized DNNs, while this work examines the less-studied topic of quantizing a floating-point model without (re)training. Outlier channel splitting (OCS) requires no additional training and works on commodity hardware.

LANGUAGE MODELLING
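
To make the idea concrete, here is a minimal sketch of outlier channel splitting on a single weight matrix: the columns with the largest-magnitude weights are halved and duplicated, shrinking the weight range while leaving the layer's output unchanged. The split criterion and the 5% expansion ratio are illustrative assumptions, not the exact procedure in the paper or in distiller.

```python
import torch

def split_outlier_channels(W, x, expand_ratio=0.05):
    """Duplicate the input channels with the largest-magnitude weights and
    halve them, so W_new @ x_new == W @ x while the weight range shrinks.
    Threshold and ratio are illustrative assumptions."""
    out_f, in_f = W.shape
    n_split = max(1, int(expand_ratio * in_f))
    # Rank input channels by their largest absolute weight.
    channel_max = W.abs().max(dim=0).values               # (in_f,)
    idx = torch.topk(channel_max, n_split).indices
    W_new = W.clone()
    W_new[:, idx] = W_new[:, idx] / 2                      # halve the outlier columns
    W_new = torch.cat([W_new, W_new[:, idx]], dim=1)       # append duplicated halves
    x_new = torch.cat([x, x[idx]], dim=0)                  # duplicate matching inputs
    return W_new, x_new

# Functional equivalence check: the layer output is preserved after the split.
W, x = torch.randn(8, 16), torch.randn(16)
W2, x2 = split_outlier_channels(W, x)
assert torch.allclose(W @ x, W2 @ x2, atol=1e-6)
```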

Cross-lingual Language Model Pretraining

22 Jan 2019 facebookresearch/XLM

On unsupervised machine translation, we obtain 34.3 BLEU on WMT'16 German-English, improving the previous state of the art by more than 9 BLEU. On supervised machine translation, we obtain a new state of the art of 38.5 BLEU on WMT'16 Romanian-English, outperforming the previous best approach by more than 4 BLEU.

LANGUAGE MODELLING UNSUPERVISED MACHINE TRANSLATION

Robust Chinese Word Segmentation with Contextualized Word Representations

17 Jan 2019 voidism/pywordseg

In recent years, with the introduction of neural-network-based methods, the accuracy of Chinese word segmentation has improved greatly. However, the error rate on out-of-vocabulary words remains high.

CHINESE WORD SEGMENTATION LANGUAGE MODELLING

Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context

ICLR 2019 huggingface/pytorch-pretrained-BERT

Transformer networks have the potential to learn longer-term dependencies, but are limited by a fixed-length context in the setting of language modeling. As a solution, we propose a novel neural architecture, Transformer-XL, that enables the Transformer to learn dependencies beyond a fixed length without disrupting temporal coherence.

LANGUAGE MODELLING
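
The core recurrence mechanism can be sketched in a few lines: hidden states from the previous segment are cached without gradients and prepended as extra keys and values for the current segment. Relative positional encodings and causal masking, which the full model needs, are omitted here, and the layer below is an illustrative stand-in rather than the paper's architecture.

```python
import torch
import torch.nn as nn

class RecurrentAttentionLayer(nn.Module):
    """Illustrative segment-level recurrence: the previous segment's hidden
    states are cached (detached) and reused as extra attention context."""
    def __init__(self, dim, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)

    def forward(self, x, memory=None):                 # x: (batch, seg_len, dim)
        context = x if memory is None else torch.cat([memory, x], dim=1)
        out, _ = self.attn(x, context, context)        # queries from current segment only
        new_memory = x.detach()                        # cache for the next segment, no gradient
        return out, new_memory

# Usage over consecutive segments of a long sequence:
layer = RecurrentAttentionLayer(dim=64)
mem = None
for segment in torch.randn(10, 32, 64):               # 10 segments of length 32
    h, mem = layer(segment.unsqueeze(0), mem)
```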

Team Papelo: Transformer Networks at FEVER

WS 2018 cdmalon/finetune-transformer-lm

We develop a system for the FEVER fact extraction and verification challenge that uses a high-precision entailment classifier, based on transformer networks pretrained with language modeling, to classify a broad set of potential evidence. The precision of the entailment classifier allows us to enhance recall by considering every statement from several articles to decide upon each claim.

LANGUAGE MODELLING

Looking for ELMo's Friends: Sentence-Level Pretraining Beyond Language Modeling

ICLR 2019 jsalt18-sentence-repl/jiant

Work on the problem of contextualized word representation (the development of reusable neural network components for sentence understanding) has recently seen a surge of progress centered on the unsupervised pretraining task of language modeling with methods like ELMo. This paper contributes the first large-scale systematic study comparing different pretraining tasks in this context, both as complements to language modeling and as potential alternatives.

LANGUAGE MODELLING

What Is One Grain of Sand in the Desert? Analyzing Individual Neurons in Deep NLP Models

21 Dec 2018 fdalvi/NeuroX

We break this analysis down further and study individual dimensions (neurons) in the vector representations learned by end-to-end neural models on NLP tasks. We present a comprehensive analysis of neurons with the aim of addressing the following questions: i) how localized or distributed are different linguistic properties in the models?

LANGUAGE MODELLING MACHINE TRANSLATION

Learning Private Neural Language Modeling with Attentive Aggregation

17 Dec 2018 shaoxiongji/fed-att

Federated learning (FL) provides a promising approach to private language modeling for intelligent personalized keyboard suggestion by training models on distributed clients rather than on a central server. To address these problems, we propose a novel model aggregation scheme with an attention mechanism that considers the contribution of each client model to the global model, together with an optimization technique applied during server aggregation.

LANGUAGE MODELLING
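
A minimal sketch of attention-weighted server aggregation in this spirit: client models closer to the current server model receive larger weights, and the server moves toward the weighted client average. The softmax-over-negative-distance weighting and the single `step` factor are illustrative assumptions rather than the paper's exact formulation.

```python
import torch

def attentive_aggregate(server_params, client_params_list, step=1.0):
    """server_params and each entry of client_params_list are dicts of
    parameter name -> tensor. Weighting rule is an illustrative assumption."""
    new_params = {}
    for name, w_server in server_params.items():
        client_ws = torch.stack([c[name] for c in client_params_list])   # (K, ...)
        # Attention weights per client: closer to the server model => larger weight.
        dists = torch.stack([(w_server - w).norm() for w in client_ws])  # (K,)
        attn = torch.softmax(-dists, dim=0)                              # (K,)
        # Move the server parameters toward the attention-weighted client average.
        delta = sum(a * (w - w_server) for a, w in zip(attn, client_ws))
        new_params[name] = w_server + step * delta
    return new_params
```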
