Language Modelling

1152 papers with code • 12 benchmarks • 117 datasets

Language modeling is the task of predicting the next word or character in a document.

(Image credit: Exploring the Limits of Language Modeling)
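
As a concrete illustration of the task definition above, the sketch below asks a pretrained causal language model for its most likely next word. The "gpt2" checkpoint from huggingface/transformers and the prompt are illustrative assumptions, not something this page prescribes.

```python
# A minimal sketch of next-word prediction with a pretrained causal LM.
# The "gpt2" checkpoint is an illustrative assumption; any autoregressive
# checkpoint would work the same way.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

text = "Language modeling is the task of predicting the next"
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits            # (batch, seq_len, vocab_size)

next_token_logits = logits[0, -1]              # scores for the token after the prompt
next_token_id = int(next_token_logits.argmax())
print(tokenizer.decode([next_token_id]))       # most likely continuation
```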

Greatest papers with code

FlauBERT: Unsupervised Language Model Pre-training for French

huggingface/transformers LREC 2020

Language models have become a key step to achieve state-of-the-art results in many different Natural Language Processing (NLP) tasks.

Language Modelling • Natural Language Inference • +2

Plug and Play Language Models: A Simple Approach to Controlled Text Generation

huggingface/transformers ICLR 2020

Large transformer-based language models (LMs) trained on huge text corpora have shown unparalleled generation capabilities.

Language Modelling • Text Generation

Unsupervised Cross-lingual Representation Learning at Scale

huggingface/transformers ACL 2020

We also present a detailed empirical analysis of the key factors that are required to achieve these gains, including the trade-offs between (1) positive transfer and capacity dilution and (2) the performance of high- and low-resource languages at scale.

Cross-Lingual Transfer • Language Modelling • +1

DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

huggingface/transformers NeurIPS 2019

As Transfer Learning from large-scale pre-trained models becomes more prevalent in Natural Language Processing (NLP), operating these large models on the edge and/or under constrained computational training or inference budgets remains challenging.

Hate Speech Detection • Knowledge Distillation • +7
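
The DistilBERT entry above rests on knowledge distillation; below is a minimal sketch of the soft-target distillation term only, under illustrative assumptions (function name, temperature, random stand-in logits). The paper's full training objective also combines masked-language-modelling and cosine-embedding losses.

```python
# A minimal sketch of the soft-target distillation loss: the student is trained
# to match the teacher's softened output distribution. The names, temperature,
# and random logits below are illustrative assumptions.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # Soften both distributions, then take the KL divergence against the teacher.
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    # The T**2 factor keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * temperature**2

# Stand-in logits over a ~30k-token vocabulary for a batch of 8 examples.
teacher_logits = torch.randn(8, 30522)
student_logits = torch.randn(8, 30522)
print(distillation_loss(student_logits, teacher_logits).item())
```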

LXMERT: Learning Cross-Modality Encoder Representations from Transformers

huggingface/transformers IJCNLP 2019

In LXMERT, we build a large-scale Transformer model that consists of three encoders: an object relationship encoder, a language encoder, and a cross-modality encoder.

Language Modelling • Question Answering • +2

RoBERTa: A Robustly Optimized BERT Pretraining Approach

huggingface/transformers 26 Jul 2019

Language model pretraining has led to significant performance gains but careful comparison between different approaches is challenging.

Common Sense Reasoning • Language Modelling • +6

XLNet: Generalized Autoregressive Pretraining for Language Understanding

huggingface/transformers NeurIPS 2019

With the capability of modeling bidirectional contexts, denoising-autoencoding-based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling.

Document Ranking • Humor Detection • +7

Language Models are Unsupervised Multitask Learners

huggingface/transformers Preprint 2019

Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on task-specific datasets.

 Ranked #1 on Language Modelling on enwik8 (using extra training data)

Common Sense Reasoning • Data-to-Text Generation • +6
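
Language-modelling leaderboards such as the enwik8 entry above are scored by the model's average next-token cross-entropy, reported as perplexity or, for character-level corpora like enwik8, bits per character. The sketch below computes per-token perplexity and bits per token for a pretrained checkpoint; the "gpt2" checkpoint and the sample sentence are assumptions for illustration.

```python
# A minimal sketch of scoring a causal LM: mean next-token cross-entropy,
# reported as perplexity and bits per token. The "gpt2" checkpoint and the
# sample sentence are illustrative assumptions.
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

text = "Natural language processing tasks are typically approached with supervised learning."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # Passing labels makes the model return the mean shifted cross-entropy loss.
    loss = model(**inputs, labels=inputs["input_ids"]).loss

print("perplexity:    ", math.exp(loss.item()))
print("bits per token:", loss.item() / math.log(2))
```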

Cross-lingual Language Model Pretraining

huggingface/transformers NeurIPS 2019

On unsupervised machine translation, we obtain 34.3 BLEU on WMT'16 German-English, improving the previous state of the art by more than 9 BLEU.

Language Modelling • Natural Language Understanding • +1