
Language Modelling

238 papers with code · Natural Language Processing

Language modeling is the task of predicting the next word or character in a document.
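
To make the task concrete, the sketch below (a hypothetical toy, not code from any paper listed here) predicts the next word from bigram counts; the models on this page replace the count table with neural networks.

    from collections import Counter, defaultdict

    corpus = "the cat sat on the mat the cat ran".split()

    # Count how often each word follows each preceding word.
    bigrams = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        bigrams[prev][nxt] += 1

    def predict_next(word):
        """Return the most likely next word and its estimated probability."""
        counts = bigrams[word]
        best, freq = counts.most_common(1)[0]
        return best, freq / sum(counts.values())

    print(predict_next("the"))  # -> ('cat', 0.666...)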

* indicates models using dynamic evaluation, where, at test time, a model may adapt to tokens it has already seen in order to improve performance on the following tokens (Mikolov et al., 2010; Krause et al., 2017).
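
As a rough illustration of dynamic evaluation, here is a sketch under assumed names (`model` is any PyTorch LSTM language model returning logits and a hidden-state tuple, `test_tokens` a 1-D tensor of token ids, neither taken from the cited papers): the model takes one small gradient step after scoring each test token.

    import torch
    import torch.nn.functional as F

    def dynamic_eval(model, test_tokens, lr=1e-4):
        """Score test_tokens left to right, adapting the model as it goes."""
        opt = torch.optim.SGD(model.parameters(), lr=lr)
        total_nll, hidden = 0.0, None
        for prev, nxt in zip(test_tokens, test_tokens[1:]):
            logits, hidden = model(prev.view(1, 1), hidden)
            loss = F.cross_entropy(logits.view(1, -1), nxt.view(1))
            total_nll += loss.item()
            opt.zero_grad()
            loss.backward()                             # adapt to the token just seen
            opt.step()
            hidden = tuple(h.detach() for h in hidden)  # stop gradients flowing back
        return total_nll / (len(test_tokens) - 1)       # mean NLL after adaptation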

State-of-the-art leaderboards

Greatest papers with code

Exploring the Limits of Language Modeling

7 Feb 2016 tensorflow/models

In this work we explore recent advances in Recurrent Neural Networks for large scale Language Modeling, a task central to language understanding. We extend current models to deal with two key challenges present in this task: corpora and vocabulary sizes, and the complex, long-term structure of language.
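
One of the paper's central tools for the vocabulary-size challenge is a sampled softmax; the snippet below is a simplified sketch of that idea (illustrative names, and it ignores collisions between sampled negatives and true targets).

    import torch
    import torch.nn.functional as F

    def sampled_softmax_loss(hidden, targets, out_weight, num_sampled=64):
        """hidden: (batch, dim); targets: (batch,); out_weight: (|V|, dim).
        Normalize over the targets plus a random sample of negatives
        instead of the full vocabulary."""
        vocab_size = out_weight.size(0)
        negatives = torch.randint(0, vocab_size, (num_sampled,))
        classes = torch.cat([targets, negatives])  # targets occupy the first columns
        logits = hidden @ out_weight[classes].t()  # (batch, batch + num_sampled)
        labels = torch.arange(targets.size(0))     # row i's target sits in column i
        return F.cross_entropy(logits, labels)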

LANGUAGE MODELLING

Semi-supervised Sequence Learning

NeurIPS 2015 tensorflow/models

The first approach is to predict what comes next in a sequence, which is a conventional language model in natural language processing. In our experiments, we find that long short-term memory recurrent networks, after being pretrained with the two approaches, are more stable and generalize better.
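
In code, the recipe amounts to sharing one recurrent encoder across two stages; the outline below is a hypothetical PyTorch rendering of that idea, not the paper's own code.

    import torch.nn as nn

    # One LSTM encoder shared across both stages.
    shared_lstm = nn.LSTM(input_size=128, hidden_size=256, batch_first=True)

    # Stage 1 (pretraining): a language-model head predicts the next token.
    lm_head = nn.Linear(256, 10000)   # vocabulary size assumed to be 10k
    # ... train (shared_lstm, lm_head) on unlabeled text ...

    # Stage 2 (fine-tuning): reuse the pretrained LSTM under a classifier head.
    clf_head = nn.Linear(256, 2)      # e.g. binary sentiment
    # ... fine-tune (shared_lstm, clf_head) on labeled examples ...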

LANGUAGE MODELLING TEXT CLASSIFICATION

One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling

11 Dec 2013 tensorflow/models

We propose a new benchmark corpus to be used for measuring progress in statistical language modeling. With almost one billion words of training data, we hope this benchmark will be useful to quickly evaluate novel language modeling techniques, and to compare their contribution when combined with other advanced techniques.
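
Progress on this benchmark is reported as perplexity, which the few lines below compute from per-token log-probabilities (a standard definition, not code from the paper).

    import math

    def perplexity(token_log_probs):
        """token_log_probs: natural logs of the probability the model
        assigned to each held-out token."""
        avg_nll = -sum(token_log_probs) / len(token_log_probs)
        return math.exp(avg_nll)

    print(perplexity([math.log(0.1)] * 5))  # 10.0: every token gets p = 0.1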

LANGUAGE MODELLING

Universal Transformers

10 Jul 2018 tensorflow/tensor2tensor

In this paper we propose the Universal Transformer, which addresses these practical and theoretical shortcomings, and we show that it leads to improved performance on several tasks. We further employ an adaptive computation time (ACT) mechanism to allow the model to dynamically adjust the number of times the representation of each position in a sequence is revised.
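
A heavily simplified sketch of the ACT idea follows (it drops the remainder-weighted averaging of the real mechanism, and all names are illustrative): each position accumulates a halting probability per refinement step and stops being updated once it crosses a threshold.

    import torch

    def act_refine(states, step_fn, halt_fn, threshold=0.99, max_steps=8):
        """states: (batch, seq, dim). step_fn revises states; halt_fn maps
        states to per-position halting probabilities of shape (batch, seq)."""
        halted = torch.zeros(states.shape[:2])  # cumulative halting probability
        for _ in range(max_steps):
            running = (halted < threshold).float().unsqueeze(-1)
            # Only positions that have not yet halted receive the revision.
            states = running * step_fn(states) + (1.0 - running) * states
            halted = halted + halt_fn(states)
            if bool((halted >= threshold).all()):
                break
        return states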

LANGUAGE MODELLING LEARNING TO EXECUTE MACHINE TRANSLATION

Discrete Autoencoders for Sequence Models

ICLR 2018 tensorflow/tensor2tensor

Recurrent models for sequences have been recently successful at many tasks, especially for language modeling and machine translation. We propose to improve the representation in sequence models by augmenting current approaches with an autoencoder that is forced to compress the sequence through an intermediate discrete latent space.
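
The sketch below shows one common way to realize such a discrete bottleneck, nearest-neighbour quantization against a codebook; it conveys the compression-through-discrete-symbols idea but is not necessarily the paper's exact discretization scheme.

    import torch
    import torch.nn as nn

    class DiscreteBottleneck(nn.Module):
        """Quantize each latent vector to its nearest codebook entry."""
        def __init__(self, dim=64, codebook_size=512):
            super().__init__()
            self.codebook = nn.Embedding(codebook_size, dim)

        def forward(self, z):  # z: (batch, steps, dim)
            # Squared distance from every latent to every codebook entry.
            dists = (z.unsqueeze(-2) - self.codebook.weight).pow(2).sum(-1)
            codes = dists.argmin(dim=-1)        # (batch, steps) discrete symbols
            return self.codebook(codes), codes  # quantized latents + symbol ids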

LANGUAGE MODELLING MACHINE TRANSLATION

Building state-of-the-art distant speech recognition using the CHiME-4 challenge with a setup of speech enhancement baseline

27 Mar 2018 kaldi-asr/kaldi

This paper describes a new baseline system for automatic speech recognition (ASR) in the CHiME-4 challenge, intended to promote the development of noisy ASR in the speech processing community by providing 1) a state-of-the-art yet simplified single system comparable to the complicated top systems in the challenge, and 2) a publicly available and reproducible recipe through the main repository of the Kaldi speech recognition toolkit. In addition, the proposed baseline recipe includes four different speech enhancement measures for the simulated test set: the short-time objective intelligibility measure (STOI), extended STOI (eSTOI), perceptual evaluation of speech quality (PESQ), and speech distortion ratio (SDR).
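
For orientation, here is a minimal NumPy sketch of the simplest of those measures, a plain SDR computation in decibels; the recipe itself relies on the standard toolkit implementations, and this simplified form omits the scaling and projection steps of BSS-eval-style SDR.

    import numpy as np

    def sdr_db(reference, estimate):
        """SDR in dB between a clean reference and an enhanced estimate."""
        distortion = reference - estimate
        return 10.0 * np.log10(np.sum(reference**2) / np.sum(distortion**2))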

DISTANT SPEECH RECOGNITION LANGUAGE MODELLING NOISY SPEECH RECOGNITION SPEECH ENHANCEMENT

Deep contextualized word representations

HLT 2018 zalandoresearch/flair

We introduce a new type of deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses vary across linguistic contexts (i.e., to model polysemy). Our word vectors are learned functions of the internal states of a deep bidirectional language model (biLM), which is pre-trained on a large text corpus.
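
The "learned function of the internal states" is a softmax-weighted sum of the biLM's layer activations, scaled by a task-specific scalar; a compact sketch follows (module name and shapes are illustrative).

    import torch
    import torch.nn as nn

    class LayerMix(nn.Module):
        """Softmax-weighted combination of biLM layer states, ELMo-style."""
        def __init__(self, num_layers=3):
            super().__init__()
            self.scalars = nn.Parameter(torch.zeros(num_layers))  # per-layer weights
            self.gamma = nn.Parameter(torch.ones(1))              # task scalar

        def forward(self, layer_states):  # (layers, batch, seq, dim)
            weights = torch.softmax(self.scalars, dim=0)
            mixed = (weights.view(-1, 1, 1, 1) * layer_states).sum(dim=0)
            return self.gamma * mixed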

COREFERENCE RESOLUTION LANGUAGE MODELLING NAMED ENTITY RECOGNITION NATURAL LANGUAGE INFERENCE QUESTION ANSWERING SEMANTIC ROLE LABELING SENTIMENT ANALYSIS

The Goldilocks Principle: Reading Children's Books with Explicit Memory Representations

7 Nov 2015 facebookresearch/ParlAI

We introduce a new test of how well language models capture meaning in children's books. Unlike standard language modelling benchmarks, it distinguishes the task of predicting syntactic function words from that of predicting lower-frequency words, which carry greater semantic content.
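
The benchmark poses cloze-style questions; below is a toy construction of one (all strings are invented placeholders, not benchmark data).

    def make_cloze(sentence, answer, distractors):
        """Blank out `answer` and ask the model to pick it from candidates."""
        query = sentence.replace(answer, "XXXXX")
        return {"query": query,
                "candidates": sorted([answer] + distractors),
                "answer": answer}

    ex = make_cloze("the cat sat on the mat", "mat",
                    ["dog", "tree", "house", "river"])
    print(ex["query"])   # -> the cat sat on the XXXXX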

LANGUAGE MODELLING

Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context

ICLR 2019 huggingface/pytorch-pretrained-BERT

Transformer networks have the potential to learn longer-term dependencies, but are limited by a fixed-length context in the language modeling setting. As a solution, we propose a novel neural architecture, Transformer-XL, that enables the Transformer to learn dependencies beyond a fixed length without disrupting temporal coherence.
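
The core of the fix is segment-level recurrence: each layer attends over the current segment concatenated with cached, gradient-free states from previous segments. Below is a toy reduction using stock PyTorch attention (the real model also requires relative positional encodings, omitted here).

    import torch
    import torch.nn as nn

    def attend_with_memory(attn, segment, memory, mem_len=128):
        """attn: nn.MultiheadAttention(batch_first=True); segment: (b, L, d)."""
        context = segment if memory is None else torch.cat([memory, segment], dim=1)
        out, _ = attn(segment, context, context)    # queries come only from the current segment
        new_memory = context[:, -mem_len:].detach() # cache states without gradients
        return out, new_memory

    attn = nn.MultiheadAttention(embed_dim=64, num_heads=4, batch_first=True)
    out1, mem = attend_with_memory(attn, torch.randn(2, 32, 64), None)
    out2, mem = attend_with_memory(attn, torch.randn(2, 32, 64), mem)  # sees segment 1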

LANGUAGE MODELLING

First-Pass Large Vocabulary Continuous Speech Recognition using Bi-Directional Recurrent DNNs

12 Aug 2014 baidu-research/warp-ctc

Recent work demonstrated the feasibility of discarding the HMM sequence modeling framework by directly predicting transcript text from audio. This approach to decoding enables first-pass speech recognition with a language model, completely unaided by the cumbersome infrastructure of HMM-based systems.
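
Decoding in that setting ranks hypotheses by a weighted combination of acoustic and language-model scores plus a word-insertion bonus; here is a toy version of that scoring rule with made-up numbers.

    def combined_score(log_p_acoustic, log_p_lm, num_words, alpha=2.0, beta=1.5):
        """Acoustic score interpolated with LM score and a word-count bonus."""
        return log_p_acoustic + alpha * log_p_lm + beta * num_words

    # (hypothesis, acoustic log-prob, LM log-prob, word count) -- all invented
    hyps = [("the cat sat", -12.0, -5.1, 3),
            ("the cats at", -11.5, -9.7, 3)]
    best = max(hyps, key=lambda h: combined_score(h[1], h[2], h[3]))
    print(best[0])   # -> the cat sat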

LANGUAGE MODELLING LARGE VOCABULARY CONTINUOUS SPEECH RECOGNITION SPEECH RECOGNITION