Exploring the Limits of Language Modeling

7 Feb 2016 · Rafal Jozefowicz, Oriol Vinyals, Mike Schuster, Noam Shazeer, Yonghui Wu

In this work we explore recent advances in Recurrent Neural Networks for large scale Language Modeling, a task central to language understanding. We extend current models to deal with two key challenges present in this task: corpora and vocabulary sizes, and complex, long term structure of language...


Evaluation results from the paper


Task                 Dataset           Model                          Metric name       Metric value   Global rank
Language Modelling   One Billion Word  LSTM-8192-1024 + CNN Input    PPL               30.0           #8
Language Modelling   One Billion Word  LSTM-8192-1024 + CNN Input    Number of params  1.04B          #1
Language Modelling   One Billion Word  LSTM-8192-1024                PPL               30.6           #9
Language Modelling   One Billion Word  LSTM-8192-1024                Number of params  1.8B           #1
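The PPL values in the table are perplexity, the standard language-modeling metric: the exponential of the average per-token negative log-likelihood, so lower is better. As a minimal sketch (the function name and inputs here are illustrative, not from the paper's code):

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp of the mean negative log-likelihood per token.

    token_log_probs: natural-log probabilities the model assigned to
    each token in the evaluation corpus.
    """
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)

# A model that assigns every token probability 1/4 has perplexity ~4:
uniform = [math.log(0.25)] * 8
print(perplexity(uniform))  # ≈ 4.0
```

Intuitively, a perplexity of 30 means the model is, on average, as uncertain as if it were choosing uniformly among about 30 equally likely next tokens.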