Exploring the Limits of Language Modeling

7 Feb 2016 · Rafal Jozefowicz, Oriol Vinyals, Mike Schuster, Noam Shazeer, Yonghui Wu

In this work we explore recent advances in Recurrent Neural Networks for large scale Language Modeling, a task central to language understanding. We extend current models to deal with two key challenges present in this task: corpora and vocabulary sizes, and complex, long term structure of language...


Results from the Paper


| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Language Modelling | One Billion Word | LSTM-8192-1024 + CNN Input | PPL | 30.0 | #9 |
| Language Modelling | One Billion Word | LSTM-8192-1024 + CNN Input | Number of params | 1.04B | #1 |
| Language Modelling | One Billion Word | LSTM-8192-1024 | PPL | 30.6 | #10 |
| Language Modelling | One Billion Word | LSTM-8192-1024 | Number of params | 1.8B | #1 |
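The PPL column above is perplexity, the standard language-modeling metric: the exponential of the mean per-token negative log-likelihood under the model. A minimal sketch (the function and toy values are illustrative, not from the paper):

```python
import math

def perplexity(token_nlls):
    """Perplexity = exp of the mean per-token negative log-likelihood (in nats)."""
    return math.exp(sum(token_nlls) / len(token_nlls))

# Toy example: four tokens, each with a negative log-likelihood of 3.4 nats.
# exp(3.4) is roughly 30, on the order of the PPL values reported above.
print(perplexity([3.4, 3.4, 3.4, 3.4]))
```

Lower perplexity is better: it can be read as the effective number of equally likely next-token choices the model is left with on average.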