In this work we explore recent advances in Recurrent Neural Networks for large-scale Language Modeling, a task central to language understanding. We extend current models to deal with two key challenges present in this task: corpora and vocabulary sizes, and the complex, long-term structure of language. We perform an exhaustive study of techniques such as character Convolutional Neural Networks and Long Short-Term Memory (LSTM) networks on the One Billion Word Benchmark.
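As a rough illustration of the architecture family studied here (an LSTM language model fed by a character-level CNN over each word), consider the following minimal PyTorch sketch. The hyperparameters and layer shapes are illustrative assumptions, not the paper's exact configuration; the paper's largest model uses 8192 LSTM units with a 1024-dimensional projection.

```python
import torch
import torch.nn as nn

class CharCNNLSTMModel(nn.Module):
    """Sketch of an LSTM language model with character-CNN word inputs.

    All sizes below are hypothetical placeholders, not the paper's
    reported configuration.
    """
    def __init__(self, char_vocab=256, char_dim=16, word_vocab=800_000,
                 cnn_filters=128, kernel_size=5, lstm_dim=1024):
        super().__init__()
        self.char_emb = nn.Embedding(char_vocab, char_dim)
        # Character CNN: convolve over the characters of each word,
        # then max-pool to a fixed-size word representation.
        self.conv = nn.Conv1d(char_dim, cnn_filters, kernel_size, padding=2)
        self.lstm = nn.LSTM(cnn_filters, lstm_dim, batch_first=True)
        self.out = nn.Linear(lstm_dim, word_vocab)  # next-word softmax logits

    def forward(self, chars):
        # chars: (batch, seq_len, max_word_len) integer character ids
        b, t, w = chars.shape
        x = self.char_emb(chars.view(b * t, w))       # (b*t, w, char_dim)
        x = torch.relu(self.conv(x.transpose(1, 2)))  # (b*t, filters, w)
        x = x.max(dim=2).values.view(b, t, -1)        # pooled word vectors
        h, _ = self.lstm(x)                           # (b, t, lstm_dim)
        return self.out(h)                            # (b, t, word_vocab)
```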
| Task | Dataset | Model | Metric name | Metric value | Global rank |
|---|---|---|---|---|---|
| Language Modelling | One Billion Word | LSTM-8192-1024 + CNN Input | PPL | 30.0 | #8 |
| Language Modelling | One Billion Word | LSTM-8192-1024 + CNN Input | Number of params | 1.04B | #8 |
| Language Modelling | One Billion Word | LSTM-8192-1024 | PPL | 30.6 | #9 |
| Language Modelling | One Billion Word | LSTM-8192-1024 | Number of params | 1.8B | #9 |
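For reference, PPL (perplexity) is the exponentiated average per-word negative log-likelihood on the held-out set; lower is better. A minimal sketch of the computation:

```python
import math

def perplexity(neg_log_likelihoods):
    """Perplexity = exp(mean per-word negative log-likelihood, in nats)."""
    return math.exp(sum(neg_log_likelihoods) / len(neg_log_likelihoods))

# A model averaging ~3.401 nats per word scores PPL of about 30.0,
# matching the best entry in the table above.
print(perplexity([3.401] * 5))  # ~30.0
```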