Recurrent Highway Networks

ICML 2017
Julian Georg Zilly, Rupesh Kumar Srivastava, Jan Koutník, Jürgen Schmidhuber

Many sequential processing tasks require complex nonlinear transition functions from one step to the next. However, recurrent neural networks with 'deep' transition functions remain difficult to train, even when using Long Short-Term Memory (LSTM) networks...
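The core idea of Recurrent Highway Networks is to make the step-to-step transition itself deep: each time step applies a stack of highway layers to the recurrent state, with transform and carry gates controlling how much of the state is updated versus passed through. The following is a minimal NumPy sketch of one RHN time step, assuming the coupled-gate variant (carry gate c = 1 - t); the function and parameter names and the initialization scales are illustrative choices, not the authors' implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def rhn_step(x, s_prev, params, depth):
    """One RHN time step with `depth` stacked highway layers (illustrative sketch).

    x:      input vector at this time step, shape (n_in,)
    s_prev: recurrent state from the previous time step, shape (n_hidden,)
    """
    s = s_prev
    for l in range(depth):
        # The external input is injected only into the first highway layer.
        x_in = x if l == 0 else np.zeros_like(x)
        h = np.tanh(params["W_H"] @ x_in + params["R_H"][l] @ s + params["b_H"][l])
        t = sigmoid(params["W_T"] @ x_in + params["R_T"][l] @ s + params["b_T"][l])
        c = 1.0 - t  # coupled carry gate, a common simplification
        s = h * t + s * c
    return s

def init_rhn_params(n_in, n_hidden, depth, rng):
    """Toy initialization; scales and the negative transform-gate bias are assumptions."""
    return {
        "W_H": rng.standard_normal((n_hidden, n_in)) * 0.1,
        "W_T": rng.standard_normal((n_hidden, n_in)) * 0.1,
        "R_H": [rng.standard_normal((n_hidden, n_hidden)) * 0.1 for _ in range(depth)],
        "R_T": [rng.standard_normal((n_hidden, n_hidden)) * 0.1 for _ in range(depth)],
        "b_H": [np.zeros(n_hidden) for _ in range(depth)],
        # A negative transform-gate bias keeps the carry path open early in training.
        "b_T": [np.full(n_hidden, -2.0) for _ in range(depth)],
    }

# Usage: run a short random sequence through a depth-5 cell.
rng = np.random.default_rng(0)
params = init_rhn_params(n_in=8, n_hidden=16, depth=5, rng=rng)
s = np.zeros(16)
for x in rng.standard_normal((10, 8)):
    s = rhn_step(x, s, params, depth=5)
print(s.shape)  # (16,)
```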


Results from the Paper


TASK | DATASET | MODEL | METRIC | VALUE | GLOBAL RANK
Language Modelling | enwik8 | Recurrent Highway Networks | Bit per Character (BPC) | 1.27 | # 21
Language Modelling | enwik8 | Recurrent Highway Networks | Number of params | 46M | # 15
Language Modelling | Hutter Prize | RHN - depth 5 [zilly2016recurrent] | Bit per Character (BPC) | 1.31 | # 11
Language Modelling | Hutter Prize | Large RHN | Bit per Character (BPC) | 1.27 | # 9
Language Modelling | Hutter Prize | Large RHN | Number of params | 46M | # 5
Language Modelling | Penn Treebank (Word Level) | Recurrent Highway Networks | Validation perplexity | 67.9 | # 23
Language Modelling | Penn Treebank (Word Level) | Recurrent Highway Networks | Test perplexity | 65.4 | # 30
Language Modelling | Penn Treebank (Word Level) | Recurrent Highway Networks | Params | 23M | # 8
Language Modelling | Text8 | Large RHN | Bit per Character (BPC) | 1.27 | # 11
Language Modelling | Text8 | Large RHN | Number of params | 46M | # 6

Methods used in the Paper