An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling

4 Mar 2018 · Shaojie Bai, J. Zico Kolter, Vladlen Koltun

For most deep learning practitioners, sequence modeling is synonymous with recurrent networks. Yet recent results indicate that convolutional architectures can outperform recurrent networks on tasks such as audio synthesis and machine translation. Given a new sequence modeling task or dataset, which architecture should one use? We conduct a systematic evaluation of generic convolutional and recurrent architectures for sequence modeling. The models are evaluated across a broad range of standard tasks that are commonly used to benchmark recurrent networks. Our results indicate that a simple convolutional architecture outperforms canonical recurrent networks such as LSTMs across a diverse range of tasks and datasets, while demonstrating longer effective memory. We conclude that the common association between sequence modeling and recurrent networks should be reconsidered, and convolutional networks should be regarded as a natural starting point for sequence modeling tasks. Code is available at http://github.com/locuslab/TCN.
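As a rough illustration of the convolutional architecture the paper evaluates, below is a minimal PyTorch sketch of a Temporal Convolutional Network (TCN): dilated causal convolutions stacked in residual blocks. It follows the paper's description but is not the authors' implementation (see http://github.com/locuslab/TCN for that); weight normalization is omitted for brevity, and all class and parameter names here are illustrative.

```python
import torch
import torch.nn as nn

class CausalConv1d(nn.Module):
    """1-D convolution that sees only past timesteps, via left-only padding."""
    def __init__(self, in_ch, out_ch, kernel_size, dilation):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation   # amount of left padding
        self.conv = nn.Conv1d(in_ch, out_ch, kernel_size, dilation=dilation)

    def forward(self, x):                          # x: (batch, channels, time)
        x = nn.functional.pad(x, (self.pad, 0))    # pad the left side only
        return self.conv(x)

class TCNBlock(nn.Module):
    """Residual block: two dilated causal convs with ReLU and dropout."""
    def __init__(self, in_ch, out_ch, kernel_size, dilation, dropout=0.2):
        super().__init__()
        self.net = nn.Sequential(
            CausalConv1d(in_ch, out_ch, kernel_size, dilation),
            nn.ReLU(), nn.Dropout(dropout),
            CausalConv1d(out_ch, out_ch, kernel_size, dilation),
            nn.ReLU(), nn.Dropout(dropout),
        )
        # 1x1 conv matches channel counts for the residual connection
        self.downsample = nn.Conv1d(in_ch, out_ch, 1) if in_ch != out_ch else nn.Identity()

    def forward(self, x):
        return torch.relu(self.net(x) + self.downsample(x))

class TCN(nn.Module):
    """Stack of residual blocks with exponentially growing dilation."""
    def __init__(self, in_ch, channels, kernel_size=3):
        super().__init__()
        layers, prev = [], in_ch
        for i, ch in enumerate(channels):
            layers.append(TCNBlock(prev, ch, kernel_size, dilation=2 ** i))
            prev = ch
        self.network = nn.Sequential(*layers)

    def forward(self, x):                          # x: (batch, in_ch, time)
        return self.network(x)

# Example: 8 input channels, 3 residual blocks with dilations 1, 2, 4
model = TCN(8, [32, 32, 32])
y = model(torch.randn(4, 8, 100))                  # -> (4, 32, 100)
```

With kernel size k and dilation 2^i at depth i, the receptive field grows exponentially with depth, which is how the TCN achieves the long effective memory highlighted in the abstract.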


Results from the Paper


| Task | Dataset | Model | Metric | Value | Global Rank |
| --- | --- | --- | --- | --- | --- |
| Music Modeling | Nottingham | TCN | NLL | 3.07 | #2 |
| Music Modeling | Nottingham | GRU | NLL | 3.46 | #5 |
| Music Modeling | Nottingham | LSTM | NLL | 3.29 | #3 |
| Music Modeling | Nottingham | RNN | NLL | 4.05 | #6 |
| Language Modelling | Penn Treebank (Character Level) | Temporal Convolutional Network | Bit per Character (BPC) | 1.31 | #17 |
| Language Modelling | Penn Treebank (Word Level) | GRU (Bai et al., 2018) | Test perplexity | 92.48 | #39 |
| Language Modelling | Penn Treebank (Word Level) | LSTM (Bai et al., 2018) | Test perplexity | 78.93 | #35 |
| Sequential Image Classification | Sequential MNIST | Temporal Convolutional Network | Unpermuted Accuracy | 99.0% | #10 |
| Sequential Image Classification | Sequential MNIST | Temporal Convolutional Network | Permuted Accuracy | 97.2% | #6 |
| Language Modelling | WikiText-103 | TCN | Test perplexity | 45.19 | #51 |
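The table mixes three views of the same quantity, average negative log-likelihood: NLL in nats (music modeling), bits per character (character-level language modeling), and perplexity (word-level language modeling). A small sketch of the conversions, purely for illustration; the nats inputs below are back-computed from the table values, not reported in the paper.

```python
import math

def perplexity(nll_nats_per_token: float) -> float:
    """Word-level perplexity: exponentiate per-token NLL measured in nats."""
    return math.exp(nll_nats_per_token)

def bits_per_character(nll_nats_per_char: float) -> float:
    """Character-level BPC: convert per-character NLL from nats to bits."""
    return nll_nats_per_char / math.log(2)

print(perplexity(3.811))           # ≈ 45.2, the TCN WikiText-103 row
print(bits_per_character(0.908))   # ≈ 1.31, the char-level Penn Treebank row
```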

Methods used in the Paper


| Method | Type |
| --- | --- |
| Temporal Convolutional Network (TCN): dilated causal convolutions, residual blocks, weight normalization, dropout | Convolutional |
| LSTM | Recurrent (baseline) |
| GRU | Recurrent (baseline) |
| Vanilla RNN | Recurrent (baseline) |