Learning Longer-term Dependencies in RNNs with Auxiliary Losses

ICML 2018 · Trieu H. Trinh, Andrew M. Dai, Minh-Thang Luong, Quoc V. Le

Despite recent advances in training recurrent neural networks (RNNs), capturing long-term dependencies in sequences remains a fundamental challenge. Most approaches use backpropagation through time (BPTT), which is difficult to scale to very long sequences...
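The core idea the abstract points at is to attach an auxiliary loss to the RNN, for example asking it to reconstruct a segment of past inputs from a randomly chosen anchor state, so that useful gradient signal exists without unrolling BPTT over the whole sequence. Below is a minimal forward-only sketch of that loss combination, assuming a simple tanh RNN; the linear decoder `Wout`, the segment length, and the 0.5 auxiliary weight are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def rnn_forward(x, Wxh, Whh):
    # Simple tanh RNN over a (T, d) sequence; returns all T hidden states.
    h = np.zeros(Whh.shape[0])
    hs = []
    for t in range(len(x)):
        h = np.tanh(x[t] @ Wxh + h @ Whh)
        hs.append(h)
    return np.stack(hs)

def auxiliary_reconstruction_loss(hs, x, Wout, seg_len=5):
    # Pick a random anchor position and ask a (hypothetical) linear
    # decoder to reconstruct the seg_len inputs preceding the anchor
    # from the anchor's hidden state alone.
    anchor = rng.integers(seg_len, len(hs))
    target = x[anchor - seg_len:anchor]              # (seg_len, d)
    pred = (hs[anchor] @ Wout).reshape(seg_len, -1)  # (seg_len, d)
    return float(np.mean((pred - target) ** 2))

T, d, n = 50, 8, 16
x = rng.standard_normal((T, d))
Wxh = rng.standard_normal((d, n)) * 0.1
Whh = rng.standard_normal((n, n)) * 0.1
Wout = rng.standard_normal((n, 5 * d)) * 0.1

hs = rnn_forward(x, Wxh, Whh)
main_loss = float(np.mean(hs[-1] ** 2))   # stand-in for the main task loss
aux = auxiliary_reconstruction_loss(hs, x, Wout)
total = main_loss + 0.5 * aux             # auxiliary weight is an assumption
```

In training, the auxiliary term would be backpropagated only over the short reconstructed segment, which is what lets the method avoid full-length BPTT.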

| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Sequential Image Classification | Sequential CIFAR-10 | Transformer (self-attention) (Trinh et al., 2018) | Unpermuted Accuracy | 62.2% | #2 |

Methods used in the Paper