UnICORNN: A recurrent model for learning very long time dependencies

9 Mar 2021  ·  T. Konstantin Rusch, Siddhartha Mishra

The design of recurrent neural networks (RNNs) that accurately process sequential inputs with long-time dependencies is very challenging on account of the exploding and vanishing gradient problem. To overcome this, we propose a novel RNN architecture based on a structure-preserving discretization of a Hamiltonian system of second-order ordinary differential equations that models networks of oscillators. The resulting RNN is fast, invertible (in time), and memory-efficient, and we derive rigorous bounds on the hidden-state gradients to prove that it mitigates the exploding and vanishing gradient problem. A suite of experiments is presented to demonstrate that the proposed RNN provides state-of-the-art performance on a variety of learning tasks with (very) long-time dependencies.
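For intuition, the sketch below shows what a structure-preserving (symplectic Euler) discretization of a driven oscillator network can look like as a recurrent update, together with its exact inverse, which is what allows hidden states to be recomputed rather than stored. This is a minimal illustration, not the paper's verbatim equations: the names (w, V, b, dt, alpha) and the per-neuron (diagonal) hidden coupling are assumptions made here for clarity.

```python
import numpy as np

def oscillator_step(y, z, u, w, V, b, dt=0.1, alpha=1.0):
    """One symplectic-Euler step of a driven oscillator network.

    y : hidden "position" state, shape (d,)
    z : hidden "velocity" state, shape (d,)
    u : current input, shape (m,)
    w : per-neuron (diagonal) hidden weights, shape (d,)  -- an assumption here
    V : input weights, shape (d, m);  b : bias, shape (d,)
    """
    # Velocity update uses the *old* position y...
    z_new = z - dt * (np.tanh(w * y + V @ u + b) + alpha * y)
    # ...position update uses the *new* velocity. This ordering is what
    # makes the map symplectic, i.e. structure preserving.
    y_new = y + dt * z_new
    return y_new, z_new

def oscillator_step_inverse(y_new, z_new, u, w, V, b, dt=0.1, alpha=1.0):
    """Exact inverse of oscillator_step: recover (y, z) from (y_new, z_new)."""
    y = y_new - dt * z_new
    z = z_new + dt * (np.tanh(w * y + V @ u + b) + alpha * y)
    return y, z

# Toy usage: run the recurrence over a random sequence and check invertibility.
rng = np.random.default_rng(0)
d, m, T = 8, 3, 50
w, b = rng.normal(size=d), np.zeros(d)
V = rng.normal(size=(d, m)) / np.sqrt(m)
y, z = np.zeros(d), np.zeros(d)
for u in rng.normal(size=(T, m)):
    y_prev, z_prev = y, z
    y, z = oscillator_step(y, z, u, w, V, b)
    # Each step inverts exactly, so states can be recomputed during backprop
    # instead of cached -- one reading of the memory-efficiency claim above.
    yr, zr = oscillator_step_inverse(y, z, u, w, V, b)
    assert np.allclose(yr, y_prev) and np.allclose(zr, z_prev)
```

Note that the inverse works because each half of the step depends only on quantities available after the other half is undone; a full (non-diagonal) hidden weight matrix would remain invertible in the same way, but the diagonal coupling keeps each step cheap.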

Results

| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Time Series Classification | EigenWorms | IndRNN | % Test Accuracy | 49.7 | #6 |
| Time Series Classification | EigenWorms | expRNN | % Test Accuracy | 40.0 | #8 |
| Time Series Classification | EigenWorms | UnICORNN | % Test Accuracy | 90.3 | #2 |
| Time Series Classification | EigenWorms | coRNN | % Test Accuracy | 86.7 | #3 |
| Sentiment Analysis | IMDb | UnICORNN | Accuracy | 88.4 | #35 |
| Sequential Image Classification | noise padded CIFAR-10 | UnICORNN | % Test Accuracy | 62.4 | #2 |
| Sequential Image Classification | Sequential MNIST | UnICORNN | Permuted Accuracy | 98.4 | #7 |
