A MAX-AFFINE SPLINE PERSPECTIVE OF RECURRENT NEURAL NETWORKS

We develop a framework for understanding and improving recurrent neural networks (RNNs) using max-affine spline operators (MASOs). We prove that RNNs with piecewise affine and convex nonlinearities can be written as a simple piecewise affine spline operator. The resulting representation provides several new perspectives for analyzing RNNs, three of which we study in this paper. First, we show that an RNN internally partitions the input space during training and that it builds up the partition through time. Second, we show that the affine parameter of an RNN corresponds to an input-specific template, from which we can interpret an RNN as performing simple template matching (matched filtering) given the input. Third, by closely examining the MASO RNN formula, we prove that injecting Gaussian noise into the initial hidden state of an RNN corresponds to an explicit L2 regularization on the affine parameters, which connects to exploding-gradient issues and improves generalization. Extensive experiments on several datasets of various modalities demonstrate and validate each of the above analyses. In particular, injecting noise into the initial hidden state elevates simple RNNs to state-of-the-art performance on these datasets.
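To make the MASO representation concrete, below is a minimal NumPy sketch (our own illustration, not code from the paper; the dimensions and the names `W`, `U`, `b` are arbitrary assumptions) checking that a ReLU RNN cell `h_t = ReLU(W h_{t-1} + U x_t + b)` collapses, for each input, to an affine map whose parameters are selected by the ReLU activation pattern:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions: hidden size, input size, sequence length.
d_h, d_x, T = 4, 3, 5

# ReLU RNN cell parameters: h_t = ReLU(W h_{t-1} + U x_t + b).
W = 0.5 * rng.normal(size=(d_h, d_h))
U = 0.5 * rng.normal(size=(d_h, d_x))
b = 0.1 * rng.normal(size=d_h)

x = rng.normal(size=(T, d_x))  # input sequence
h = np.zeros(d_h)              # initial hidden state

for t in range(T):
    z = W @ h + U @ x[t] + b

    # Standard nonlinear view: elementwise ReLU.
    h_relu = np.maximum(z, 0.0)

    # MASO view: ReLU(z)_k = max(z_k, 0) is a max over two affine
    # functions of z. The argmax picks an activation pattern q, and
    # the cell reduces to the input-dependent affine map
    #   h_t = diag(q) (W h_{t-1} + U x_t + b).
    q = (z > 0).astype(float)           # region / activation pattern
    A = np.diag(q) @ np.hstack([W, U])  # input-specific affine parameter
    h_affine = A @ np.concatenate([h, x[t]]) + q * b

    assert np.allclose(h_relu, h_affine)
    h = h_relu

print("ReLU RNN output matches its per-input affine (MASO) form")
```

Under this reading, the rows of the input-dependent matrix `A` play the role of the templates in the abstract's second point: each hidden unit's response is an inner product between a row of `A` and the concatenated state and input, which is the matched-filtering interpretation.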
