Recurrent Neural Networks

Unitary RNN

Introduced by Arjovsky et al. in Unitary Evolution Recurrent Neural Networks

A Unitary RNN is a recurrent neural network architecture that uses a unitary hidden-to-hidden matrix. Specifically, its hidden state follows dynamics of the form:

$$ h_{t} = f\left(Wh_{t-1} + Vx_{t}\right) $$

where $W$ is a unitary matrix $\left(W^{\dagger}W = I\right)$. The product of unitary matrices is a unitary matrix, so $W$ can be parameterised as a product of simpler unitary matrices:

$$ h_{t} = f\left(D_{3}R_{2}F^{-1}D_{2}PR_{1}FD_{1}h_{t-1} + Vx_{t}\right) $$

where $D_{3}$, $D_{2}$, $D_{1}$ are learned diagonal complex matrices with unit-modulus entries, and $R_{2}$, $R_{1}$ are learned reflection matrices. The matrices $F$ and $F^{-1}$ are the discrete Fourier transform and its inverse, and $P$ is a fixed random permutation. The activation function $f\left(h\right)$ applies a rectified linear unit with a learned bias to the modulus of each complex number. Only the diagonal and reflection matrices, $D$ and $R$, are learned, so Unitary RNNs have fewer parameters than LSTMs with comparable numbers of hidden units.
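The factorisation above can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the helper names (`reflect`, `apply_W`, `mod_relu`, `urnn_step`) are hypothetical, the parameters are random rather than trained, and the FFT is called with `norm="ortho"` so that each factor is unitary on its own. The activation is written in the phase-preserving form, which is an assumption about how the rectifier on the modulus is mapped back to a complex number.

```python
import numpy as np

def reflect(h, v):
    # Apply R = I - 2 v v^† / (v^† v) to h without forming R explicitly.
    return h - 2.0 * v * (np.vdot(v, h) / np.vdot(v, v))

def apply_W(h, d1, d2, d3, v1, v2, perm):
    # W = D3 R2 F^{-1} D2 P R1 F D1, applied right to left.
    h = d1 * h                         # D1: unit-modulus diagonal
    h = np.fft.fft(h, norm="ortho")    # F: unitary DFT (assumed normalisation)
    h = reflect(h, v1)                 # R1: reflection
    h = h[perm]                        # P: fixed permutation
    h = d2 * h                         # D2
    h = np.fft.ifft(h, norm="ortho")   # F^{-1}
    h = reflect(h, v2)                 # R2
    return d3 * h                      # D3

def mod_relu(z, b, eps=1e-8):
    # ReLU with bias b on |z|, keeping the phase of z (assumed form).
    mag = np.abs(z)
    return np.maximum(mag + b, 0.0) / (mag + eps) * z

def urnn_step(h_prev, x, V, b, w_params):
    # One recurrent update: h_t = f(W h_{t-1} + V x_t).
    return mod_relu(apply_W(h_prev, *w_params) + V @ x, b)

rng = np.random.default_rng(0)
n, n_in = 8, 3

def rand_complex(*shape):
    return rng.normal(size=shape) + 1j * rng.normal(size=shape)

# Illustrative random parameters; in training these would be learned.
d1, d2, d3 = (np.exp(1j * rng.uniform(-np.pi, np.pi, n)) for _ in range(3))
v1, v2 = rand_complex(n), rand_complex(n)
perm = rng.permutation(n)
w_params = (d1, d2, d3, v1, v2, perm)

# Materialise W column by column to check unitarity numerically.
W = np.stack([apply_W(e, *w_params) for e in np.eye(n, dtype=complex)], axis=1)
is_unitary = np.allclose(W.conj().T @ W, np.eye(n))

h0 = rand_complex(n)
h1 = urnn_step(h0, rng.normal(size=n_in), rand_complex(n, n_in), np.zeros(n), w_params)
```

Applying the factors one at a time costs $O(n \log n)$ per step because of the FFTs, whereas multiplying by a dense $W$ would cost $O(n^{2})$; the dense matrix is built here only for the unitarity check.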

Source: Associative LSTMs

Source: Unitary Evolution Recurrent Neural Networks





Component Type
Activation Functions