Recurrent Neural Networks

TSRUc, or Transformation-based Spatial Recurrent Unit c, is a modification of a ConvGRU used in the TriVD-GAN architecture for video generation.
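
For reference, the ConvGRU update that TSRUc modifies can be written in the same notation as the equations further down. This is a reconstruction from the description on this page (a reset gate $r$ that resets $h_{t-1}$ to give $h'_{t}$, which then feeds the update for $c$), not a verbatim quote of the source paper:

$$ r = \sigma\left(W_{r} \star_{n}\left[h_{t-1};x_{t}\right] + b_{r} \right) $$

$$ h'_{t} = r \odot h_{t-1} $$

$$ c = \rho\left(W_{c} \star_{n}\left[h'_{t};x_{t}\right] + b_{c} \right) $$

$$ u = \sigma\left(W_{u} \star_{n}\left[h_{t-1};x_{t}\right] + b_{u} \right) $$

$$ h_{t} = u \odot h_{t-1} + \left(1-u\right) \odot c $$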

Instead of computing the reset gate $r$ and resetting $h_{t-1}$, the TSRUc computes the parameters of a transformation $\theta$, which is used to warp $h_{t-1}$. The rest of the model is unchanged, with $\hat{h}_{t-1}$ playing the role of $h'_{t}$ in the update equation for $c$ from ConvGRU. The TSRUc module is described by the following equations:

$$ \theta_{h,x} = f\left(h_{t-1}, x_{t}\right) $$

$$ \hat{h}_{t-1} = w\left(h_{t-1}; \theta_{h, x}\right) $$

$$ c = \rho\left(W_{c} \star_{n}\left[\hat{h}_{t-1};x_{t}\right] + b_{c} \right) $$

$$ u = \sigma\left(W_{u} \star_{n}\left[h_{t-1};x_{t}\right] + b_{u} \right) $$

$$ h_{t} = u \odot h_{t-1} + \left(1-u\right) \odot c $$

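In these equations, $\sigma$ and $\rho$ are the elementwise sigmoid and ReLU functions, respectively, $\star_{n}$ denotes a convolution with a kernel of size $n \times n$, and $\odot$ denotes elementwise multiplication. The function $f$ produces the transformation parameters $\theta_{h,x}$, and $w$ applies the resulting warp to $h_{t-1}$. Brackets denote feature concatenation.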
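
The five equations above can be sketched as a single recurrent step in NumPy. This is a minimal illustration under stated assumptions, not the TriVD-GAN implementation: this page does not specify the form of $f$ or $w$, so the sketch assumes $f$ is one $n \times n$ convolution that predicts a two-channel per-pixel offset field, and $w$ is a nearest-neighbour warp with border clamping. The names `conv2d_same`, `warp`, `tsruc_step`, and the `params` keys are hypothetical.

```python
import numpy as np

def conv2d_same(x, weight, bias):
    """n x n cross-correlation with zero padding (n assumed odd).
    x: (C_in, H, W), weight: (C_out, C_in, n, n), bias: (C_out,)."""
    c_out, _, n, _ = weight.shape
    pad = n // 2
    height, width = x.shape[1:]
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros((c_out, height, width))
    for i in range(n):
        for j in range(n):
            patch = xp[:, i:i + height, j:j + width]
            out += np.einsum("oc,chw->ohw", weight[:, :, i, j], patch)
    return out + bias[:, None, None]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    return np.maximum(z, 0.0)

def warp(h, theta):
    """ASSUMED form of w: nearest-neighbour sampling of h at (y, x) + theta,
    where theta has shape (2, H, W); out-of-range indices are clamped."""
    _, height, width = h.shape
    ys, xs = np.meshgrid(np.arange(height), np.arange(width), indexing="ij")
    src_y = np.clip(np.rint(ys + theta[0]).astype(int), 0, height - 1)
    src_x = np.clip(np.rint(xs + theta[1]).astype(int), 0, width - 1)
    return h[:, src_y, src_x]

def tsruc_step(h_prev, x, params):
    """One TSRUc update following the equations above."""
    hx = np.concatenate([h_prev, x], axis=0)            # [h_{t-1}; x_t]
    # ASSUMED form of f: a single convolution predicting the offsets theta.
    theta = conv2d_same(hx, params["W_theta"], params["b_theta"])
    h_hat = warp(h_prev, theta)                         # warped previous state
    hx_hat = np.concatenate([h_hat, x], axis=0)         # [h_hat_{t-1}; x_t]
    c = relu(conv2d_same(hx_hat, params["W_c"], params["b_c"]))
    u = sigmoid(conv2d_same(hx, params["W_u"], params["b_u"]))
    return u * h_prev + (1.0 - u) * c                   # h_t

# Small usage example with random weights (illustrative sizes only).
rng = np.random.default_rng(0)
C_h, C_x, height, width, n = 4, 3, 8, 8, 3

def init(c_out, c_in):
    return 0.1 * rng.standard_normal((c_out, c_in, n, n)), np.zeros(c_out)

params = {}
params["W_theta"], params["b_theta"] = init(2, C_h + C_x)
params["W_c"], params["b_c"] = init(C_h, C_h + C_x)
params["W_u"], params["b_u"] = init(C_h, C_h + C_x)

h_prev = rng.standard_normal((C_h, height, width))
x = rng.standard_normal((C_x, height, width))
h_t = tsruc_step(h_prev, x, params)
```

The update gate $u$ still reads the unwarped $h_{t-1}$, exactly as in the equation for $u$; only the candidate $c$ sees the warped state. A differentiable warp (for example bilinear sampling) would be needed for training, which this nearest-neighbour sketch does not provide.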

Source: Transformation-based Adversarial Video Prediction on Large-Scale Data



| Task | Papers | Share |
| --- | --- | --- |
| Video Generation | 1 | 50.00% |
| Video Prediction | 1 | 50.00% |