Z-Order Recurrent Neural Networks for Video Prediction
We present a Z-Order RNN (Znet) for predicting future video frames from historical observations. It makes two main contributions, one from the deterministic modeling perspective and one from the stochastic modeling perspective. First, we propose a new RNN architecture for modeling the deterministic dynamics, which updates hidden states along a z-order curve to enhance the consistency of the features of mirrored layers. Second, we introduce an adversarial training approach for a two-stream Znet that models the stochastic variations by forcing the Znet-Predictor to imitate the behavior of the Znet-Probe. This two-stream architecture enables adversarial training to be conducted in the feature space instead of the image space. Our model achieves state-of-the-art prediction accuracy on two video datasets.
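To make the idea of adversarial training in feature space concrete, here is a minimal sketch in Python with NumPy. It is not the paper's implementation: the linear discriminator (`w`, `b`), the function name `feature_adversarial_losses`, and the standard non-saturating GAN losses are assumptions chosen for illustration. The only point it carries over from the abstract is that the discriminator scores feature vectors from the two streams rather than generated images.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def feature_adversarial_losses(probe_feat, predictor_feat, w, b):
    """Standard GAN losses evaluated on feature vectors, not pixels.

    probe_feat, predictor_feat: arrays of shape (batch, dim), standing in
        for features from the Znet-Probe and Znet-Predictor streams.
    w, b: parameters of a hypothetical linear discriminator (assumption).
    """
    eps = 1e-8  # guards against log(0)
    d_probe = sigmoid(probe_feat @ w + b)      # should approach 1
    d_pred = sigmoid(predictor_feat @ w + b)   # should approach 0
    # Discriminator: separate probe features from predictor features.
    d_loss = -np.mean(np.log(d_probe + eps)) - np.mean(np.log(1.0 - d_pred + eps))
    # Predictor: make its features indistinguishable from probe features.
    g_loss = -np.mean(np.log(d_pred + eps))
    return d_loss, g_loss


rng = np.random.default_rng(0)
probe = rng.normal(size=(4, 8))
pred = rng.normal(size=(4, 8))
d_loss, g_loss = feature_adversarial_losses(probe, pred, np.zeros(8), 0.0)
```

With an untrained discriminator (all-zero weights) every score is 0.5, so the discriminator loss is close to 2 ln 2 and the predictor loss is close to ln 2. In a real model both streams and the discriminator would be neural networks trained with alternating gradient updates.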