PredRNN++: Towards A Resolution of the Deep-in-Time Dilemma in Spatiotemporal Predictive Learning

We present PredRNN++, an improved recurrent network for video predictive learning. In pursuit of greater spatiotemporal modeling capability, our approach increases the transition depth between adjacent states by leveraging a novel recurrent unit, named Causal LSTM, which re-organizes the spatial and temporal memories in a cascaded mechanism. However, there remains a dilemma in video predictive learning: increasingly deep-in-time models have been designed for capturing complex variations, while introducing more difficulties in gradient back-propagation. To alleviate this undesirable effect, we propose a Gradient Highway architecture, which provides alternative shorter routes for gradient flows from outputs back to long-range inputs. This architecture works seamlessly with Causal LSTMs, enabling PredRNN++ to capture short-term and long-term dependencies adaptively. We assess our model on both synthetic and real video datasets, showing its ability to ease the vanishing gradient problem and yield state-of-the-art prediction results even in a difficult object-occlusion scenario.
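To make the "shorter routes for gradient flows" concrete, below is a minimal sketch of one Gradient Highway Unit step, reduced to a fully connected form for brevity (the model in the paper operates on feature maps with convolutions). The weight names and the toy dimensions are illustrative assumptions, not the paper's code: a switch gate interpolates between a transformed input and the previous highway state, so when the gate is near zero the state, and hence the gradient, passes through nearly unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)

def ghu_step(x_t, z_prev, params):
    """One Gradient Highway Unit step (simplified, fully connected sketch).

    p_t : candidate transform of the current input and highway state
    s_t : switch gate in (0, 1) choosing between p_t and z_prev
    """
    W_px, W_pz, W_sx, W_sz = params
    p_t = np.tanh(x_t @ W_px + z_prev @ W_pz)
    s_t = 1.0 / (1.0 + np.exp(-(x_t @ W_sx + z_prev @ W_sz)))  # sigmoid gate
    # When s_t -> 0, z_prev is copied forward unchanged, giving gradients
    # a short multiplicative-free route across many time steps.
    return s_t * p_t + (1.0 - s_t) * z_prev

d = 8  # toy feature size (assumption for illustration)
params = [rng.standard_normal((d, d)) * 0.1 for _ in range(4)]
z = np.zeros(d)
for t in range(20):
    x = rng.standard_normal(d)
    z = ghu_step(x, z, params)
print(z.shape)  # shape of the highway state after 20 steps
```

Because the update is a convex combination of a `tanh` output and the previous state, the highway state stays bounded in `[-1, 1]` regardless of sequence length, which is part of what keeps long-range back-propagation stable.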

ICML 2018

Results from the Paper

 Ranked #1 on Video Prediction on KTH (Cond metric)

| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Video Prediction | KTH | PredRNN++ | PSNR | 28.47 | #6 |
| Video Prediction | KTH | PredRNN++ | SSIM | 0.865 | #7 |
| Video Prediction | KTH | PredRNN++ | Cond | 10 | #1 |
| Video Prediction | KTH | PredRNN++ | Pred | 20 | #1 |
| Video Prediction | Moving MNIST | Causal LSTM | MSE | 46.5 | #24 |
| Video Prediction | Moving MNIST | Causal LSTM | MAE | 106.8 | #18 |
| Video Prediction | Moving MNIST | Causal LSTM | SSIM | 0.898 | #19 |
| Video Prediction | SynpickVP | PredRNN++ | MSE | 51.73 | #1 |
| Video Prediction | SynpickVP | PredRNN++ | PSNR | 27.50 | #2 |
| Video Prediction | SynpickVP | PredRNN++ | SSIM | 0.894 | #1 |
| Video Prediction | SynpickVP | PredRNN++ | LPIPS | 0.053 | #2 |