Video Prediction
183 papers with code • 19 benchmarks • 24 datasets
Video Prediction is the task of predicting future frames given past video frames.
Gif credit: MAGVIT
Source: Photo-Realistic Video Prediction on Natural Videos of Largely Changing Frames
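To make the task's inputs and outputs concrete, here is a minimal sketch of a "copy the last frame" baseline, scored with mean squared error. It is only an illustration and is not taken from any of the listed papers. The names `predict_last_frame`, `mse`, and `Frame`, and the toy 2x2 video, are assumptions made for the example.

```python
from typing import List

# Assumption for this sketch: a grayscale frame is a list of pixel rows.
Frame = List[List[float]]


def predict_last_frame(past: List[Frame], horizon: int) -> List[Frame]:
    """Naive baseline: repeat the most recent observed frame `horizon` times."""
    if not past:
        raise ValueError("need at least one past frame")
    last = past[-1]
    # Copy each row so predicted frames do not alias the input frame.
    return [[row[:] for row in last] for _ in range(horizon)]


def mse(pred: List[Frame], target: List[Frame]) -> float:
    """Mean squared error averaged over every pixel of every frame."""
    total, count = 0.0, 0
    for p_frame, t_frame in zip(pred, target):
        for p_row, t_row in zip(p_frame, t_frame):
            for p, t in zip(p_row, t_row):
                total += (p - t) ** 2
                count += 1
    return total / count


# Toy 2x2 video: a bright pixel moves from the top-left to the top-right.
past = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 1.0], [0.0, 0.0]],
]
# Hypothetical ground truth in which the pixel stays where it was last seen.
future = [[[0.0, 1.0], [0.0, 0.0]]]

pred = predict_last_frame(past, horizon=1)
error = mse(pred, future)
```

A learned model would replace `predict_last_frame` with a function of the same shape: past frames in, `horizon` future frames out. A trivial baseline like this gives a reference point for comparing error values.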
Latest papers
Generalized Predictive Model for Autonomous Driving
In this paper, we introduce the first large-scale video prediction model in the autonomous driving discipline.
General surgery vision transformer: A video pre-trained foundation model for general surgery
The absence of openly accessible data and specialized foundation models is a major barrier for computational research in surgery.
Switch EMA: A Free Lunch for Better Flatness and Sharpness
Exponential Moving Average (EMA) is a widely used weight averaging (WA) regularization to learn flat optima for better generalizations without extra cost in deep neural network (DNN) optimization.
STDiff: Spatio-temporal Diffusion for Continuous Stochastic Video Prediction
Predicting future frames of a video is challenging because it is difficult to learn the uncertainty of the underlying factors influencing their contents.
SVQ: Sparse Vector Quantization for Spatiotemporal Forecasting
Moreover, we approximate the sparse regression process using a blend of a two-layer MLP and an extensive codebook.
Rethinking Urban Mobility Prediction: A Super-Multivariate Time Series Forecasting Approach
To address this challenge, we present the Super-Multivariate Urban Mobility Transformer (SUMformer), which utilizes a specially designed attention mechanism to calculate temporal and cross-variable correlations and reduce computational costs stemming from a large number of time series.
Pair-wise Layer Attention with Spatial Masking for Video Prediction
To this end, we present a Pair-wise Layer Attention with Spatial Masking (PLA-SM) framework for video prediction to capture the spatiotemporal dynamics, which reflect the motion trend.
MMVP: Motion-Matrix-based Video Prediction
A central challenge of video prediction lies where the system has to reason the objects' future motions from image frames while simultaneously maintaining the consistency of their appearances across frames.
SwinLSTM: Improving Spatiotemporal Prediction Accuracy using Swin Transformer and LSTM
In this paper, we propose a new recurrent cell, SwinLSTM, which integrates Swin Transformer blocks and the simplified LSTM, an extension that replaces the convolutional structure in ConvLSTM with the self-attention mechanism.
Neural Multigrid Memory For Computational Fluid Dynamics
Turbulent flow simulation plays a crucial role in various applications, including aircraft and ship design, industrial process optimization, and weather prediction.