Video Prediction

183 papers with code • 19 benchmarks • 24 datasets

Video Prediction is the task of predicting future frames given past video frames.
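
To make the task definition concrete, here is a minimal sketch of the input and output shapes, using a naive "copy the last frame" baseline scored with mean squared error. The function names, the toy video, and the choice of metric are illustrative assumptions, not taken from any listed paper or benchmark.

```python
from typing import List

Frame = List[List[float]]  # one grayscale frame: H rows of W pixel values


def copy_last_frame_baseline(past_frames: List[Frame], num_future: int) -> List[Frame]:
    """Predict `num_future` frames by repeating the last observed frame."""
    if not past_frames:
        raise ValueError("need at least one past frame")
    last = past_frames[-1]
    # Copy each row so predicted frames do not alias the input.
    return [[row[:] for row in last] for _ in range(num_future)]


def mean_squared_error(predicted: List[Frame], target: List[Frame]) -> float:
    """Average squared pixel difference over all frames and pixels."""
    total, count = 0.0, 0
    for pred_frame, true_frame in zip(predicted, target):
        for pred_row, true_row in zip(pred_frame, true_frame):
            for p, t in zip(pred_row, true_row):
                total += (p - t) ** 2
                count += 1
    return total / count


def make_frame(col: int, width: int = 4) -> Frame:
    """Hypothetical 1 x `width` frame with a single bright pixel at `col`."""
    return [[1.0 if c == col else 0.0 for c in range(width)]]


# Toy video: the bright pixel moves one column to the right per frame.
video = [make_frame(c) for c in range(4)]
past, future = video[:2], video[2:]  # condition on 2 frames, predict 2 frames

prediction = copy_last_frame_baseline(past, num_future=len(future))
error = mean_squared_error(prediction, future)
```

Because the baseline ignores motion, its error grows whenever the scene changes; a learned model is expected to do better by modelling how the pixel moves.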

Gif credit: MAGVIT

Source: Photo-Realistic Video Prediction on Natural Videos of Largely Changing Frames

Benchmarks

These leaderboards are used to track progress in Video Prediction.

Dataset	Best Model
KTH	Grid-keypoints
Moving MNIST	SimVP+gSTA-Sx10
Kinetics-600 12 frames, 64x64	MAGVIT-v2
Human3.6M	IAM4VP
BAIR Robot Pushing	MAGVIT (-L-FP)
Cityscapes 128x128	GHVAEs
SynpickVP	MSPred
CMU Mocap-2	Latent SDE
Cityscapes	DMVFN
KITTI	DMVFN
CMU Mocap-1	ODE2VAE-KL
DAVIS 2017	DMVFN
Vimeo90K	OPT
Colored dSprites	MGP-VAE (with geodesic loss)
Sprites	MGP-VAE (with geodesic loss)
YouTube-8M	SDCNet
KTH 64x64 cond10 pred30	SRVP
Something-Something V2	MAGVIT
MPI Sintel	MCnet [villegas2017mcnet]

Libraries

Use these libraries to find Video Prediction models and implementations.

chengtan9907/simvpv2 • 10 papers • 568

chengtan9907/OpenSTL • 4 papers • 571

Flunzmas/vp-suite • 3 papers

tensorflow/tensor2tensor • 2 papers • 14,878

See all 6 libraries.

Latest papers

Generalized Predictive Model for Autonomous Driving

opendrivelab/driveagi • 14 Mar 2024

In this paper, we introduce the first large-scale video prediction model in the autonomous driving discipline.

359

General surgery vision transformer: A video pre-trained foundation model for general surgery

samuelschmidgall/gsvit • 9 Mar 2024

The absence of openly accessible data and specialized foundation models is a major barrier for computational research in surgery.

Switch EMA: A Free Lunch for Better Flatness and Sharpness

Westlake-AI/openmixup • 14 Feb 2024

Exponential Moving Average (EMA) is a widely used weight averaging (WA) regularization to learn flat optima for better generalizations without extra cost in deep neural network (DNN) optimization.

568

STDiff: Spatio-temporal Diffusion for Continuous Stochastic Video Prediction

xiye20/stdiffproject • 11 Dec 2023

Predicting future frames of a video is challenging because it is difficult to learn the uncertainty of the underlying factors influencing their contents.

SVQ: Sparse Vector Quantization for Spatiotemporal Forecasting

Pachark/SVQ-Forecasting • 6 Dec 2023

Moreover, we approximate the sparse regression process using a blend of a two-layer MLP and an extensive codebook.

Rethinking Urban Mobility Prediction: A Super-Multivariate Time Series Forecasting Approach

chengyui/sumformer • 4 Dec 2023

To address this challenge, we present the Super-Multivariate Urban Mobility Transformer (SUMformer), which utilizes a specially designed attention mechanism to calculate temporal and cross-variable correlations and reduce computational costs stemming from a large number of time series.

Pair-wise Layer Attention with Spatial Masking for Video Prediction

mlvccn/pla_sm_videopred • 19 Nov 2023

To this end, we present a Pair-wise Layer Attention with Spatial Masking (PLA-SM) framework for video prediction to capture the spatiotemporal dynamics, which reflect the motion trend.

MMVP: Motion-Matrix-based Video Prediction

kay1794/mmvp-motion-matrix-based-video-prediction • ICCV 2023

A central challenge of video prediction lies where the system has to reason the objects' future motions from image frames while simultaneously maintaining the consistency of their appearances across frames.

30 Aug 2023

SwinLSTM: Improving Spatiotemporal Prediction Accuracy using Swin Transformer and LSTM

SongTang-x/SwinLSTM • 19 Aug 2023

In this paper, we propose a new recurrent cell, SwinLSTM, which integrates Swin Transformer blocks and the simplified LSTM, an extension that replaces the convolutional structure in ConvLSTM with the self-attention mechanism.

Neural Multigrid Memory For Computational Fluid Dynamics

combi2k2/mg-turbulent-flow • 21 Jun 2023

Turbulent flow simulation plays a crucial role in various applications, including aircraft and ship design, industrial process optimization, and weather prediction.
