Video Prediction

183 papers with code • 19 benchmarks • 24 datasets

Video Prediction is the task of predicting future frames given past video frames.
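
As a rough illustration of the task definition above, the following sketch splits a clip into past (context) and future (target) frames and scores a naive copy-last-frame baseline against the targets. Everything here is an assumption made for illustration: the helper names `split_clip`, `copy_last_frame`, and `mse` are hypothetical, the clip is random synthetic data rather than any benchmark listed on this page, and the 10/10 frame split is arbitrary.

```python
import numpy as np


def split_clip(clip, num_context):
    """Split a clip of shape (T, H, W, C) into past and future frames."""
    if not 0 < num_context < clip.shape[0]:
        raise ValueError("num_context must leave at least one past and one future frame")
    return clip[:num_context], clip[num_context:]


def copy_last_frame(context, num_future):
    """Naive baseline: repeat the last observed frame for every future step."""
    return np.repeat(context[-1:], num_future, axis=0)


def mse(pred, target):
    """Mean squared error averaged over all frames and pixels."""
    return float(np.mean((pred - target) ** 2))


# Synthetic stand-in for a video: 20 frames of 64x64 grayscale (assumption).
rng = np.random.default_rng(0)
clip = rng.random((20, 64, 64, 1), dtype=np.float32)

context, target = split_clip(clip, num_context=10)
pred = copy_last_frame(context, num_future=target.shape[0])

print(pred.shape, mse(pred, target))
```

A learned model would replace `copy_last_frame`; the baseline is only there to make the input and output shapes of the task concrete.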

Gif credit: MAGVIT

Source: Photo-Realistic Video Prediction on Natural Videos of Largely Changing Frames

Benchmarks

These leaderboards track progress in Video Prediction.

Dataset	Best Model
KTH	Grid-keypoints
Moving MNIST	SimVP+gSTA-Sx10
Kinetics-600 12 frames, 64x64	MAGVIT-v2
Human3.6M	IAM4VP
BAIR Robot Pushing	MAGVIT (-L-FP)
Cityscapes 128x128	GHVAEs
SynpickVP	MSPred
CMU Mocap-2	Latent SDE
Cityscapes	DMVFN
KITTI	DMVFN
CMU Mocap-1	ODE2VAE-KL
DAVIS 2017	DMVFN
Vimeo90K	OPT
Colored dSprites	MGP-VAE (with geodesic loss)
Sprites	MGP-VAE (with geodesic loss)
YouTube-8M	SDCNet
KTH 64x64 cond10 pred30	SRVP
Something-Something V2	MAGVIT
MPI Sintel	MCnet [villegas2017mcnet]

Libraries

Use these libraries to find Video Prediction models and implementations.

chengtan9907/simvpv2 • 10 papers • 573 stars
chengtan9907/OpenSTL • 4 papers • 573 stars
Flunzmas/vp-suite • 3 papers
tensorflow/tensor2tensor • 2 papers • 14,883 stars

See all 6 libraries.

Latest papers with no code

Learning 3D Particle-based Simulators from RGB-D Videos

no code yet • 8 Dec 2023

Realistic simulation is critical for applications ranging from robotics to animation.

ViP-Mixer: A Convolutional Mixer for Video Prediction

no code yet • 20 Nov 2023

Video prediction aims to predict future frames from a video's previous content.

SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction

no code yet • 31 Oct 2023

The goal is to generate high-quality long videos with smooth and creative transitions between scenes and varying lengths of shot-level videos.

Variational Inference for SDEs Driven by Fractional Noise

no code yet • 19 Oct 2023

In this paper, building upon the Markov approximation of fBM, we derive the evidence lower bound essential for efficient variational inference of posterior path measures, drawing from the well-established field of stochastic analysis.

Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation

no code yet • 9 Oct 2023

While Large Language Models (LLMs) are the dominant models for generative tasks in language, they do not perform as well as diffusion models on image and video generation.

Future Video Prediction from a Single Frame for Video Anomaly Detection

no code yet • 15 Aug 2023

Inspired by the abilities of the future frame prediction proxy-task, we introduce the task of future video prediction from a single frame, as a novel proxy-task for video anomaly detection.

S-HR-VQVAE: Sequential Hierarchical Residual Learning Vector Quantized Variational Autoencoder for Video Prediction

no code yet • 13 Jul 2023

We address the video prediction task by putting forth a novel model that combines (i) our recently proposed hierarchical residual vector quantized variational autoencoder (HR-VQVAE), and (ii) a novel spatiotemporal PixelCNN (ST-PixelCNN).

Action-conditioned Deep Visual Prediction with RoAM, a new Indoor Human Motion Dataset for Autonomous Robots

no code yet • 28 Jun 2023

With the increasing adoption of robots across industries, it is crucial to focus on developing advanced algorithms that enable robots to anticipate, comprehend, and plan their actions effectively in collaboration with humans.

Physion++: Evaluating Physical Scene Understanding that Requires Online Inference of Different Physical Properties

no code yet • NeurIPS 2023

Specifically, we test scenarios where accurate prediction relies on estimates of properties such as mass, friction, elasticity, and deformability, and where the values of those properties can only be inferred by observing how objects move and interact with other objects or fluids.

SlotDiffusion: Object-Centric Generative Modeling with Diffusion Models

no code yet • NeurIPS 2023

Finally, we demonstrate the scalability of SlotDiffusion to unconstrained real-world datasets such as PASCAL VOC and COCO, when integrated with self-supervised pre-trained image encoders.
