Video Prediction

183 papers with code • 19 benchmarks • 24 datasets

Video Prediction is the task of predicting future frames given past video frames.

Source: Photo-Realistic Video Prediction on Natural Videos of Largely Changing Frames
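
As a rough illustration of the task definition above, the sketch below splits a clip into past (context) frames and future (target) frames, then scores a trivial copy-last-frame baseline with mean squared error. The helper names `split_clip`, `copy_last_frame`, and `mse`, the `(T, H, W, C)` array layout, and the synthetic clip are assumptions made for this example; they are not taken from any paper or library listed on this page.

```python
import numpy as np


def split_clip(clip, num_context):
    """Split a clip of shape (T, H, W, C) into past and future frames."""
    if not 0 < num_context < clip.shape[0]:
        raise ValueError("num_context must be between 1 and T - 1")
    return clip[:num_context], clip[num_context:]


def copy_last_frame(context, num_future):
    """Hypothetical baseline: repeat the last observed frame for each future step."""
    return np.repeat(context[-1:], num_future, axis=0)


def mse(pred, target):
    """Mean squared error over all frames and pixels."""
    diff = pred.astype(np.float64) - target.astype(np.float64)
    return float(np.mean(diff ** 2))


# Synthetic clip (assumption): 8 frames of 4x4 single-channel video
# whose brightness increases by 1 on every frame.
clip = np.stack([np.full((4, 4, 1), t, dtype=np.float32) for t in range(8)])

context, target = split_clip(clip, num_context=5)   # frames 0..4 and 5..7
pred = copy_last_frame(context, num_future=len(target))
error = mse(pred, target)
print(pred.shape, error)
```

A learned model would replace `copy_last_frame`; a baseline like this is only useful as a point of comparison, since it ignores motion entirely.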

Latest papers with no code

Learning 3D Particle-based Simulators from RGB-D Videos

no code yet • 8 Dec 2023

Realistic simulation is critical for applications ranging from robotics to animation.

ViP-Mixer: A Convolutional Mixer for Video Prediction

no code yet • 20 Nov 2023

Video prediction aims to predict future frames from a video's previous content.

SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction

no code yet • 31 Oct 2023

The goal is to generate high-quality long videos with smooth and creative transitions between scenes and varying lengths of shot-level videos.

Variational Inference for SDEs Driven by Fractional Noise

no code yet • 19 Oct 2023

In this paper, building upon the Markov approximation of fBM, we derive the evidence lower bound essential for efficient variational inference of posterior path measures, drawing from the well-established field of stochastic analysis.

Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation

no code yet • 9 Oct 2023

While Large Language Models (LLMs) are the dominant models for generative tasks in language, they do not perform as well as diffusion models on image and video generation.

Future Video Prediction from a Single Frame for Video Anomaly Detection

no code yet • 15 Aug 2023

Inspired by the abilities of the future frame prediction proxy-task, we introduce the task of future video prediction from a single frame, as a novel proxy-task for video anomaly detection.

S-HR-VQVAE: Sequential Hierarchical Residual Learning Vector Quantized Variational Autoencoder for Video Prediction

no code yet • 13 Jul 2023

We address the video prediction task by putting forth a novel model that combines (i) our recently proposed hierarchical residual vector quantized variational autoencoder (HR-VQVAE), and (ii) a novel spatiotemporal PixelCNN (ST-PixelCNN).

Action-conditioned Deep Visual Prediction with RoAM, a new Indoor Human Motion Dataset for Autonomous Robots

no code yet • 28 Jun 2023

With the increasing adoption of robots across industries, it is crucial to focus on developing advanced algorithms that enable robots to anticipate, comprehend, and plan their actions effectively in collaboration with humans.

Physion++: Evaluating Physical Scene Understanding that Requires Online Inference of Different Physical Properties

no code yet • NeurIPS 2023

Specifically, we test scenarios where accurate prediction relies on estimates of properties such as mass, friction, elasticity, and deformability, and where the values of those properties can only be inferred by observing how objects move and interact with other objects or fluids.

SlotDiffusion: Object-Centric Generative Modeling with Diffusion Models

no code yet • NeurIPS 2023

Finally, we demonstrate the scalability of SlotDiffusion to unconstrained real-world datasets such as PASCAL VOC and COCO, when integrated with self-supervised pre-trained image encoders.