Video Prediction

192 papers with code • 19 benchmarks • 24 datasets

Video Prediction is the task of predicting future frames given past video frames.

Source: Photo-Realistic Video Prediction on Natural Videos of Largely Changing Frames
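
As a rough illustration of the task definition above, the sketch below splits a clip into past and future frames, predicts the future with a trivial "repeat the last observed frame" baseline, and scores the prediction with mean squared error. This is not taken from any paper listed here. The function names, the nested-list frame layout, and the choice of baseline and metric are all assumptions made for the example.

```python
from typing import List, Tuple

# Assumption: one grayscale frame is stored as rows of pixel intensities.
Frame = List[List[float]]


def split_clip(clip: List[Frame], num_past: int) -> Tuple[List[Frame], List[Frame]]:
    """Split a clip into observed (past) frames and target (future) frames."""
    if not 0 < num_past < len(clip):
        raise ValueError("num_past must leave at least one past and one future frame")
    return clip[:num_past], clip[num_past:]


def copy_last_frame(past: List[Frame], horizon: int) -> List[Frame]:
    """Hypothetical trivial baseline: repeat the last observed frame `horizon` times."""
    last = past[-1]
    return [[row[:] for row in last] for _ in range(horizon)]


def mean_squared_error(pred: List[Frame], target: List[Frame]) -> float:
    """Average squared pixel error over all predicted frames."""
    total, count = 0.0, 0
    for p_frame, t_frame in zip(pred, target):
        for p_row, t_row in zip(p_frame, t_frame):
            for p, t in zip(p_row, t_row):
                total += (p - t) ** 2
                count += 1
    return total / count


# Toy clip: five 2x2 frames whose brightness rises by 1.0 at each step.
clip = [[[float(t)] * 2 for _ in range(2)] for t in range(5)]
past, future = split_clip(clip, num_past=3)
pred = copy_last_frame(past, horizon=len(future))
error = mean_squared_error(pred, future)
print(f"MSE of the copy-last-frame baseline: {error}")
```

A learned model would replace `copy_last_frame` while keeping the same inputs and outputs: a list of past frames in, a list of predicted future frames out.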

Latest papers with no code

Video Prediction Policy: A Generalist Robot Policy with Predictive Visual Representations

no code yet • 19 Dec 2024

Recent advancements in robotics have focused on developing generalist policies capable of performing multiple tasks.

STIV: Scalable Text and Image Conditioned Video Generation

no code yet • 10 Dec 2024

The field of video generation has made remarkable advancements, yet there remains a pressing need for a clear, systematic recipe that can guide the development of robust and scalable models.

Efficient Continuous Video Flow Model for Video Prediction

no code yet • 7 Dec 2024

In this paper, we propose a novel approach to modeling the multi-step process, aimed at alleviating latency constraints and facilitating the adaptation of such processes for video prediction tasks.

Continuous Video Process: Modeling Videos as Continuous Multi-Dimensional Processes for Video Prediction

no code yet • 6 Dec 2024

Diffusion models have made significant strides in image generation, mastering tasks such as unconditional image synthesis, text-image translation, and image-to-image conversions.

Lightweight Stochastic Video Prediction via Hybrid Warping

no code yet • 4 Dec 2024

Accurate video prediction by deep neural networks, especially for dynamic regions, is a challenging task in computer vision for critical applications such as autonomous driving, remote working, and telemedicine.

Distributed solar generation forecasting using attention-based deep neural networks for cloud movement prediction

no code yet • 17 Nov 2024

We investigate and discuss the impact of cloud forecasts from attention-based methods towards forecasting distributed solar generation, compared to cloud forecasts from non-attention-based methods.

Pre-trained Visual Dynamics Representations for Efficient Policy Learning

no code yet • 5 Nov 2024

To address the challenge of pre-training with videos, we propose Pre-trained Visual Dynamics Representations (PVDR) to bridge the domain gap between videos and downstream tasks for efficient policy learning.

Video prediction using score-based conditional density estimation

no code yet • 30 Oct 2024

Furthermore, analysis of networks trained on natural image sequences reveals that the representation automatically weights predictive evidence by its reliability, which is a hallmark of statistical inference.

GHIL-Glue: Hierarchical Control with Filtered Subgoal Images

no code yet • 26 Oct 2024

Image and video generative models that are pre-trained on Internet-scale data can greatly increase the generalization capacity of robot learning systems.

Dynamic 3D Gaussian Tracking for Graph-Based Neural Dynamics Modeling

no code yet • 24 Oct 2024

However, existing video prediction approaches typically do not explicitly account for the 3D information from videos, such as robot actions and objects' 3D states, limiting their use in real-world robotic applications.