Video Prediction

142 papers with code • 14 benchmarks • 17 datasets

Video Prediction is the task of predicting future frames given past video frames.

Source: Photo-Realistic Video Prediction on Natural Videos of Largely Changing Frames
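As a rough illustration of the task definition above, the sketch below splits a toy clip into past and future frames and scores a naive "copy the last frame" baseline with mean squared error. The helper names (`split_clip`, `copy_last_frame`, `mse`), the toy clip, and the choice of MSE as the metric are all assumptions made for illustration; they are not taken from any paper listed here.

```python
from typing import List

Frame = List[List[float]]  # one grayscale frame as rows of pixel intensities


def split_clip(clip: List[Frame], n_context: int):
    """Split a clip into observed (past) frames and the future frames to predict."""
    if not 0 < n_context < len(clip):
        raise ValueError("n_context must leave at least one past and one future frame")
    return clip[:n_context], clip[n_context:]


def copy_last_frame(context: List[Frame], n_future: int) -> List[Frame]:
    """Hypothetical naive baseline: repeat the last observed frame for every future step."""
    last = context[-1]
    return [[row[:] for row in last] for _ in range(n_future)]


def mse(pred: List[Frame], target: List[Frame]) -> float:
    """Mean squared error averaged over all pixels of all frames."""
    total, count = 0.0, 0
    for p_frame, t_frame in zip(pred, target):
        for p_row, t_row in zip(p_frame, t_frame):
            for p, t in zip(p_row, t_row):
                total += (p - t) ** 2
                count += 1
    return total / count


# Toy clip (assumed data): one bright pixel moving one column to the right per frame.
clip = [[[1.0 if col == t else 0.0 for col in range(4)]] for t in range(4)]
context, target = split_clip(clip, n_context=2)
prediction = copy_last_frame(context, n_future=len(target))
error = mse(prediction, target)
```

A learned model would replace `copy_last_frame`; the baseline is only there to make the input/output contract of the task concrete.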



Latest papers with no code

Allo-centric Occupancy Grid Prediction for Urban Traffic Scene Using Video Prediction Networks

no code yet • 11 Jan 2023

This allows the static scene to remain fixed and the motion of the ego-vehicle to be represented on the grid like that of other agents.

Long-horizon video prediction using a dynamic latent hierarchy

no code yet • 29 Dec 2022

The task of video prediction and generation is notoriously difficult, with research in this area largely limited to short-term predictions.

Motion and Context-Aware Audio-Visual Conditioned Video Prediction

no code yet • 9 Dec 2022

The existing state-of-the-art method for audio-visual conditioned video prediction uses the latent codes of the audio-visual frames from a multimodal stochastic network and a frame encoder to predict the next visual frame.

MIMO Is All You Need: A Strong Multi-In-Multi-Out Baseline for Video Prediction

no code yet • 9 Dec 2022

Most existing approaches to video prediction build their models on a Single-In-Single-Out (SISO) architecture, which takes the current frame as input and predicts the next frame recursively.

Randomized Conditional Flow Matching for Video Prediction

no code yet • 26 Nov 2022

We call our model Random frame conditional flow Integration for VidEo pRediction, or, in short, RIVER.

Tell Me What Happened: Unifying Text-guided Video Completion via Multimodal Masked Video Generation

no code yet • 23 Nov 2022

Inspired by this, we introduce a novel task, text-guided video completion (TVC), which requests the model to generate a video from partial frames guided by an instruction.

3D-CSL: self-supervised 3D context similarity learning for Near-Duplicate Video Retrieval

no code yet • 10 Nov 2022

In this paper, we introduce 3D-CSL, a compact pipeline for Near-Duplicate Video Retrieval (NDVR), and explore a novel self-supervised learning strategy for video similarity learning.

SlotFormer: Unsupervised Visual Dynamics Simulation with Object-Centric Models

no code yet • 12 Oct 2022

While recent object-centric models can successfully decompose a scene into objects, modeling their dynamics effectively still remains a challenge.

Continuous conditional video synthesis by neural processes

no code yet • 11 Oct 2022

We propose a unified model for multiple conditional video synthesis tasks, including video prediction and video frame interpolation.

See, Plan, Predict: Language-guided Cognitive Planning with Video Prediction

no code yet • 7 Oct 2022

Cognitive planning is the structural decomposition of complex tasks into a sequence of future behaviors.