Browse > Computer Vision > Video > Video Prediction

Video Prediction

18 papers with code · Computer Vision
Subtask of Video

State-of-the-art leaderboards

No evaluation results yet. Help compare methods by submit evaluation metrics.

Greatest papers with code

Unsupervised Learning for Physical Interaction through Video Prediction

NeurIPS 2016 tensorflow/models

A core challenge for an agent learning to interact with the world is to predict how its actions affect objects in its environment. Many existing methods for learning the dynamics of physical interactions require labeled object information.

VIDEO PREDICTION

Video-to-Video Synthesis

NeurIPS 2018 NVIDIA/vid2vid

We study the problem of video-to-video synthesis, whose goal is to learn a mapping function from an input source video (e.g., a sequence of semantic segmentation masks) to an output photorealistic video that precisely depicts the content of the source video. Without understanding temporal dynamics, directly applying existing image synthesis approaches to an input video often results in temporally incoherent videos of low visual quality.

SEMANTIC SEGMENTATION VIDEO PREDICTION VIDEO-TO-VIDEO SYNTHESIS

Learning a Driving Simulator

3 Aug 2016commaai/research

Comma.ai's approach to Artificial Intelligence for self-driving cars is based on an agent that learns to clone driver behaviors and plans maneuvers by simulating future events in the road. This paper illustrates one of our research approaches for driving simulation.

SELF-DRIVING CARS VIDEO PREDICTION

Deep multi-scale video prediction beyond mean square error

17 Nov 2015dyelax/Adversarial_Video_Generation

Learning to predict future images from a video sequence involves the construction of an internal representation that models the image evolution accurately, and therefore, to some degree, its content and dynamics. This is why pixel-space video prediction may be viewed as a promising avenue for unsupervised feature learning.

OPTICAL FLOW ESTIMATION VIDEO PREDICTION

Deep Predictive Coding Networks for Video Prediction and Unsupervised Learning

25 May 2016coxlab/prednet

Here, we explore prediction of future frames in a video sequence as an unsupervised learning rule for learning about the structure of the visual world. We describe a predictive neural network ("PredNet") architecture that is inspired by the concept of "predictive coding" from the neuroscience literature.

OBJECT RECOGNITION VIDEO PREDICTION

Stochastic Adversarial Video Prediction

ICLR 2019 alexlee-gk/video_prediction

However, learning to predict raw future observations, such as frames in a video, is exceedingly challenging -- the ambiguous nature of the problem can cause a naively designed model to average together possible futures into a single, blurry prediction. Recently, this has been addressed by two distinct approaches: (a) latent variational variable models that explicitly model underlying stochasticity and (b) adversarially-trained models that aim to produce naturalistic images.

REPRESENTATION LEARNING VIDEO PREDICTION

Predicting Deeper into the Future of Semantic Segmentation

ICCV 2017 facebookresearch/SegmPred

The ability to predict and therefore to anticipate the future is an important attribute of intelligence. It is also of utmost importance in real-time systems, e.g. in robotics or autonomous driving, which depend on visual scene understanding for decision making.

AUTONOMOUS DRIVING DECISION MAKING OPTICAL FLOW ESTIMATION SCENE UNDERSTANDING SEMANTIC SEGMENTATION VIDEO PREDICTION

Prediction Under Uncertainty with Error-Encoding Networks

14 Nov 2017mbhenaff/EEN

In this work we introduce a new framework for performing temporal predictions in the presence of uncertainty. It is based on a simple idea of disentangling components of the future state which are predictable from those which are inherently unpredictable, and encoding the unpredictable components into a low-dimensional latent variable which is fed into a forward model.

VIDEO PREDICTION

Predicting Future Instance Segmentation by Forecasting Convolutional Features

ECCV 2018 facebookresearch/instpred

Anticipating future events is an important prerequisite towards intelligent behavior. We apply the "detection head'" of Mask R-CNN on the predicted features to produce the instance segmentation of future frames.

INSTANCE SEGMENTATION OPTICAL FLOW ESTIMATION SEMANTIC SEGMENTATION VIDEO PREDICTION

The "something something" video database for learning and evaluating visual common sense

ICCV 2017 TwentyBN/smth-smth-v2-baseline-with-models

Neural networks trained on datasets such as ImageNet have led to major advances in visual object classification. One obstacle that prevents networks from reasoning more deeply about complex scenes and situations, and from integrating visual knowledge with natural language, like humans do, is their lack of common sense knowledge about the physical world.

COMMON SENSE REASONING OBJECT CLASSIFICATION VIDEO PREDICTION