Video Generation
264 papers with code • 15 benchmarks • 14 datasets
(Various video generation tasks. GIF credit: MaGViT)
Libraries
Use these libraries to find Video Generation models and implementations.
Most implemented papers
Hierarchical Video Generation from Orthogonal Information: Optical Flow and Texture
FlowGAN generates optical flow, which captures only the edges and motion of the videos to be generated.
Stochastic Video Generation with a Learned Prior
Sample generations are both varied and sharp, even many frames into the future, and compare favorably to those from existing approaches.
Point-to-Point Video Generation
We introduce point-to-point video generation that controls the generation process with two control points: the targeted start- and end-frames.
Hierarchical Patch VAE-GAN: Generating Diverse Videos from a Single Sample
We consider the task of generating diverse and novel videos from a single video sample.
VideoGPT: Video Generation using VQ-VAE and Transformers
We present VideoGPT: a conceptually simple architecture for scaling likelihood based generative modeling to natural videos.
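VideoGPT's two-stage recipe can be sketched in miniature: a VQ-VAE first compresses video into a sequence of discrete codebook tokens, and a transformer then models those tokens autoregressively. The sketch below (illustrative only, not the paper's code; all names are hypothetical) reduces stage one to nearest-neighbor codebook lookup:

```python
import numpy as np

def quantize(latents, codebook):
    """Map each latent vector to the index of its nearest codebook entry."""
    # latents: (N, D), codebook: (K, D) -> token indices: (N,)
    d = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return d.argmin(axis=1)

def dequantize(tokens, codebook):
    """Recover the quantized latents from token indices."""
    return codebook[tokens]

rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 4))   # K=8 codes of dimension 4
latents = rng.normal(size=(16, 4))   # 16 latent vectors from a "video"

tokens = quantize(latents, codebook) # discrete sequence a transformer would model
recon = dequantize(tokens, codebook) # quantized reconstruction of the latents
```

In the real model the encoder, decoder, and autoregressive prior are learned networks; the point here is only the discretize-then-model pipeline.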
Video Diffusion Models
Generating temporally coherent high fidelity video is an important milestone in generative modeling research.
Exploring Video Quality Assessment on User Generated Contents from Aesthetic and Technical Perspectives
In light of this, we propose the Disentangled Objective Video Quality Evaluator (DOVER) to learn the quality of UGC videos based on the two perspectives.
Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation
To replicate the success of text-to-image (T2I) generation, recent works employ large-scale video datasets to train a text-to-video (T2V) generator.
Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models
We first pre-train an LDM on images only; then, we turn the image generator into a video generator by introducing a temporal dimension to the latent space diffusion model and fine-tuning on encoded image sequences, i.e., videos.
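The "inflation" idea above can be sketched with plain arrays: spatial layers of the pre-trained image model run on each frame independently, while an inserted temporal layer mixes information across the new time axis. This is an illustrative sketch under assumed shapes, not the paper's implementation; `spatial_layer` and `temporal_layer` are hypothetical stand-ins:

```python
import numpy as np

def spatial_layer(x):
    # Placeholder for a per-frame op of the image model on (B*T, C, H, W).
    return 0.5 * x

def temporal_layer(x, T):
    # Reshape (B*T, C, H, W) -> (B, T, C, H, W) and mix over the time axis,
    # standing in for a learned temporal attention/convolution block.
    BT, C, H, W = x.shape
    B = BT // T
    v = x.reshape(B, T, C, H, W)
    mixed = v + v.mean(axis=1, keepdims=True)  # residual temporal mixing
    return mixed.reshape(BT, C, H, W)

B, T, C, H, W = 2, 4, 3, 8, 8
video = np.random.default_rng(1).normal(size=(B * T, C, H, W))
out = temporal_layer(spatial_layer(video), T)
```

Keeping frames flattened into the batch dimension for the spatial layers is what lets the image weights be reused unchanged; only the temporal blocks are new parameters to fine-tune.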
FreeNoise: Tuning-Free Longer Video Diffusion via Noise Rescheduling
With the availability of large-scale video datasets and the advances of diffusion models, text-driven video generation has achieved substantial progress.