Video Generation
243 papers with code • 15 benchmarks • 14 datasets
(Various video generation tasks. GIF credit: MAGVIT)
Latest papers
VideoElevator: Elevating Video Generation Quality with Versatile Text-to-Image Diffusion Models
Unlike conventional T2V sampling (i.e., joint temporal and spatial modeling), VideoElevator explicitly decomposes each sampling step into temporal motion refining and spatial quality elevating.
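The decomposition described above can be illustrated with a minimal sketch. This is not VideoElevator's actual implementation: the two sub-step functions below are hypothetical stand-ins (a cross-frame smoothing pass for temporal refining, a trivial per-frame denoising pass for spatial elevating) that only show the control flow of splitting each sampling step in two.

```python
import numpy as np


def temporal_motion_refine(latents):
    # Hypothetical stand-in for a T2V model's temporal pass:
    # smooth each spatial location across the frame axis to
    # encourage coherent motion between adjacent frames.
    kernel = [0.25, 0.5, 0.25]
    padded = np.pad(latents, ((1, 1), (0, 0), (0, 0)), mode="edge")
    n = latents.shape[0]
    return sum(k * padded[i:i + n] for i, k in enumerate(kernel))


def spatial_quality_elevate(latents):
    # Hypothetical stand-in for a T2I model's per-frame pass:
    # each frame is refined independently (here, a toy shrink
    # toward zero standing in for one denoising step).
    return latents * 0.9


def decomposed_sampling(latents, steps=4):
    # Each sampling step is split into the two sub-steps,
    # applied in sequence, instead of one joint update.
    for _ in range(steps):
        latents = temporal_motion_refine(latents)
        latents = spatial_quality_elevate(latents)
    return latents
```

The point of the sketch is only the loop structure: one joint denoising update is replaced by a temporal sub-step followed by a spatial sub-step, each of which could be served by a different pretrained model.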
UniCtrl: Improving the Spatiotemporal Consistency of Text-to-Video Diffusion Models via Training-Free Unified Attention Control
Video Diffusion Models have been developed for video generation, usually integrating text and image conditioning to enhance control over the generated content.
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
Sora is a text-to-video generative AI model, released by OpenAI in February 2024.
VGMShield: Mitigating Misuse of Video Generative Models
Together with fake video detection and tracing, our multi-faceted set of solutions can effectively mitigate misuse of video generative models.
Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation
Diffusion models have proven to be highly effective in image and video generation; however, they still face composition challenges when generating images of varying sizes due to single-scale training data.
Magic-Me: Identity-Specific Video Customized Diffusion
To achieve this, we propose three novel components that are essential for high-quality identity preservation and stable video generation: 1) a noise initialization method with a 3D Gaussian Noise Prior for better inter-frame stability; 2) an ID module based on extended Textual Inversion, trained on the cropped identity to disentangle the ID information from the background; and 3) Face VCD and Tiled VCD modules to reinforce faces and upscale the video to higher resolution while preserving the identity's features.
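The idea behind a correlated noise prior for inter-frame stability can be sketched in a few lines. This is a conceptual illustration, not the paper's exact formulation: the mixing weight `alpha` and the variance-preserving rescaling below are assumptions. Each frame's initial noise blends a base tensor shared by all frames with independent per-frame noise, so frames start the diffusion process from correlated latents.

```python
import numpy as np


def init_correlated_noise(num_frames, shape, alpha=0.5, seed=0):
    """Sample initial diffusion noise whose frames are correlated.

    alpha controls how much of each frame's noise comes from a
    base tensor shared across frames (alpha=0 gives fully
    independent frames; alpha=1 gives identical frames).
    """
    rng = np.random.default_rng(seed)
    base = rng.standard_normal(shape)                    # shared by all frames
    residual = rng.standard_normal((num_frames, *shape))  # per-frame noise
    # sqrt weights keep each element unit-variance Gaussian
    return np.sqrt(alpha) * base + np.sqrt(1.0 - alpha) * residual
```

With `alpha=0.5`, any two frames share half their noise variance, which nudges the sampler toward temporally consistent content without collapsing all frames to the same image.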
InteractiveVideo: User-Centric Controllable Video Generation with Synergistic Multimodal Instructions
We introduce $\textit{InteractiveVideo}$, a user-centric framework for video generation.
AnimateLCM: Accelerating the Animation of Personalized Diffusion Models and Adapters with Decoupled Consistency Learning
We validate the proposed strategy in image-conditioned video generation and layout-conditioned video generation, all achieving top-performing results.
DDMI: Domain-Agnostic Latent Diffusion Models for Synthesizing High-Quality Implicit Neural Representations
Arguably, this architecture limits the expressive power of generative models and results in low-quality INR generation.
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
Based on this stronger coupling, we shift the distribution to higher quality without motion degradation by finetuning spatial modules with high-quality images, resulting in a generic high-quality video model.