Video Generation
243 papers with code • 15 benchmarks • 14 datasets
(Various video generation tasks. GIF credit: MAGVIT)
Latest papers
VideoElevator: Elevating Video Generation Quality with Versatile Text-to-Image Diffusion Models
Unlike conventional T2V sampling (i.e., joint temporal and spatial modeling), VideoElevator explicitly decomposes each sampling step into temporal motion refining and spatial quality elevating.
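The decomposition described above can be illustrated with a minimal sketch. This is not VideoElevator's actual implementation: the two sub-step functions below are hypothetical stand-ins (a cross-frame smoothing pass for temporal refining, a trivial per-frame denoising pass for spatial elevating) that only show the control flow of splitting each sampling step in two.

```python
import numpy as np


def temporal_motion_refine(latents):
    # Hypothetical stand-in for a T2V model's temporal pass:
    # smooth each spatial location across the frame axis to
    # encourage coherent motion between adjacent frames.
    kernel = [0.25, 0.5, 0.25]
    padded = np.pad(latents, ((1, 1), (0, 0), (0, 0)), mode="edge")
    n = latents.shape[0]
    return sum(k * padded[i:i + n] for i, k in enumerate(kernel))


def spatial_quality_elevate(latents):
    # Hypothetical stand-in for a T2I model's per-frame pass:
    # each frame is refined independently (here, a toy shrink
    # toward zero standing in for one denoising step).
    return latents * 0.9


def decomposed_sampling(latents, steps=4):
    # Each sampling step is split into the two sub-steps,
    # applied in sequence, instead of one joint update.
    for _ in range(steps):
        latents = temporal_motion_refine(latents)
        latents = spatial_quality_elevate(latents)
    return latents
```

The point of the sketch is only the loop structure: one joint denoising update is replaced by a temporal sub-step followed by a spatial sub-step, each of which could be served by a different pretrained model.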
UniCtrl: Improving the Spatiotemporal Consistency of Text-to-Video Diffusion Models via Training-Free Unified Attention Control
Video Diffusion Models have been developed for video generation, usually integrating text and image conditioning to enhance control over the generated content.
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
Sora is a text-to-video generative AI model, released by OpenAI in February 2024.
VGMShield: Mitigating Misuse of Video Generative Models
Together with fake video detection and tracing, our multi-faceted set of solutions can effectively mitigate misuse of video generative models.
Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation
Diffusion models have proven to be highly effective in image and video generation; however, they still face composition challenges when generating images of varying sizes due to single-scale training data.
Magic-Me: Identity-Specific Video Customized Diffusion
To achieve this, we propose three novel components that are essential for high-quality identity preservation and stable video generation: 1) a noise initialization method with a 3D Gaussian Noise Prior for better inter-frame stability; 2) an ID module based on extended Textual Inversion, trained on the cropped identity to disentangle the ID information from the background; and 3) Face VCD and Tiled VCD modules to reinforce faces and upscale the video to higher resolution while preserving the identity's features.
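The idea behind a correlated noise prior for inter-frame stability can be sketched in a few lines. This is a conceptual illustration, not the paper's exact formulation: the mixing weight `alpha` and the variance-preserving rescaling below are assumptions. Each frame's initial noise blends a base tensor shared by all frames with independent per-frame noise, so frames start the diffusion process from correlated latents.

```python
import numpy as np


def init_correlated_noise(num_frames, shape, alpha=0.5, seed=0):
    """Sample initial diffusion noise whose frames are correlated.

    alpha controls how much of each frame's noise comes from a
    base tensor shared across frames (alpha=0 gives fully
    independent frames; alpha=1 gives identical frames).
    """
    rng = np.random.default_rng(seed)
    base = rng.standard_normal(shape)                    # shared by all frames
    residual = rng.standard_normal((num_frames, *shape))  # per-frame noise
    # sqrt weights keep each element unit-variance Gaussian
    return np.sqrt(alpha) * base + np.sqrt(1.0 - alpha) * residual
```

With `alpha=0.5`, any two frames share half their noise variance, which nudges the sampler toward temporally consistent content without collapsing all frames to the same image.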
InteractiveVideo: User-Centric Controllable Video Generation with Synergistic Multimodal Instructions
We introduce $\textit{InteractiveVideo}$, a user-centric framework for video generation.
AnimateLCM: Accelerating the Animation of Personalized Diffusion Models and Adapters with Decoupled Consistency Learning
We validate the proposed strategy in image-conditioned video generation and layout-conditioned video generation, all achieving top-performing results.
DDMI: Domain-Agnostic Latent Diffusion Models for Synthesizing High-Quality Implicit Neural Representations
Arguably, this architecture limits the expressive power of generative models and results in low-quality INR generation.
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
Based on this stronger coupling, we shift the distribution to higher quality without motion degradation by finetuning spatial modules with high-quality images, resulting in a generic high-quality video model.