Video Generation
239 papers with code • 15 benchmarks • 14 datasets
(Various video generation tasks. GIF credit: MaGViT)
Latest papers
DSP: Dynamic Sequence Parallelism for Multi-Dimensional Transformers
Scaling large models with long sequences across applications like language generation, video generation and multimodal tasks requires efficient sequence parallelism.
Follow-Your-Click: Open-domain Regional Image Animation via Short Prompts
Despite recent advances in image-to-video generation, better controllability and local animation are less explored.
DragAnything: Motion Control for Anything using Entity Representation
We introduce DragAnything, which utilizes an entity representation to achieve motion control for any object in controllable video generation.
SSM Meets Video Diffusion Models: Efficient Video Generation with Structured State Spaces
In the experiments, we first evaluate our SSM-based model with UCF101, a standard benchmark of video generation.
VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models
In this paper, we introduce VidProM, the first large-scale dataset comprising 1.67 million unique text-to-video prompts from real users.
VideoElevator: Elevating Video Generation Quality with Versatile Text-to-Image Diffusion Models
Different from conventional T2V sampling (i.e., temporal and spatial modeling), VideoElevator explicitly decomposes each sampling step into temporal motion refining and spatial quality elevating.
UniCtrl: Improving the Spatiotemporal Consistency of Text-to-Video Diffusion Models via Training-Free Unified Attention Control
Video Diffusion Models have been developed for video generation, usually integrating text and image conditioning to enhance control over the generated content.
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
Sora is a text-to-video generative AI model, released by OpenAI in February 2024.
VGMShield: Mitigating Misuse of Video Generative Models
Together with fake video detection and tracing, our multi-faceted set of solutions can effectively mitigate misuse of video generative models.
Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation
Diffusion models have proven to be highly effective in image and video generation; however, they still face composition challenges when generating images of varying sizes due to single-scale training data.