Text-to-Video Editing

Most implemented papers

FateZero: Fusing Attentions for Zero-shot Text-based Video Editing

chenyangqiqi/fatezero ICCV 2023

We also have a better zero-shot shape-aware editing ability based on the text-to-video model.

ControlVideo: Conditional Control for One-shot Text-driven Video Editing and Beyond

thu-ml/controlvideo 26 May 2023

This paper presents ControlVideo for text-driven video editing: generating a video that aligns with a given text while preserving the structure of the source video.

Gen-L-Video: Multi-Text to Long Video Generation via Temporal Co-Denoising

g-u-n/gen-l-video 29 May 2023

To address this challenge, we introduce a novel paradigm dubbed Gen-L-Video, capable of extending off-the-shelf short video diffusion models to generate and edit videos comprising hundreds of frames with diverse semantic segments, without additional training and while preserving content consistency.

Cross-Modal Contextualized Diffusion Models for Text-Guided Visual Generation and Editing

yangling0818/contextdiff 26 Feb 2024

To address this issue, we propose a novel and general contextualized diffusion model (ContextDiff) by incorporating the cross-modal context encompassing interactions and alignments between text condition and visual sample into forward and reverse processes.