Search Results for author: Debang Li

Found 11 papers, 10 papers with code

SkyReels-Audio: Omni Audio-Conditioned Talking Portraits in Video Diffusion Transformers

3 code implementations · 1 Jun 2025 · Zhengcong Fei, Hao Jiang, Di Qiu, Baoxuan Gu, Youqiang Zhang, Jiahua Wang, Jialin Bai, Debang Li, Mingyuan Fan, Guibin Chen, Yahui Zhou

The generation and editing of audio-conditioned talking portraits guided by multimodal inputs, including text, images, and videos, remain underexplored.

Denoising

SkyReels-V2: Infinite-length Film Generative Model

1 code implementation · 17 Apr 2025 · Guibin Chen, Dixuan Lin, Jiangping Yang, Chunze Lin, Junchen Zhu, Mingyuan Fan, Hao Zhang, Sheng Chen, Zheng Chen, Chengcheng Ma, Weiming Xiong, Wei Wang, Nuo Pang, Kang Kang, Zhiheng Xu, Yuzhe Jin, Yupeng Liang, Yubing Song, Peng Zhao, Boyuan Xu, Di Qiu, Debang Li, Zhengcong Fei, Yang Li, Yahui Zhou

Recent advances in video generation have been driven by diffusion models and autoregressive frameworks, yet critical challenges persist in harmonizing prompt adherence, visual quality, motion dynamics, and duration: motion dynamics are compromised to enhance temporal visual quality, video duration is constrained (5-10 seconds) to prioritize resolution, and shot-aware generation is inadequate because general-purpose MLLMs cannot interpret cinematic grammar such as shot composition, actor expressions, and camera motions.

Large Language Model +2

SkyReels-A2: Compose Anything in Video Diffusion Transformers

1 code implementation · 3 Apr 2025 · Zhengcong Fei, Debang Li, Di Qiu, Jiahua Wang, Yikun Dou, Rui Wang, Jingtao Xu, Mingyuan Fan, Guibin Chen, Yang Li, Yahui Zhou

This paper presents SkyReels-A2, a controllable video generation framework capable of assembling arbitrary visual elements (e.g., characters, objects, backgrounds) into synthesized videos based on textual prompts while maintaining strict consistency with reference images for each element.

Video Generation

Ingredients: Blending Custom Photos with Video Diffusion Transformers

1 code implementation · 3 Jan 2025 · Zhengcong Fei, Debang Li, Di Qiu, Changqian Yu, Mingyuan Fan

This paper presents a powerful framework, referred to as Ingredients, for customizing video creations by incorporating multiple specific identity (ID) photos with video diffusion Transformers.

Video Diffusion Transformers are In-Context Learners

1 code implementation · 14 Dec 2024 · Zhengcong Fei, Di Qiu, Changqian Yu, Debang Li, Mingyuan Fan, Xiang Wen

This paper investigates a solution for enabling in-context capabilities of video diffusion transformers, with minimal tuning required for activation.

Video Generation

Scaling Diffusion Transformers to 16 Billion Parameters

1 code implementation · 16 Jul 2024 · Zhengcong Fei, Mingyuan Fan, Changqian Yu, Debang Li, Junshi Huang

In this paper, we present DiT-MoE, a sparse version of the diffusion Transformer that is scalable and competitive with dense networks while exhibiting highly optimized inference.

Attribute · Conditional Image Generation +2

Dimba: Transformer-Mamba Diffusion Models

no code implementations · 3 Jun 2024 · Zhengcong Fei, Mingyuan Fan, Changqian Yu, Debang Li, Youqiang Zhang, Junshi Huang

This paper unveils Dimba, a new text-to-image diffusion model that employs a distinctive hybrid architecture combining Transformer and Mamba elements.

Mamba · Text to Image Generation +1

Composing Good Shots by Exploiting Mutual Relations

1 code implementation · CVPR 2020 · Debang Li, Junge Zhang, Kaiqi Huang, Ming-Hsuan Yang

However, the mutual relations between the candidates from an image play an essential role in composing a good shot due to the comparative nature of this problem.

Data Augmentation

Learning to Learn Cropping Models for Different Aspect Ratio Requirements

1 code implementation · CVPR 2020 · Debang Li, Junge Zhang, Kaiqi Huang

In addition, both the intermediate and final results show that the proposed model can predict different cropping windows for an image depending on different aspect ratio requirements.

Image Cropping · Meta-Learning
