no code implementations • 5 Dec 2024 • Longtao Zheng, Yifan Zhang, Hanzhong Guo, Jiachun Pan, Zhenxiong Tan, Jiahao Lu, Chuanxin Tang, Bo An, Shuicheng Yan
Recent advances in video diffusion models have unlocked new potential for realistic audio-driven talking video generation.
2 code implementations • 22 Nov 2024 • Zhenxiong Tan, Songhua Liu, Xingyi Yang, Qiaochu Xue, Xinchao Wang
In this paper, we introduce OminiControl, a highly versatile and parameter-efficient framework that integrates image conditions into pre-trained Diffusion Transformer (DiT) models.
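The paper defines the framework precisely; as a hedged illustration of the general idea of parameter-efficient image conditioning in a DiT (PyTorch; all names are illustrative, not the actual OminiControl API), the sketch below appends condition-image tokens to the noisy latent tokens so the model's existing self-attention relates both in a single pass:

```python
import torch
import torch.nn as nn

class JointTokenAttention(nn.Module):
    """Toy DiT-style block: condition-image tokens are concatenated with
    noisy latent tokens so one shared self-attention relates the two."""
    def __init__(self, dim: int = 64, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, latent_tokens, cond_tokens):
        x = torch.cat([latent_tokens, cond_tokens], dim=1)  # one sequence
        h = self.norm(x)
        x = x + self.attn(h, h, h)[0]
        return x[:, : latent_tokens.shape[1]]  # latents continue denoising

block = JointTokenAttention()
latents = torch.randn(2, 256, 64)  # noisy image tokens
cond = torch.randn(2, 256, 64)     # encoded condition-image tokens
print(block(latents, cond).shape)  # torch.Size([2, 256, 64])
```

Reusing the pre-trained attention this way is what keeps the added parameter count small: only the condition tokens' encoding is new.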
1 code implementation • 3 Sep 2024 • Songhua Liu, Weihao Yu, Zhenxiong Tan, Xinchao Wang
Modern diffusion models, particularly those utilizing a Transformer-based UNet for denoising, rely heavily on self-attention operations to manage complex spatial relationships, thus achieving impressive generation performance.
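To make the cost concrete, here is a minimal, illustrative sketch (PyTorch) of naive self-attention over flattened spatial tokens; the (H·W) × (H·W) attention matrix is why this scales quadratically with resolution and motivates cheaper alternatives:

```python
import torch
import torch.nn.functional as F

def spatial_self_attention(x):
    """Naive self-attention over the flattened spatial positions of a
    feature map x of shape (B, C, H, W). The (H*W) x (H*W) score
    matrix is why cost grows quadratically with resolution."""
    B, C, H, W = x.shape
    tokens = x.flatten(2).transpose(1, 2)              # (B, H*W, C)
    scores = tokens @ tokens.transpose(1, 2) / C**0.5  # (B, H*W, H*W)
    out = F.softmax(scores, dim=-1) @ tokens           # (B, H*W, C)
    return out.transpose(1, 2).reshape(B, C, H, W)

x = torch.randn(1, 32, 64, 64)
print(spatial_self_attention(x).shape)  # 4096x4096 scores at just 64x64
```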
1 code implementation • 15 Jul 2024 • Zhenxiong Tan, Xinyin Ma, Gongfan Fang, Xinchao Wang
Latent diffusion models have shown promising results in audio generation, marking notable advances over traditional methods.
no code implementations • 24 Jun 2024 • Zhenxiong Tan, Xingyi Yang, Songhua Liu, Xinchao Wang
We propose two coherent mechanisms for distributed long-video inference: Clip parallelism and Dual-scope attention.
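The paper specifies both mechanisms in full; as a hedged sketch of the dual-scope idea only (PyTorch; names and shapes are illustrative, not the paper's code), each clip's tokens attend to their own clip plus a shared, downsampled global summary, keeping clips coherent without full cross-clip attention:

```python
import torch
import torch.nn as nn

class DualScopeAttention(nn.Module):
    """Illustrative only: a clip's frame tokens attend to their own clip
    (local scope) plus a shared, downsampled summary of all clips
    (global scope)."""
    def __init__(self, dim: int = 64, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, clip_tokens, global_summary):
        kv = torch.cat([clip_tokens, global_summary], dim=1)
        out, _ = self.attn(clip_tokens, kv, kv)
        return out

clips = [torch.randn(1, 16, 64) for _ in range(4)]   # 4 clips, 16 tokens each
global_summary = torch.stack(clips).mean(0)[:, ::4]  # coarse shared context
attn = DualScopeAttention()
outs = [attn(clip, global_summary) for clip in clips]  # one pass per device
print(outs[0].shape)  # torch.Size([1, 16, 64])
```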
2 code implementations • 11 Jun 2024 • Zigeng Chen, Xinyin Ma, Gongfan Fang, Zhenxiong Tan, Xinchao Wang
To address this, we introduce AsyncDiff, a universal and plug-and-play acceleration scheme that enables model parallelism across multiple devices.
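A minimal sketch of the asynchrony idea (PyTorch, single process; not the AsyncDiff code): each stage of a split denoiser consumes the activation its upstream stage produced at the *previous* timestep, so the stages no longer serialize within a step and could run concurrently on separate devices:

```python
import torch
import torch.nn as nn

# Hedged sketch: a denoiser split into three sequential stages. Normally
# stage i must wait for stage i-1 within the same step; here each stage
# instead reads the activation stage i-1 produced at the previous step.
stages = nn.ModuleList(nn.Linear(64, 64) for _ in range(3))
x = torch.randn(1, 64)                        # noisy latent
stale = [torch.zeros(1, 64) for _ in stages]  # last step's stage outputs

for step in range(4):                          # toy denoising loop
    inputs = [x] + stale[:-1]                  # stage i <- stale i-1 output
    stale = [s(inp) for s, inp in zip(stages, inputs)]  # all parallelizable
    x = x - 0.1 * stale[-1]                    # toy update with "noise" pred
```

The trade-off is that each stage sees slightly stale inputs, which the diffusion process's step-to-step similarity makes tolerable.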
1 code implementation • CVPR 2024 • Shizun Wang, Songhua Liu, Zhenxiong Tan, Xinchao Wang
Currently, brain decoding is confined to a per-subject-per-model paradigm: a decoding model applies only to the individual it was trained on.
no code implementations • 13 Nov 2023 • Zhenxiong Tan, Kaixin Wang, Xinchao Wang
We present C-Procgen, an enhanced suite of environments on top of the Procgen benchmark.
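For context, the underlying Procgen environments follow the standard Gym interface; a minimal loop with the public procgen package (not C-Procgen's extended configuration API, and assuming the pre-0.26 Gym API where step() returns a 4-tuple) looks like:

```python
import gym  # pip install procgen gym

# Plain Procgen usage; C-Procgen layers configurable contexts on top of this.
env = gym.make("procgen:procgen-coinrun-v0", num_levels=1, start_level=0)
obs = env.reset()
for _ in range(100):
    obs, reward, done, info = env.step(env.action_space.sample())
    if done:
        obs = env.reset()
```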
1 code implementation • CVPR 2020 • Chen Gao, Yunpeng Chen, Si Liu, Zhenxiong Tan, Shuicheng Yan
In this paper, we propose an AdversarialNAS method specially tailored for Generative Adversarial Networks (GANs) to search for a superior generative model on the task of unconditional image generation.
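A hedged sketch of the continuous-relaxation ingredient such a search typically builds on (PyTorch, DARTS-style mixed operations; illustrative, not the AdversarialNAS code): each edge mixes candidate ops with learnable weights, which in an adversarial search would be optimized against a discriminator's loss:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    """DARTS-style continuous relaxation: the output is a softmax-weighted
    sum of candidate ops; the weights (alphas) are the searched architecture."""
    def __init__(self, ch: int = 16):
        super().__init__()
        self.ops = nn.ModuleList([
            nn.Conv2d(ch, ch, 3, padding=1),
            nn.Conv2d(ch, ch, 5, padding=2),
            nn.Identity(),
        ])
        self.alpha = nn.Parameter(torch.zeros(len(self.ops)))

    def forward(self, x):
        w = F.softmax(self.alpha, dim=0)
        return sum(wi * op(x) for wi, op in zip(w, self.ops))

# In an AdversarialNAS-style search, a generator built from such cells would
# have its alphas optimized against a discriminator's adversarial loss.
x = torch.randn(2, 16, 8, 8)
print(MixedOp()(x).shape)  # torch.Size([2, 16, 8, 8])
```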