1 code implementation • 18 Mar 2024 • Jianzhi Liu, Junchen Zhu, Lianli Gao, Jingkuan Song
Recent large-scale video datasets have facilitated the generation of diverse open-domain videos by Video Diffusion Models (VDMs).
1 code implementation • 13 Mar 2024 • Cheng Chen, Junchen Zhu, Xu Luo, HengTao Shen, Lianli Gao, Jingkuan Song
To this end, we introduce MoELoRA to MLLMs, which effectively retains the previous instruction alignment.
no code implementations • 17 Jan 2024 • Jiaqi Guo, Sitong Su, Junchen Zhu, Lianli Gao, Jingkuan Song
Therefore, we propose a training-free pipeline employing a pre-trained diffusion model imbued with semantic prior knowledge, which can process composite videos with broader semantic disparities.
1 code implementation • 25 Nov 2023 • Chen Cheng, Jingkuan Song, Xiaosu Zhu, Junchen Zhu, Lianli Gao, HengTao Shen
To address this issue, after analyzing the phenomenon and identifying the lack of diversity as a vital factor, we propose a method named Codebook for Unsupervised Continual Learning (CUCL), which encourages the model to learn discriminative features that complete the class boundary.
no code implementations • 31 Jul 2023 • Junchen Zhu, Huan Yang, Wenjing Wang, Huiguo He, Zixi Tuo, Yongsheng Yu, Wen-Huang Cheng, Lianli Gao, Jingkuan Song, Jianlong Fu, Jiebo Luo
In the basic generation, we take advantage of the pretrained image diffusion model and adapt it into a high-quality open-domain vertical video generator for mobile devices.
no code implementations • 12 Jun 2023 • Junchen Zhu, Huan Yang, Huiguo He, Wenjing Wang, Zixi Tuo, Wen-Huang Cheng, Lianli Gao, Jingkuan Song, Jianlong Fu
To generate videos, we extend the capabilities of a pretrained text-to-image diffusion model through a two-stage process.
1 code implementation • 18 May 2023 • Wenjing Wang, Huan Yang, Zixi Tuo, Huiguo He, Junchen Zhu, Jianlong Fu, Jiaying Liu
Moreover, to fully unlock model capabilities for high-quality video generation and promote the development of the field, we curate a large-scale and open-source video dataset called HD-VG-130M.
Ranked #1 on Text-to-Video Generation on WebVid