no code implementations • 5 Jan 2025 • Weikang Bian, Zhaoyang Huang, Xiaoyu Shi, Yijin Li, Fu-Yun Wang, Hongsheng Li
Specifically, we propose a novel framework that constructs a pseudo 4D Gaussian field with dense 3D point tracking and renders the Gaussian field for all video frames.
no code implementations • 27 Oct 2024 • Yijin Li, Yichen Shen, Zhaoyang Huang, Shuo Chen, Weikang Bian, Xiaoyu Shi, Fu-Yun Wang, Keqiang Sun, Hujun Bao, Zhaopeng Cui, Guofeng Zhang, Hongsheng Li
BlinkVision enables extensive benchmarks on three types of correspondence tasks (optical flow, point tracking, and scene flow estimation) for both image-based and event-based methods, offering new observations, practices, and insights for future research.
1 code implementation • 24 Oct 2024 • Fu-Yun Wang, Zhengyang Geng, Hongsheng Li
Diffusion models achieve superior generation quality but suffer from slow generation speed due to the iterative nature of denoising.
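The slow-sampling issue stems from the step-by-step denoising loop: each step costs one full network evaluation. A minimal sketch of such a loop, using a toy stand-in for the trained noise predictor (the function name and step rule here are illustrative, not the paper's method):

```python
import numpy as np

def toy_denoiser(x, t):
    # Stand-in for a trained noise-prediction network eps_theta(x, t).
    return 0.1 * x

def iterative_sample(num_steps=50, dim=4, seed=0):
    """Toy denoising loop: one network call per step, which is why
    diffusion sampling is slow and why few-step methods matter."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(dim)          # start from pure Gaussian noise
    for t in reversed(range(num_steps)):  # denoise step by step
        eps = toy_denoiser(x, t)          # one network evaluation per step
        x = x - eps / num_steps           # small step toward the data manifold
    return x

sample = iterative_sample()
```

Reducing `num_steps` trades generation quality for speed, which is the trade-off few-step distillation methods aim to eliminate.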
Ranked #9 on Image Generation on ImageNet 64x64
no code implementations • 9 Oct 2024 • Bohan Zeng, Ling Yang, Siyu Li, Jiaming Liu, Zixiang Zhang, Juanxi Tian, Kaixin Zhu, Yongzhen Guo, Fu-Yun Wang, Minkai Xu, Stefano Ermon, Wentao Zhang
Then we propose a geometry-aware 4D transition network to realize a complex scene-level 4D transition based on the plan, which involves expressive geometrical object deformation.
1 code implementation • 9 Oct 2024 • Fu-Yun Wang, Ling Yang, Zhaoyang Huang, Mengdi Wang, Hongsheng Li
Building on this insight, we propose Rectified Diffusion, which generalizes the design space and application scope of rectification to encompass the broader category of diffusion models, rather than being restricted to flow-matching models.
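For background, the rectification idea originates in rectified flow, which trains a model to regress the constant velocity of a straight line between a noise sample and a data sample. A minimal sketch of that standard training target (not the paper's generalized formulation):

```python
import numpy as np

def rectified_flow_pair(x0, x1, t):
    """Rectified-flow training target: a point on the straight line
    between noise x0 and data x1, with constant velocity x1 - x0."""
    x_t = (1.0 - t) * x0 + t * x1   # linear interpolation at time t
    v_target = x1 - x0              # straight-line velocity the model regresses
    return x_t, v_target

x0 = np.zeros(3)                    # noise sample
x1 = np.array([1.0, 2.0, 3.0])      # data sample
x_t, v = rectified_flow_pair(x0, x1, t=0.5)
# x_t is the midpoint [0.5, 1.0, 1.5]; v is [1.0, 2.0, 3.0]
```

Straight trajectories can be integrated accurately in very few ODE steps, which is what makes rectification attractive for fast sampling.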
no code implementations • 17 Sep 2024 • Xiaofeng Mao, Zhengkai Jiang, Fu-Yun Wang, Wenbing Zhu, Jiangning Zhang, Hao Chen, Mingmin Chi, Yabiao Wang
Video diffusion models have shown great potential in generating high-quality videos, making them an increasingly popular research focus.
1 code implementation • 5 Jun 2024 • Le Zhuo, Ruoyi Du, Han Xiao, Yangguang Li, Dongyang Liu, Rongjie Huang, Wenze Liu, Lirui Zhao, Fu-Yun Wang, Zhanyu Ma, Xu Luo, Zehan Wang, Kaipeng Zhang, Xiangyang Zhu, Si Liu, Xiangyu Yue, Dingning Liu, Wanli Ouyang, Ziwei Liu, Yu Qiao, Hongsheng Li, Peng Gao
Lumina-T2X is a nascent family of Flow-based Large Diffusion Transformers that establishes a unified framework for transforming noise into various modalities, such as images and videos, conditioned on text instructions.
1 code implementation • 28 May 2024 • Fu-Yun Wang, Zhaoyang Huang, Alexander William Bergman, Dazhong Shen, Peng Gao, Michael Lingelbach, Keqiang Sun, Weikang Bian, Guanglu Song, Yu Liu, Xiaogang Wang, Hongsheng Li
In this paper, we identify three key flaws in the current design of Latent Consistency Models (LCMs).
1 code implementation • CVPR 2024 • Dazhong Shen, Guanglu Song, Zeyue Xue, Fu-Yun Wang, Yu Liu
Classifier-Free Guidance (CFG) has been widely used in text-to-image diffusion models, where the CFG scale is introduced to control the strength of text guidance on the whole image space.
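The standard CFG combination extrapolates from the unconditional noise prediction toward the text-conditional one by a single global scale; a minimal sketch (the `scale=7.5` default is a common choice in practice, not a value from this paper):

```python
import numpy as np

def classifier_free_guidance(eps_uncond, eps_cond, scale):
    """Standard CFG: push the prediction from the unconditional estimate
    toward the conditional one, amplified by the guidance scale."""
    return eps_uncond + scale * (eps_cond - eps_uncond)

eps_u = np.array([0.1, 0.2])   # unconditional noise prediction
eps_c = np.array([0.3, 0.0])   # text-conditional noise prediction
guided = classifier_free_guidance(eps_u, eps_c, scale=7.5)
# guided == [1.6, -1.3]
```

Because one scalar is applied uniformly across the whole image, spatially varying the guidance strength is a natural refinement direction.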
1 code implementation • 20 Mar 2024 • Fu-Yun Wang, Xiaoshi Wu, Zhaoyang Huang, Xiaoyu Shi, Dazhong Shen, Guanglu Song, Yu Liu, Hongsheng Li
We introduce MOTIA (Mastering Video Outpainting Through Input-Specific Adaptation), a diffusion-based pipeline that leverages both the intrinsic data-specific patterns of the source video and the image/video generative prior for effective outpainting.
1 code implementation • 1 Feb 2024 • Fu-Yun Wang, Zhaoyang Huang, Weikang Bian, Xiaoyu Shi, Keqiang Sun, Guanglu Song, Yu Liu, Hongsheng Li
This paper introduces an effective method for computation-efficient personalized style video generation without requiring access to any personalized video data.
no code implementations • 29 Jan 2024 • Xiaoyu Shi, Zhaoyang Huang, Fu-Yun Wang, Weikang Bian, Dasong Li, Yi Zhang, Manyuan Zhang, Ka Chun Cheung, Simon See, Hongwei Qin, Jifeng Dai, Hongsheng Li
For the first stage, we propose a diffusion-based motion field predictor, which focuses on deducing the trajectories of the reference image's pixels.
1 code implementation • 29 May 2023 • Fu-Yun Wang, Wenshuo Chen, Guanglu Song, Han-Jia Ye, Yu Liu, Hongsheng Li
To address this challenge, we introduce a novel paradigm, dubbed Gen-L-Video, which extends off-the-shelf short video diffusion models to generate and edit videos of hundreds of frames with diverse semantic segments, without additional training and while preserving content consistency.
2 code implementations • 10 Apr 2022 • Fu-Yun Wang, Da-Wei Zhou, Han-Jia Ye, De-Chuan Zhan
The ability to learn new concepts continually is necessary in this ever-changing world.
Ranked #1 on Incremental Learning on ImageNet100 - 20 steps
1 code implementation • CVPR 2022 • Da-Wei Zhou, Fu-Yun Wang, Han-Jia Ye, Liang Ma, ShiLiang Pu, De-Chuan Zhan
Forward compatibility requires future new classes to be easily incorporated into the current model based on the current stage data, and we seek to realize it by reserving embedding space for future new classes.
Ranked #6 on Few-Shot Class-Incremental Learning on CIFAR-100
1 code implementation • 23 Dec 2021 • Da-Wei Zhou, Fu-Yun Wang, Han-Jia Ye, De-Chuan Zhan
Traditional machine learning systems are deployed under the closed-world setting, which requires the entire training data before the offline training process.