1 code implementation • 21 Mar 2024 • Zhicong Tang, Tiankai Hang, Shuyang Gu, Dong Chen, Baining Guo
This paper introduces a novel theoretical simplification of the Diffusion Schrödinger Bridge (DSB) that facilitates its unification with Score-based Generative Models (SGMs), addressing the limitations of DSB in complex data generation and enabling faster convergence and enhanced performance.
1 code implementation • 23 Jan 2024 • Tiankai Hang, Shuyang Gu, Dong Chen, Xin Geng, Baining Guo
This paper presents a novel generative model, Collaborative Competitive Agents (CCA), which leverages the capabilities of multiple Large Language Model (LLM)-based agents to execute complex tasks.
1 code implementation • 18 Dec 2023 • Zhicong Tang, Shuyang Gu, Chunyu Wang, Ting Zhang, Jianmin Bao, Dong Chen, Baining Guo
The 3D volumes are then trained on a diffusion model for text-to-3D generation using a 3D U-Net.
1 code implementation • 7 Sep 2023 • Zigang Geng, Binxin Yang, Tiankai Hang, Chen Li, Shuyang Gu, Ting Zhang, Jianmin Bao, Zheng Zhang, Han Hu, Dong Chen, Baining Guo
We present InstructDiffusion, a unifying and generic framework for aligning computer vision tasks with human instructions.
2 code implementations • ICCV 2023 • Tiankai Hang, Shuyang Gu, Chen Li, Jianmin Bao, Dong Chen, Han Hu, Xin Geng, Baining Guo
Denoising diffusion models have been a mainstream approach for image generation; however, training these models often suffers from slow convergence.
Ranked #2 on Image Generation on ImageNet 256x256
1 code implementation • 12 Dec 2022 • Xiaoyi Dong, Jianmin Bao, Ting Zhang, Dongdong Chen, Shuyang Gu, Weiming Zhang, Lu Yuan, Dong Chen, Fang Wen, Nenghai Yu
Recent studies have shown that CLIP has achieved remarkable success in performing zero-shot inference while its fine-tuning performance is not satisfactory.
no code implementations • CVPR 2023 • Tengfei Wang, Bo Zhang, Ting Zhang, Shuyang Gu, Jianmin Bao, Tadas Baltrusaitis, Jingjing Shen, Dong Chen, Fang Wen, Qifeng Chen, Baining Guo
This paper presents a 3D generative model that uses diffusion models to automatically generate 3D digital avatars represented as neural radiance fields.
2 code implementations • CVPR 2023 • Binxin Yang, Shuyang Gu, Bo Zhang, Ting Zhang, Xuejin Chen, Xiaoyan Sun, Dong Chen, Fang Wen
Language-guided image editing has achieved great success recently.
1 code implementation • 31 May 2022 • Zhicong Tang, Shuyang Gu, Jianmin Bao, Dong Chen, Fang Wen
When trained on ImageNet, we dramatically improve the FID score from 11.89 to 4.83, demonstrating the superiority of our proposed techniques.
1 code implementation • CVPR 2022 • BoWen Zhang, Shuyang Gu, Bo Zhang, Jianmin Bao, Dong Chen, Fang Wen, Yong Wang, Baining Guo
To this end, we believe that local attention is crucial to strike the balance between computational efficiency and modeling capacity.
Ranked #1 on Image Generation on CelebA 256x256 (FID metric)
2 code implementations • CVPR 2022 • Shuyang Gu, Dong Chen, Jianmin Bao, Fang Wen, Bo Zhang, Dongdong Chen, Lu Yuan, Baining Guo
Our experiments indicate that the VQ-Diffusion model with the reparameterization is fifteen times faster than traditional AR methods while achieving better image quality.
Ranked #1 on Text-to-Image Generation on Oxford 102 Flowers (using extra training data)
no code implementations • CVPR 2021 • Yue Gao, Fangyun Wei, Jianmin Bao, Shuyang Gu, Dong Chen, Fang Wen, Zhouhui Lian
However, we observe that the generator tends to find a tricky way to hide information from the original image to satisfy the constraint of cycle consistency, making it impossible to maintain the rich details (e.g., wrinkles and moles) of non-editing areas.
no code implementations • 22 Nov 2020 • Shuyang Gu, Jianmin Bao, Dong Chen
A key challenge in video enhancement and action recognition is to fuse useful information from neighboring frames.
1 code implementation • 30 Jun 2020 • Shuyang Gu, Jianmin Bao, Dong Chen, Fang Wen
To address these two issues, we propose a novel prior that captures the whole real data distribution for GANs, which are called PriorGANs.
1 code implementation • ECCV 2020 • Shuyang Gu, Jianmin Bao, Dong Chen, Fang Wen
Generative adversarial networks (GANs) have achieved impressive results today, but not all generated images are perfect.
no code implementations • CVPR 2019 • Shuyang Gu, Jianmin Bao, Hao Yang, Dong Chen, Fang Wen, Lu Yuan
Portrait editing is a popular subject in photo manipulation.
1 code implementation • CVPR 2018 • Shuyang Gu, Congliang Chen, Jing Liao, Lu Yuan
We theoretically prove that our new style loss based on reshuffle connects both global and local style losses respectively used by most parametric and non-parametric neural style transfer methods.