Search Results for author: Shuyang Gu

Found 17 papers, 13 papers with code

Simplified Diffusion Schrödinger Bridge

1 code implementation • 21 Mar 2024 • Zhicong Tang, Tiankai Hang, Shuyang Gu, Dong Chen, Baining Guo

This paper introduces a novel theoretical simplification of the Diffusion Schr\"odinger Bridge (DSB) that facilitates its unification with Score-based Generative Models (SGMs), addressing the limitations of DSB in complex data generation and enabling faster convergence and enhanced performance.

Paper
Code

CCA: Collaborative Competitive Agents for Image Editing

1 code implementation • 23 Jan 2024 • Tiankai Hang, Shuyang Gu, Dong Chen, Xin Geng, Baining Guo

This paper presents a novel generative model, Collaborative Competitive Agents (CCA), which leverages the capabilities of multiple Large Language Models (LLMs) based agents to execute complex tasks.

Paper
Code

VolumeDiffusion: Flexible Text-to-3D Generation with Efficient Volumetric Encoder

1 code implementation • 18 Dec 2023 • Zhicong Tang, Shuyang Gu, Chunyu Wang, Ting Zhang, Jianmin Bao, Dong Chen, Baining Guo

The 3D volumes are then trained on a diffusion model for text-to-3D generation using a 3D U-Net.

3D Generation Object +1

Paper
Code

InstructDiffusion: A Generalist Modeling Interface for Vision Tasks

1 code implementation • 7 Sep 2023 • Zigang Geng, Binxin Yang, Tiankai Hang, Chen Li, Shuyang Gu, Ting Zhang, Jianmin Bao, Zheng Zhang, Han Hu, Dong Chen, Baining Guo

We present InstructDiffusion, a unifying and generic framework for aligning computer vision tasks with human instructions.

Keypoint Detection

350

Paper
Code

Efficient Diffusion Training via Min-SNR Weighting Strategy

2 code implementations • ICCV 2023 • Tiankai Hang, Shuyang Gu, Chen Li, Jianmin Bao, Dong Chen, Han Hu, Xin Geng, Baining Guo

Denoising diffusion models have been a mainstream approach for image generation, however, training these models often suffers from slow convergence.

Ranked #2 on Image Generation on ImageNet 256x256

Denoising Image Generation +2

183

Paper
Code

CLIP Itself is a Strong Fine-tuner: Achieving 85.7% and 88.0% Top-1 Accuracy with ViT-B and ViT-L on ImageNet

1 code implementation • 12 Dec 2022 • Xiaoyi Dong, Jianmin Bao, Ting Zhang, Dongdong Chen, Shuyang Gu, Weiming Zhang, Lu Yuan, Dong Chen, Fang Wen, Nenghai Yu

Recent studies have shown that CLIP has achieved remarkable success in performing zero-shot inference while its fine-tuning performance is not satisfactory.

192

Paper
Code

Rodin: A Generative Model for Sculpting 3D Digital Avatars Using Diffusion

no code implementations • CVPR 2023 • Tengfei Wang, Bo Zhang, Ting Zhang, Shuyang Gu, Jianmin Bao, Tadas Baltrusaitis, Jingjing Shen, Dong Chen, Fang Wen, Qifeng Chen, Baining Guo

This paper presents a 3D generative model that uses diffusion models to automatically generate 3D digital avatars represented as neural radiance fields.

Computational Efficiency

Paper
Add Code

Paint by Example: Exemplar-based Image Editing with Diffusion Models

2 code implementations • CVPR 2023 • Binxin Yang, Shuyang Gu, Bo Zhang, Ting Zhang, Xuejin Chen, Xiaoyan Sun, Dong Chen, Fang Wen

Language-guided image editing has achieved great success recently.

Image Generation Image Manipulation

998

Paper
Code

Improved Vector Quantized Diffusion Models

1 code implementation • 31 May 2022 • Zhicong Tang, Shuyang Gu, Jianmin Bao, Dong Chen, Fang Wen

When trained on ImageNet, we dramatically improve the FID score from 11. 89 to 4. 83, demonstrating the superiority of our proposed techniques.

Denoising Image Generation

849

Paper
Code

StyleSwin: Transformer-based GAN for High-resolution Image Generation

1 code implementation • CVPR 2022 • BoWen Zhang, Shuyang Gu, Bo Zhang, Jianmin Bao, Dong Chen, Fang Wen, Yong Wang, Baining Guo

To this end, we believe that local attention is crucial to strike the balance between computational efficiency and modeling capacity.

Ranked #1 on Image Generation on CelebA 256x256 (FID metric)

Blocking Computational Efficiency +3

484

Paper
Code

Vector Quantized Diffusion Model for Text-to-Image Synthesis

2 code implementations • CVPR 2022 • Shuyang Gu, Dong Chen, Jianmin Bao, Fang Wen, Bo Zhang, Dongdong Chen, Lu Yuan, Baining Guo

Our experiments indicate that the VQ-Diffusion model with the reparameterization is fifteen times faster than traditional AR methods while achieving a better image quality.

Ranked #1 on Text-to-Image Generation on Oxford 102 Flowers (using extra training data)

Denoising Text-to-Image Generation

849

Paper
Code

High-Fidelity and Arbitrary Face Editing

no code implementations • CVPR 2021 • Yue Gao, Fangyun Wei, Jianmin Bao, Shuyang Gu, Dong Chen, Fang Wen, Zhouhui Lian

However, we observe that the generator tends to find a tricky way to hide information from the original image to satisfy the constraint of cycle consistency, making it impossible to maintain the rich details (e. g., wrinkles and moles) of non-editing areas.

Attribute Vocal Bursts Intensity Prediction

Paper
Add Code

Learnable Sampling 3D Convolution for Video Enhancement and Action Recognition

no code implementations • 22 Nov 2020 • Shuyang Gu, Jianmin Bao, Dong Chen

A key challenge in video enhancement and action recognition is to fuse useful information from neighboring frames.

Action Recognition Denoising +3

Paper
Add Code

PriorGAN: Real Data Prior for Generative Adversarial Nets

1 code implementation • 30 Jun 2020 • Shuyang Gu, Jianmin Bao, Dong Chen, Fang Wen

To address these two issues, we propose a novel prior that captures the whole real data distribution for GANs, which are called PriorGANs.

212

Paper
Code

GIQA: Generated Image Quality Assessment

1 code implementation • ECCV 2020 • Shuyang Gu, Jianmin Bao, Dong Chen, Fang Wen

Generative adversarial networks (GANs) have achieved impressive results today, but not all generated images are perfect.

Image Quality Assessment

212

Paper
Code

Mask-Guided Portrait Editing with Conditional GANs

no code implementations • CVPR 2019 • Shuyang Gu, Jianmin Bao, Hao Yang, Dong Chen, Fang Wen, Lu Yuan

Portrait editing is a popular subject in photo manipulation.

Data Augmentation Face Generation +3

Paper
Add Code

Arbitrary Style Transfer with Deep Feature Reshuffle

1 code implementation • CVPR 2018 • Shuyang Gu, Congliang Chen, Jing Liao, Lu Yuan

We theoretically prove that our new style loss based on reshuffle connects both global and local style losses respectively used by most parametric and non-parametric neural style transfer methods.

Style Transfer

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.