Search Results for author: Shuyang Gu

Found 19 papers, 13 papers with code

FontStudio: Shape-Adaptive Diffusion Model for Coherent and Consistent Font Effect Generation

no code implementations12 Jun 2024 Xinzhi Mu, Li Chen, Bohan Chen, Shuyang Gu, Jianmin Bao, Dong Chen, Ji Li, Yuhui Yuan

This task essentially requires generating coherent and consistent visual content within the confines of a font-shaped canvas, as opposed to a traditional rectangular canvas.

Text-to-Image Generation

Step-aware Preference Optimization: Aligning Preference with Denoising Performance at Each Step

no code implementations6 Jun 2024 Zhanhao Liang, Yuhui Yuan, Shuyang Gu, Bohan Chen, Tiankai Hang, Ji Li, Liang Zheng

To address this limitation, we propose Step-aware Preference Optimization (SPO), a novel post-training approach that independently evaluates and adjusts denoising performance at each step. SPO uses a step-aware preference model and a step-wise resampler to ensure accurate step-aware supervision.


Simplified Diffusion Schrödinger Bridge

1 code implementation21 Mar 2024 Zhicong Tang, Tiankai Hang, Shuyang Gu, Dong Chen, Baining Guo

This paper introduces a novel theoretical simplification of the Diffusion Schrödinger Bridge (DSB) that facilitates its unification with Score-based Generative Models (SGMs), addressing the limitations of DSB in complex data generation and enabling faster convergence and enhanced performance.

CCA: Collaborative Competitive Agents for Image Editing

1 code implementation23 Jan 2024 Tiankai Hang, Shuyang Gu, Dong Chen, Xin Geng, Baining Guo

This paper presents a novel generative model, Collaborative Competitive Agents (CCA), which leverages the capabilities of multiple Large Language Models (LLMs) based agents to execute complex tasks.

Efficient Diffusion Training via Min-SNR Weighting Strategy

2 code implementations ICCV 2023 Tiankai Hang, Shuyang Gu, Chen Li, Jianmin Bao, Dong Chen, Han Hu, Xin Geng, Baining Guo

Denoising diffusion models have been a mainstream approach for image generation; however, training these models often suffers from slow convergence.

Denoising Image Generation +2
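The abstract above describes clamping per-timestep loss weights by signal-to-noise ratio. A minimal sketch of that idea, assuming epsilon-prediction with weight min(SNR, gamma)/SNR; the linear-beta schedule and gamma=5.0 here are illustrative choices, not necessarily the paper's exact setup:

```python
import numpy as np

def min_snr_weights(alphas_cumprod, gamma=5.0):
    """Per-timestep loss weights for epsilon-prediction: min(SNR_t, gamma) / SNR_t."""
    # SNR_t = abar_t / (1 - abar_t) for a variance-preserving diffusion schedule.
    snr = alphas_cumprod / (1.0 - alphas_cumprod)
    # Clamping at gamma down-weights easy (high-SNR, low-noise) timesteps so
    # they do not dominate training; very noisy timesteps keep weight 1.
    return np.minimum(snr, gamma) / snr

# Toy linear-beta schedule just to produce a plausible alphas_cumprod.
betas = np.linspace(1e-4, 2e-2, 1000)
abar = np.cumprod(1.0 - betas)
weights = min_snr_weights(abar)
```

In practice these weights would multiply the per-timestep denoising loss before averaging over the batch.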

Rodin: A Generative Model for Sculpting 3D Digital Avatars Using Diffusion

no code implementations CVPR 2023 Tengfei Wang, Bo Zhang, Ting Zhang, Shuyang Gu, Jianmin Bao, Tadas Baltrusaitis, Jingjing Shen, Dong Chen, Fang Wen, Qifeng Chen, Baining Guo

This paper presents a 3D generative model that uses diffusion models to automatically generate 3D digital avatars represented as neural radiance fields.

Computational Efficiency

CLIP Itself is a Strong Fine-tuner: Achieving 85.7% and 88.0% Top-1 Accuracy with ViT-B and ViT-L on ImageNet

1 code implementation12 Dec 2022 Xiaoyi Dong, Jianmin Bao, Ting Zhang, Dongdong Chen, Shuyang Gu, Weiming Zhang, Lu Yuan, Dong Chen, Fang Wen, Nenghai Yu

Recent studies have shown that CLIP achieves remarkable success in zero-shot inference, while its fine-tuning performance is considered unsatisfactory.

Improved Vector Quantized Diffusion Models

1 code implementation31 May 2022 Zhicong Tang, Shuyang Gu, Jianmin Bao, Dong Chen, Fang Wen

When trained on ImageNet, we dramatically improve the FID score from 11.89 to 4.83, demonstrating the superiority of our proposed techniques.

Denoising Image Generation

StyleSwin: Transformer-based GAN for High-resolution Image Generation

1 code implementation CVPR 2022 BoWen Zhang, Shuyang Gu, Bo Zhang, Jianmin Bao, Dong Chen, Fang Wen, Yong Wang, Baining Guo

To this end, we believe that local attention is crucial to strike the balance between computational efficiency and modeling capacity.

Ranked #1 on Image Generation on CelebA 256x256 (FID metric)

Blocking Computational Efficiency +3

Vector Quantized Diffusion Model for Text-to-Image Synthesis

2 code implementations CVPR 2022 Shuyang Gu, Dong Chen, Jianmin Bao, Fang Wen, Bo Zhang, Dongdong Chen, Lu Yuan, Baining Guo

Our experiments indicate that the VQ-Diffusion model with the reparameterization is fifteen times faster than traditional AR methods while achieving a better image quality.

Ranked #1 on Text-to-Image Generation on Oxford 102 Flowers (using extra training data)

Denoising Text-to-Image Generation

High-Fidelity and Arbitrary Face Editing

no code implementations CVPR 2021 Yue Gao, Fangyun Wei, Jianmin Bao, Shuyang Gu, Dong Chen, Fang Wen, Zhouhui Lian

However, we observe that the generator tends to find a tricky way to hide information from the original image to satisfy the constraint of cycle consistency, making it impossible to maintain the rich details (e.g., wrinkles and moles) of non-editing areas.

Attribute Vocal Bursts Intensity Prediction

Learnable Sampling 3D Convolution for Video Enhancement and Action Recognition

no code implementations22 Nov 2020 Shuyang Gu, Jianmin Bao, Dong Chen

A key challenge in video enhancement and action recognition is to fuse useful information from neighboring frames.

Action Recognition Denoising +3

PriorGAN: Real Data Prior for Generative Adversarial Nets

1 code implementation30 Jun 2020 Shuyang Gu, Jianmin Bao, Dong Chen, Fang Wen

To address these two issues, we propose a novel prior that captures the whole real data distribution for GANs; we call the resulting models PriorGANs.

GIQA: Generated Image Quality Assessment

1 code implementation ECCV 2020 Shuyang Gu, Jianmin Bao, Dong Chen, Fang Wen

Generative adversarial networks (GANs) have achieved impressive results today, but not all generated images are perfect.

Image Quality Assessment

Arbitrary Style Transfer with Deep Feature Reshuffle

1 code implementation CVPR 2018 Shuyang Gu, Congliang Chen, Jing Liao, Lu Yuan

We theoretically prove that our new style loss based on reshuffle connects both global and local style losses respectively used by most parametric and non-parametric neural style transfer methods.

Style Transfer
