TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Image Generation	ImageNet 256x256	GigaGAN	FID	3.45	# 18
Text-to-Image Generation	MS COCO	GigaGAN (Zero-shot, 64x64)	FID	7.28	# 18
Text-to-Image Generation	MS COCO	GigaGAN (Zero-shot, 256x256)	FID	9.09	# 24

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/scaling-up-gans-for-text-to-image-synthesis/image-generation-on-imagenet-256x256)](https://paperswithcode.com/sota/image-generation-on-imagenet-256x256?p=scaling-up-gans-for-text-to-image-synthesis)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/scaling-up-gans-for-text-to-image-synthesis/text-to-image-generation-on-coco)](https://paperswithcode.com/sota/text-to-image-generation-on-coco?p=scaling-up-gans-for-text-to-image-synthesis)`

Scaling up GANs for Text-to-Image Synthesis

CVPR 2023 · Minguk Kang, Jun-Yan Zhu, Richard Zhang, Jaesik Park, Eli Shechtman, Sylvain Paris, Taesung Park

The recent success of text-to-image synthesis has taken the world by storm and captured the general public's imagination. From a technical standpoint, it also marked a drastic change in the favored architecture for designing generative image models. GANs used to be the de facto choice, with techniques like StyleGAN. With DALL-E 2, auto-regressive and diffusion models became the new standard for large-scale generative models overnight. This rapid shift raises a fundamental question: can we scale up GANs to benefit from large datasets like LAION? We find that naïvely increasing the capacity of the StyleGAN architecture quickly becomes unstable. We introduce GigaGAN, a new GAN architecture that far exceeds this limit, demonstrating GANs as a viable option for text-to-image synthesis. GigaGAN offers three major advantages. First, it is orders of magnitude faster at inference time, taking only 0.13 seconds to synthesize a 512px image. Second, it can synthesize high-resolution images, for example, a 16-megapixel image in 3.66 seconds. Finally, GigaGAN supports various latent space editing applications such as latent interpolation, style mixing, and vector arithmetic operations.
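The latent space editing operations named in the abstract (latent interpolation, style mixing, and vector arithmetic) can be illustrated on plain latent vectors. The following is a minimal NumPy sketch, not GigaGAN's implementation: the function names, the 128-dimensional latent size, the 10-layer style stack, and the crossover convention for style mixing (borrowed from the usual StyleGAN-style description) are all assumptions made for illustration, and no generator is involved.

```python
import numpy as np

def lerp(z0, z1, t):
    """Linear interpolation between two latent codes."""
    return (1.0 - t) * z0 + t * z1

def slerp(z0, z1, t):
    """Spherical interpolation, often preferred for Gaussian latents."""
    z0n = z0 / np.linalg.norm(z0)
    z1n = z1 / np.linalg.norm(z1)
    omega = np.arccos(np.clip(np.dot(z0n, z1n), -1.0, 1.0))
    if np.isclose(omega, 0.0):
        # Nearly parallel codes: fall back to linear interpolation.
        return lerp(z0, z1, t)
    return (np.sin((1.0 - t) * omega) * z0 + np.sin(t * omega) * z1) / np.sin(omega)

def style_mix(w_a, w_b, crossover):
    """Hypothetical per-layer mixing: layers before `crossover` come from
    w_a (coarse structure), the remaining layers from w_b (fine detail).
    Both inputs have shape (num_layers, dim)."""
    return np.concatenate([w_a[:crossover], w_b[crossover:]], axis=0)

def apply_direction(w, w_with, w_without, strength=1.0):
    """Vector arithmetic: shift w along the direction (w_with - w_without)."""
    return w + strength * (w_with - w_without)

rng = np.random.default_rng(0)
z_a, z_b = rng.standard_normal((2, 128))           # assumed latent size
path = [slerp(z_a, z_b, t) for t in np.linspace(0.0, 1.0, 5)]

w_a, w_b = rng.standard_normal((2, 10, 128))       # assumed 10 style layers
mixed = style_mix(w_a, w_b, crossover=4)
edited = apply_direction(z_a, z_b, z_a, strength=0.5)
```

In practice each code in `path`, `mixed`, or `edited` would be fed to a trained generator to render an image; that step is omitted here because it depends on the specific model and checkpoint.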


Code

lucidrains/gigagan-pytorch

1,587 stars

Tasks

Image Generation

Text-to-Image Generation

Datasets

ImageNet

MS COCO

Results from the Paper

Ranked #18 on Image Generation on ImageNet 256x256
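The FID values reported for GigaGAN are Fréchet distances between Gaussian fits to feature statistics of real and generated images (lower is better). The sketch below shows how that distance is computed from two feature sets. It is an illustration under stated assumptions, not the evaluation code behind the reported numbers: the function names are hypothetical, and random 8-dimensional vectors stand in for the Inception-v3 features that a real FID evaluation would extract.

```python
import numpy as np

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """||mu1 - mu2||^2 + Tr(S1 + S2 - 2 (S1 S2)^{1/2}) for two Gaussians."""
    diff = mu1 - mu2
    # Tr((S1 S2)^{1/2}) equals the sum of square roots of the eigenvalues
    # of S1 S2, which are real and non-negative for PSD covariances.
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * tr_sqrt)

def fid_from_features(feats_real, feats_fake):
    """Fit a Gaussian to each (n_samples, dim) feature set and compare them."""
    mu_r, mu_f = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    sigma_r = np.cov(feats_real, rowvar=False)
    sigma_f = np.cov(feats_fake, rowvar=False)
    return frechet_distance(mu_r, sigma_r, mu_f, sigma_f)

# Stand-in features (assumption): a real evaluation would use Inception-v3
# activations of real and generated images instead of random vectors.
rng = np.random.default_rng(0)
real = rng.standard_normal((2000, 8))
fake = rng.standard_normal((2000, 8)) + 0.5    # shifted distribution

same = fid_from_features(real, real)           # approximately zero
shifted = fid_from_features(real, fake)        # clearly larger
```

Comparing a feature set with itself gives a distance of about zero, while the shifted set scores higher, which matches the convention that lower FID indicates generated images closer to the real distribution.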

Methods

Adaptive Instance Normalization • Convolution • Dense Connections • Diffusion • Feedforward Network • Leaky ReLU • R1 Regularization • StyleGAN
