TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Text-to-Image Generation	CUB	Attention-driven Generator (perceptual loss)	Inception score	4.58	# 7
Text-to-Image Generation	Multi-Modal-CelebA-HQ	ControlGAN	FID	116.32	# 7
Text-to-Image Generation	Multi-Modal-CelebA-HQ	ControlGAN	LPIPS	0.522	# 5
Text-to-Image Generation	Multi-Modal-CelebA-HQ	ControlGAN	Acc	14.6	# 5
Text-to-Image Generation	Multi-Modal-CelebA-HQ	ControlGAN	Real	13.1	# 5

Controllable Text-to-Image Generation

NeurIPS 2019 · Bowen Li, Xiaojuan Qi, Thomas Lukasiewicz, Philip H. S. Torr

In this paper, we propose a novel controllable text-to-image generative adversarial network (ControlGAN), which can effectively synthesise high-quality images and also control parts of the image generation according to natural language descriptions. To achieve this, we introduce a word-level spatial and channel-wise attention-driven generator that can disentangle different visual attributes, and allow the model to focus on generating and manipulating subregions corresponding to the most relevant words. Also, a word-level discriminator is proposed to provide fine-grained supervisory feedback by correlating words with image regions, facilitating training an effective generator which is able to manipulate specific visual attributes without affecting the generation of other content. Furthermore, perceptual loss is adopted to reduce the randomness involved in the image generation, and to encourage the generator to manipulate specific attributes required in the modified text. Extensive experiments on benchmark datasets demonstrate that our method outperforms existing state of the art, and is able to effectively manipulate synthetic images using natural language descriptions. Code is available at https://github.com/mrlibw/ControlGAN.
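The word-level attention mentioned in the abstract can be illustrated with a small example. The following is a minimal sketch only, assuming dot-product similarity between image-region features and word features followed by a softmax over words; the function name `word_level_attention`, the toy feature vectors, and this exact formulation are assumptions made for illustration and are not taken from the paper or the official code.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def word_level_attention(regions, words):
    """Hypothetical helper: attend from each image region to every word.

    regions: list of N region feature vectors, each of length D
    words:   list of T word feature vectors, each of length D
    Returns (contexts, weights), where weights[n][t] is the attention that
    region n pays to word t, and contexts[n] is the attention-weighted sum
    of word features for region n.
    """
    contexts, weights = [], []
    for region in regions:
        # Assumed similarity measure: plain dot product.
        scores = [sum(r * w for r, w in zip(region, word)) for word in words]
        alpha = softmax(scores)
        context = [
            sum(alpha[t] * words[t][d] for t in range(len(words)))
            for d in range(len(region))
        ]
        contexts.append(context)
        weights.append(alpha)
    return contexts, weights


# Toy data: two regions, three words, feature dimension 2.
regions = [[1.0, 0.0], [0.0, 1.0]]
words = [[5.0, 0.0], [0.0, 5.0], [0.0, 0.0]]
contexts, weights = word_level_attention(regions, words)
```

In this toy setup the first region attends mostly to the first word and the second region to the second word, which is the kind of word-to-subregion correspondence the abstract describes. The channel-wise attention, word-level discriminator, and perceptual loss are not covered by this sketch.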

Code

mrlibw/ControlGAN official

163

taki0112/ControlGAN-Tensorflow

Tasks

Generative Adversarial Network

Image Generation

Text-to-Image Generation

Datasets

MS COCO

CUB-200-2011

Multi-Modal CelebA-HQ

Results from the Paper

Ranked #7 on Text-to-Image Generation on Multi-Modal-CelebA-HQ
