Text-to-Image Generation

7 papers with code · Computer Vision
Subtask of Image Generation

Text-to-image generation is the task of generating images from text descriptions or captions.

Greatest papers with code

StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks

19 Oct 2017 hanzhanggit/StackGAN

In this paper, we propose Stacked Generative Adversarial Networks (StackGAN) aimed at generating high-resolution photo-realistic images. The Stage-I GAN sketches the primitive shape and colors of the object based on the given text description, yielding low-resolution images.
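The two-stage idea can be sketched in a few lines: Stage-I maps a text embedding plus noise to a coarse image, and Stage-II upsamples and refines it while re-reading the text. This is a minimal NumPy sketch; all dimensions and function bodies are illustrative stand-ins for the learned networks, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen for illustration only.
TEXT_DIM, NOISE_DIM = 128, 100

def stage1_generator(text_emb, noise):
    """Stage-I: produce a coarse 64x64 image from text embedding + noise."""
    seed = np.concatenate([text_emb, noise])
    # Stand-in for a learned upsampling network: a random projection.
    w = rng.standard_normal((seed.size, 64 * 64 * 3)) * 0.01
    return np.tanh(seed @ w).reshape(64, 64, 3)

def stage2_generator(low_res, text_emb):
    """Stage-II: refine the Stage-I output to 256x256, conditioning on the
    text again to recover details the first stage missed."""
    up = np.kron(low_res, np.ones((4, 4, 1)))    # naive 4x upsampling
    bias = text_emb[:3].reshape(1, 1, 3) * 0.01  # toy text conditioning
    return np.clip(up + bias, -1.0, 1.0)

text_emb = rng.standard_normal(TEXT_DIM)
noise = rng.standard_normal(NOISE_DIM)
low = stage1_generator(text_emb, noise)
high = stage2_generator(low, text_emb)
print(low.shape, high.shape)
```

The key design point the sketch preserves is that both stages see the text: Stage-II does not merely super-resolve, it re-conditions on the description.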


StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks

ICCV 2017 hanzhanggit/StackGAN

Synthesizing high-quality images from text descriptions is a challenging problem in computer vision and has many practical applications. Samples generated by existing text-to-image approaches can roughly reflect the meaning of the given descriptions, but they fail to contain necessary details and vivid object parts.


Generative Adversarial Text to Image Synthesis

17 May 2016 hanzhanggit/StackGAN

Automatic synthesis of realistic images from text would be interesting and useful, but current AI systems are still far from this goal. However, in recent years generic and powerful recurrent neural network architectures have been developed to learn discriminative text feature representations.


AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks

CVPR 2018 hanzhanggit/StackGAN-v2

In this paper, we propose an Attentional Generative Adversarial Network (AttnGAN) that allows attention-driven, multi-stage refinement for fine-grained text-to-image generation. With a novel attentional generative network, the AttnGAN can synthesize fine-grained details at different subregions of the image by attending to the relevant words in the natural language description.
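The core of word-level attention is small: each image subregion scores every caption word, the scores are softmax-normalized, and the region receives a weighted "word context" vector. Below is a hedged NumPy sketch of that computation; the feature sizes are arbitrary and the features themselves are random placeholders for what the real model would learn.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes, not AttnGAN's real dimensions.
N_WORDS, N_REGIONS, DIM = 6, 16, 32
word_feats = rng.standard_normal((N_WORDS, DIM))      # one vector per word
region_feats = rng.standard_normal((N_REGIONS, DIM))  # one vector per subregion

def word_attention(regions, words):
    """Each subregion attends over the caption's words:
    context = softmax(regions @ words.T) @ words."""
    scores = regions @ words.T                   # (regions, words)
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)      # attention weights per region
    return attn @ words                          # (regions, dim) word context

context = word_attention(region_feats, word_feats)
print(context.shape)
```

In the full model these context vectors are concatenated with the region features before the next refinement stage, which is what lets a word like "beak" sharpen only the relevant patch of the image.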


Generating Images from Captions with Attention

9 Nov 2015 emansim/text2image

Motivated by the recent progress in generative models, we introduce a model that generates images from natural language descriptions. The proposed model iteratively draws patches on a canvas, while attending to the relevant words in the description.
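The iterative draw-and-attend loop can be illustrated with a toy version: at each step the model attends over the caption's words and paints a small patch onto a running canvas. Everything here (the fixed Gaussian attention schedule, the patch placement, the sizes) is a made-up stand-in for the learned recurrent machinery, kept only to show the loop structure.

```python
import numpy as np

rng = np.random.default_rng(1)

words = rng.standard_normal((5, 8))  # placeholder caption word embeddings
canvas = np.zeros((16, 16))          # image is accumulated over steps

for step in range(5):
    # Attend over words: here a fixed soft focus that drifts across the caption.
    w = np.exp(-0.5 * (np.arange(5) - step) ** 2)
    w /= w.sum()
    ctx = w @ words                  # attended word context for this step
    # Paint a small patch whose content depends on the attended context.
    y, x = (step * 3) % 14, (step * 5) % 14
    canvas[y:y + 2, x:x + 2] += ctx[:4].reshape(2, 2)

print(canvas.shape)
```

The point the sketch keeps from the paper is the decomposition: the image is not emitted in one shot but built up patch by patch, with the word focus shifting as drawing proceeds.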


Generating Multiple Objects at Spatially Distinct Locations

ICLR 2019 tohinz/multiple-objects-gan

This work adds an object pathway that focuses solely on the individual objects and is iteratively applied at the locations specified by the bounding boxes. The experiments show that through the use of the object pathway we can control object locations within images and can model complex scenes with multiple objects at various locations.
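Mechanically, the object pathway amounts to running the same object generator once per bounding box and writing each result onto a shared feature map at the box's location. A minimal NumPy sketch follows; the generator is a placeholder, and the shapes, box coordinates, and label embeddings are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)

# Shared scene feature map; illustrative size.
H, W, C = 32, 32, 4
feature_map = np.zeros((H, W, C))

def object_pathway(label_emb, h, w):
    """Stand-in for the learned object generator: one feature patch
    per object, conditioned on its label embedding."""
    return np.tanh(label_emb[: h * w * C]).reshape(h, w, C)

boxes = [(2, 2, 8, 8), (16, 10, 10, 12)]   # (top, left, height, width)
labels = [rng.standard_normal(512) for _ in boxes]

# The same pathway is applied at each box location.
for (t, l, h, w), emb in zip(boxes, labels):
    feature_map[t:t + h, l:l + w] += object_pathway(emb, h, w)

print(feature_map.shape)
```

Because placement is explicit, moving a box moves the object, which is exactly the controllability the abstract describes.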


MC-GAN: Multi-conditional Generative Adversarial Network for Image Synthesis


To tackle the problem, we propose a multi-conditional GAN (MC-GAN) that controls both the object and background information jointly. Its core synthesis block enables MC-GAN to generate a realistic object image with the desired background by controlling the amount of background information kept from the given base image using the foreground information from the text attributes.
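One simple way to picture the joint foreground/background control is as a learned mask that decides, per pixel, how much of the base image's background features pass through versus how much of the text-driven object features are injected. The sketch below is only that blending step, with random placeholders for the learned features and mask; it is not MC-GAN's actual block.

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative feature maps; in the real model both are learned.
H, W, C = 16, 16, 3
background = rng.standard_normal((H, W, C))  # features from the base image
foreground = rng.standard_normal((H, W, C))  # features from text attributes

def synthesis_block(fg, bg, mask):
    """Blend per pixel: keep background where mask is 0,
    inject the object where mask is 1."""
    m = mask[..., None]
    return m * fg + (1.0 - m) * bg

# Stand-in for a predicted foreground mask.
mask = (rng.random((H, W)) > 0.5).astype(float)
out = synthesis_block(foreground, background, mask)
print(out.shape)
```

The mask is what makes the conditioning "multi": the text controls the object region while the base image dictates everything outside it.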