Text-to-image generation is the task of generating images from text descriptions or captions.
In this paper, we propose Stacked Generative Adversarial Networks (StackGAN), which aim to generate high-resolution, photo-realistic images.
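As a toy illustration of the coarse-to-fine idea behind stacked generators (a minimal sketch, not StackGAN's actual architecture; the generators here are untrained deterministic functions), a Stage-I "sketch" image can be upsampled and refined by a Stage-II step, both conditioned on the same text embedding:

```python
import math
import random

random.seed(0)

def stage1(text_emb, noise, size=4):
    # Toy Stage-I generator: a coarse size x size grayscale "sketch"
    # mixing the text embedding and noise (stands in for a trained net).
    return [[math.tanh(text_emb[(i + j) % len(text_emb)]
                       + noise[(i * j) % len(noise)])
             for j in range(size)] for i in range(size)]

def upsample(img, factor):
    # Nearest-neighbour upsampling: repeat each pixel factor x factor times.
    return [[img[i // factor][j // factor]
             for j in range(len(img[0]) * factor)]
            for i in range(len(img) * factor)]

def stage2(low_res, text_emb, factor=4):
    # Toy Stage-II generator: upsample the Stage-I result and apply a
    # small text-conditioned refinement to each pixel.
    bias = sum(text_emb) / len(text_emb)
    up = upsample(low_res, factor)
    return [[math.tanh(p + 0.1 * bias) for p in row] for row in up]

text_emb = [random.gauss(0, 1) for _ in range(8)]  # stands in for a caption embedding
noise = [random.gauss(0, 1) for _ in range(8)]

low = stage1(text_emb, noise)   # 4x4 coarse image
high = stage2(low, text_emb)    # 16x16 refined image
```

The point of the stacking is that Stage-II never has to invent the whole image from scratch: it only corrects and sharpens the low-resolution result, which is what makes higher resolutions tractable.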
Synthesizing high-quality images from text descriptions is a challenging problem in computer vision and has many practical applications.
In this paper, we propose an Attentional Generative Adversarial Network (AttnGAN) that allows attention-driven, multi-stage refinement for fine-grained text-to-image generation.
SOTA for Text-to-Image Generation on CUB
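The attention-driven refinement can be sketched as word-level attention: each image region attends over the caption's word features and receives a word-context vector. This is a tiny illustrative version (plain dot-product attention, not AttnGAN's trained networks or its DAMSM loss):

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def word_attention(region_feats, word_feats):
    # For each image region, compute attention weights as a softmax over
    # region-word dot products, then return the region's word-context
    # vector: the attention-weighted sum of word features.
    contexts = []
    for r in region_feats:
        scores = [sum(ri * wi for ri, wi in zip(r, w)) for w in word_feats]
        alpha = softmax(scores)
        ctx = [sum(a * w[d] for a, w in zip(alpha, word_feats))
               for d in range(len(word_feats[0]))]
        contexts.append(ctx)
    return contexts

# Tiny 2-D toy: two image regions, three caption words.
regions = [[1.0, 0.0], [0.0, 1.0]]
words = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
ctx = word_attention(regions, words)
```

Each region ends up dominated by the word it is most similar to, which is how attention lets later generator stages refine different image regions using different words of the caption.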
Our experiments show that through the use of the object pathway we can control object locations within images and can model complex scenes with multiple objects at various locations.
SOTA for Text-to-Image Generation on MS-COCO
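The location control provided by an object pathway can be sketched as compositing object patches onto a global canvas at given bounding boxes. This toy version renders each object as a filled patch (the paper's pathway is a learned generator branch; everything below is illustrative):

```python
def compose_scene(canvas_size, objects):
    # Toy "object pathway": render each object as a filled patch at its
    # bounding box (top, left, height, width), composited onto a blank
    # global canvas in the order given.
    canvas = [[0.0] * canvas_size for _ in range(canvas_size)]
    for value, (top, left, h, w) in objects:
        for i in range(top, min(top + h, canvas_size)):
            for j in range(left, min(left + w, canvas_size)):
                canvas[i][j] = value
    return canvas

# Two objects at explicitly chosen locations on an 8x8 canvas.
scene = compose_scene(8, [(1.0, (1, 1, 3, 3)), (0.5, (4, 5, 2, 2))])
```

Because each object's bounding box is an explicit input, moving an object is just a matter of changing its box, which is the sense in which object locations become controllable.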
This block enables MC-GAN to generate a realistic object image on the desired background: the foreground information from the text attributes is used to control how much background information is retained from the given base image.
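The core of such background control can be sketched as mask-based blending: a per-pixel mask (which in the paper is predicted from foreground information) decides how much of each pixel comes from the base image's background versus the generated foreground. A minimal illustrative version:

```python
def blend(fg, bg, mask):
    # Toy synthesis block: per-pixel convex combination of a generated
    # foreground image and a base background image, weighted by a mask
    # in [0, 1] (1 = pure foreground, 0 = pure background).
    return [[m * f + (1 - m) * b for f, b, m in zip(fr, br, mr)]
            for fr, br, mr in zip(fg, bg, mask)]

# 2x2 toy images: a bright foreground over a dark background.
fg = [[1.0, 1.0], [1.0, 1.0]]
bg = [[0.0, 0.0], [0.0, 0.0]]
mask = [[1.0, 0.5], [0.0, 1.0]]
out = blend(fg, bg, mask)
```

Where the mask is 1 the output is pure foreground, where it is 0 the base image's background passes through unchanged, and intermediate values mix the two.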
Conditional text-to-image generation is an active area of research, with many possible applications.
Generating an image from a given text description has two goals: visual realism and semantic consistency.