34 papers with code • 8 benchmarks • 5 datasets
In this paper, we propose a novel controllable text-to-image generative adversarial network (ControlGAN), which can effectively synthesise high-quality images and also control parts of the image generation according to natural language descriptions.
To address these challenges, we introduce a new model that explicitly models individual objects within an image, together with a new evaluation metric, Semantic Object Accuracy (SOA), which evaluates generated images against the objects mentioned in their captions.
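The SOA idea can be sketched as follows: run an object detector on images generated from a caption and check what fraction of the caption's objects were actually found. This is a minimal illustration, not the paper's implementation; the detector output and caption-object extraction are assumed here, and the real metric aggregates detector results over many generated images per caption.

```python
def semantic_object_accuracy(caption_objects, detected_labels):
    """Fraction of objects mentioned in a caption that an object
    detector found in the generated image (simplified SOA sketch).

    caption_objects: set of object names extracted from the caption.
    detected_labels: set of labels a pretrained detector reported
                     for the generated image (assumed given here).
    """
    if not caption_objects:
        return 1.0  # nothing to check
    found = sum(1 for obj in caption_objects if obj in detected_labels)
    return found / len(caption_objects)


# Hypothetical example: image generated from "a dog next to a car",
# detector only recognised the dog.
score = semantic_object_accuracy({"dog", "car"}, {"dog", "person"})
```

Here `score` is 0.5: one of the two caption objects was detected.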
In this work, we propose TediGAN, a novel framework for multi-modal image generation and manipulation with textual descriptions.
Specifically, we propose a new paradigm of text-guided image generation and manipulation that builds on the strong representational properties of a pretrained GAN model.
Recently, vector-quantized image modeling has demonstrated impressive performance on generation tasks such as text-to-image generation.
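The core operation in vector-quantized image modeling is mapping each continuous image feature to its nearest entry in a learned codebook, so that an image becomes a sequence of discrete tokens. A minimal NumPy sketch of that quantization step (the codebook values here are illustrative, not from any specific model):

```python
import numpy as np

def quantize(features, codebook):
    """Map each feature vector to its nearest codebook entry (L2 distance).

    features: (N, D) array of continuous image features.
    codebook: (K, D) array of learned code vectors.
    Returns (indices, quantized), where quantized[i] = codebook[indices[i]].
    """
    # Pairwise squared distances between features and codes: shape (N, K).
    dists = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    indices = dists.argmin(axis=1)          # discrete token per feature
    return indices, codebook[indices]       # tokens and their embeddings


# Toy example with a 3-entry codebook in 2-D.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
feats = np.array([[0.1, -0.1], [0.9, 1.2]])
idx, quantized = quantize(feats, codebook)
```

The resulting `idx` sequence is what an autoregressive model is trained on for text-to-image generation; a decoder later maps the quantized embeddings back to pixels.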
This block enables MC-GAN to generate a realistic object image with the desired background: the foreground information derived from the text attributes controls how much background information is retained from the given base image.
Our experiments show that through the use of the object pathway we can control object locations within images and can model complex scenes with multiple objects at various locations.
Given a text description, we first form an overall visual impression using this prior, and then draw the picture by progressively adding more and more detail.