Conditional Text-to-Image Synthesis
7 papers with code • 2 benchmarks • 1 dataset
Introducing extra conditions into the text-to-image generation process, similar to the ControlNet paradigm.
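The ControlNet paradigm referenced above attaches a trainable condition branch to a frozen base model and injects its output through zero-initialized projections, so the pretrained model's behavior is preserved at the start of training. A minimal NumPy sketch of that injection pattern (all function and variable names here are hypothetical, not from any of the listed papers):

```python
import numpy as np

rng = np.random.default_rng(0)

def unet_block(x, w):
    # Stand-in for a frozen base-model stage (e.g. one UNet block).
    return np.tanh(x @ w)

def condition_encoder(cond, w):
    # Trainable branch that embeds the extra condition
    # (e.g. a flattened edge map or box layout).
    return np.tanh(cond @ w)

def zero_proj(h, w_zero):
    # ControlNet-style zero-initialized projection: at initialization the
    # condition contributes nothing, so the frozen model is untouched.
    return h @ w_zero

d = 8
x = rng.normal(size=(2, d))       # latent features
cond = rng.normal(size=(2, d))    # spatial condition, flattened
w_base = rng.normal(size=(d, d))  # frozen base weights
w_cond = rng.normal(size=(d, d))  # trainable branch weights
w_zero = np.zeros((d, d))         # zero-initialized projection weights

base_out = unet_block(x, w_base)
controlled = base_out + zero_proj(condition_encoder(cond, w_cond), w_zero)

# At initialization, the conditioned output equals the base output.
print(np.allclose(controlled, base_out))  # True
```

As `w_zero` is updated during training, the condition branch gradually steers generation without destabilizing the frozen base model.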
Most implemented papers
BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion
As such paired data is time-consuming and labor-intensive to acquire and restricted to a closed set, this potentially becomes the bottleneck for applications in an open world.
GLIGEN: Open-Set Grounded Text-to-Image Generation
Large-scale text-to-image diffusion models have made amazing advances.
LaCon: Late-Constraint Diffusion for Steerable Guided Image Synthesis
To offer more controllability over the generation process, existing studies, termed early-constraint methods in this paper, leverage extra conditions and incorporate them into pre-trained diffusion models.
Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models
Text-to-image (T2I) research has grown explosively in the past year, owing to the large-scale pre-trained diffusion models and many emerging personalization and editing approaches.
InstanceDiffusion: Instance-level Control for Image Generation
Text-to-image diffusion models produce high quality images but do not offer control over individual instances in the image.
MIGC: Multi-Instance Generation Controller for Text-to-Image Synthesis
Lastly, we aggregate all the shaded instances to provide the necessary information for accurately generating multiple instances in Stable Diffusion (SD).
Zero-Painter: Training-Free Layout Control for Text-to-Image Synthesis
We present Zero-Painter, a novel training-free framework for layout-conditional text-to-image synthesis that facilitates the creation of detailed and controlled imagery from textual prompts.