16 papers with code • 7 benchmarks • 4 datasets
Layout-to-image generation is the task of generating a scene from a given layout. The layout specifies the locations of the objects to be included in the output image. In this section, you can find state-of-the-art leaderboards for layout-to-image generation.
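As a concrete illustration, the snippet below sketches one way such a layout could be represented in code. The LayoutObject class and the commented-out generate() call are hypothetical placeholders for illustration, not the API of any particular model.

```python
# Illustrative only: a layout is a set of (class label, bounding box) pairs.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class LayoutObject:
    label: str                               # object class, e.g. "dog"
    bbox: Tuple[float, float, float, float]  # (x, y, w, h), normalized to [0, 1]

layout: List[LayoutObject] = [
    LayoutObject("sky",   (0.0, 0.0, 1.0, 0.4)),
    LayoutObject("grass", (0.0, 0.4, 1.0, 0.6)),
    LayoutObject("dog",   (0.3, 0.5, 0.3, 0.4)),
]

# A layout-to-image model maps this spatial specification to an RGB image,
# e.g. (hypothetical call): image = model.generate(layout, image_size=(256, 256))
```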
By decomposing the image formation process into a sequential application of denoising autoencoders, diffusion models (DMs) achieve state-of-the-art synthesis results on image data and beyond.
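For intuition, the following sketch shows the standard DDPM-style reverse (denoising) loop that this sentence describes: synthesis as a sequence of learned denoising steps starting from pure noise. The linear noise schedule and the eps_model callable are assumptions for illustration, not the implementation of any specific paper.

```python
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)             # assumed linear noise schedule
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

@torch.no_grad()
def reverse_diffusion(eps_model, shape):
    """Start from Gaussian noise and repeatedly apply the learned denoiser."""
    x = torch.randn(shape)                        # x_T ~ N(0, I)
    for t in reversed(range(T)):
        eps = eps_model(x, torch.tensor([t]))     # predicted noise at step t
        coef = betas[t] / torch.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps) / torch.sqrt(alphas[t])
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + torch.sqrt(betas[t]) * noise   # sample x_{t-1}
    return x                                      # approximate sample x_0
```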
Despite remarkable recent progress on both unconditional and conditional image synthesis, it remains a long-standing problem to learn generative models capable of synthesizing realistic and sharp images from a reconfigurable spatial layout (i.e., bounding boxes + class labels in an image lattice) and style (i.e., structural and appearance variations encoded by latent vectors), especially at high resolution.
This paper focuses on a recently emerged task, layout-to-image generation: learning generative models capable of synthesizing photo-realistic images from a spatial layout (i.e., object bounding boxes configured in an image lattice) and style (i.e., structural and appearance variations encoded by latent vectors).
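To make the layout/style split more tangible, here is a hedged sketch of how the two inputs could be combined into per-object conditioning codes: the layout fixes where objects go, while per-object latent vectors control their appearance. The build_condition helper and its tensor shapes are illustrative assumptions, not the method of any cited paper.

```python
import torch

def build_condition(boxes: torch.Tensor,      # (N, 4) normalized bounding boxes
                    labels: torch.Tensor,     # (N,) integer class indices
                    style: torch.Tensor,      # (N, D) per-object latent vectors
                    num_classes: int) -> torch.Tensor:
    """Concatenate geometry, class, and style into one conditioning code per object."""
    one_hot = torch.nn.functional.one_hot(labels, num_classes).float()
    return torch.cat([boxes, one_hot, style], dim=-1)   # (N, 4 + C + D)

# Resampling `style` while keeping `boxes`/`labels` fixed changes appearance
# (texture, color) without changing the spatial layout of the scene.
boxes  = torch.tensor([[0.3, 0.5, 0.3, 0.4]])
labels = torch.tensor([2])
style  = torch.randn(1, 128)
cond   = build_condition(boxes, labels, style, num_classes=10)
```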
Generating realistic images of complex visual scenes becomes challenging when one wishes to control the structure of the generated images.
In particular, layout-to-image generation models have gained significant attention due to their capability to generate realistic complex images containing distinct objects.
We argue that these shortcomings are caused by the lack of context-aware object and stuff feature encoding in their generators, and of location-sensitive appearance representation in their discriminators.
In this paper, we propose a method for attribute-controlled image synthesis from layout, which makes it possible to specify the appearance of individual objects without affecting the rest of the image.