Image Generation

1989 papers with code • 85 benchmarks • 67 datasets

Image Generation (synthesis) is the task of generating new images that resemble the distribution of an existing dataset.

  • Unconditional generation refers to sampling new images from the learned distribution without any conditioning signal, i.e. modeling $p(y)$, where $y$ denotes an image.
  • Conditional image generation (subtask) refers to sampling images conditioned on additional information such as a class label $x$, i.e. modeling $p(y|x)$ (see the sketch after this list).

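As a rough illustration of the difference, here is a minimal sketch that uses a toy 1-D Gaussian mixture as a stand-in for a learned image distribution; the names and numbers are purely illustrative, and no real generative model is involved.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "dataset": two classes, each contributing one mode of p(y).
class_means = {0: -2.0, 1: 3.0}

def sample_unconditional(n):
    """Sample y ~ p(y): the class label is a latent that gets marginalized out."""
    labels = rng.integers(0, 2, size=n)              # not supplied by the user
    return rng.normal([class_means[int(l)] for l in labels], 1.0)

def sample_conditional(n, label):
    """Sample y ~ p(y | x): generation is steered by a known label x."""
    return rng.normal(class_means[label], 1.0, size=n)

print(sample_unconditional(5))   # draws from both modes
print(sample_conditional(5, 1))  # draws only from the label-1 mode
```

In a real text-to-image system the label $x$ is replaced by a prompt embedding, but the distinction is the same: unconditional models sample $p(y)$, conditional models sample $p(y|x)$.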
In this section, you can find state-of-the-art leaderboards for unconditional generation. For conditional generation and other types of image generation, refer to the subtasks.

(Image credit: StyleGAN)

Latest papers with no code

SkinGEN: an Explainable Dermatology Diagnosis-to-Generation Framework with Interactive Vision-Language Models

no code yet • 23 Apr 2024

With the continuous advancement of vision-language model (VLM) technology, remarkable research achievements have emerged in dermatology, the fourth most prevalent category of human disease.

From Parts to Whole: A Unified Reference Framework for Controllable Human Image Generation

no code yet • 23 Apr 2024

Addressing this, we introduce Parts2Whole, a novel framework designed for generating customized portraits from multiple reference images, including pose images and various aspects of human appearance.

Multimodal Large Language Model is a Human-Aligned Annotator for Text-to-Image Generation

no code yet • 23 Apr 2024

Recent studies have demonstrated the exceptional potential of leveraging human preference datasets to refine text-to-image generative models, enhancing the alignment between generated images and textual prompts.

FINEMATCH: Aspect-based Fine-grained Image and Text Mismatch Detection and Correction

no code yet • 23 Apr 2024

To address this, we propose FineMatch, a new aspect-based fine-grained text and image matching benchmark, focusing on text and image mismatch detection and correction.

ID-Aligner: Enhancing Identity-Preserving Text-to-Image Generation with Reward Feedback Learning

no code yet • 23 Apr 2024

The rapid development of diffusion models has spurred diverse applications.

GLoD: Composing Global Contexts and Local Details in Image Generation

no code yet • 23 Apr 2024

However, simultaneous control over both global contexts (e.g., object layouts and interactions) and local details (e.g., colors and emotions) remains a significant challenge.

Towards Better Text-to-Image Generation Alignment via Attention Modulation

no code yet • 22 Apr 2024

To achieve this, we incorporate a temperature control mechanism within the early phases of the self-attention modules to mitigate entity leakage issues.
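The paper's exact modulation scheme is not reproduced here; as a generic, hypothetical sketch of what a temperature knob on self-attention weights looks like (all names are assumptions, not the authors' code):

```python
import numpy as np

def attention_weights(q, k, temperature=1.0):
    """Softmax attention weights with a temperature knob.

    temperature > 1 flattens the distribution (more diffuse attention);
    temperature < 1 sharpens it. Adjusting it during early denoising
    steps is one conceivable way to keep entities from bleeding together.
    """
    scores = q @ k.T / (np.sqrt(q.shape[-1]) * temperature)
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
q, k = rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
print(attention_weights(q, k, temperature=0.5).round(2))
```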

Accelerating Image Generation with Sub-path Linear Approximation Model

no code yet • 22 Apr 2024

Diffusion models have significantly advanced the state of the art in image, audio, and video generation tasks.

Concept Arithmetics for Circumventing Concept Inhibition in Diffusion Models

no code yet • 21 Apr 2024

We use the compositional property of diffusion models, which allows multiple prompts to be leveraged in a single image generation.
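One common reading of this compositional property, in the spirit of composable diffusion guidance, is that per-prompt noise predictions can be combined linearly around the unconditional prediction. The sketch below uses a stand-in denoiser (`eps_model` is hypothetical) and is not the paper's specific algorithm:

```python
import numpy as np

def eps_model(x_t, t, prompt=None):
    # Stand-in denoiser; a real one would be a prompt-conditioned U-Net.
    shift = 0.0 if prompt is None else 0.05 * len(prompt)
    return 0.1 * x_t + shift

def composed_eps(x_t, t, prompts, weights):
    """Combine per-prompt noise predictions around the unconditional one."""
    eps_uncond = eps_model(x_t, t, prompt=None)
    eps = eps_uncond.copy()
    for prompt, w in zip(prompts, weights):
        eps += w * (eps_model(x_t, t, prompt=prompt) - eps_uncond)
    return eps

x_t = np.zeros((2, 2))
print(composed_eps(x_t, t=10, prompts=["a cat", "wearing a hat"], weights=[7.5, 7.5]))
```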

LTOS: Layout-controllable Text-Object Synthesis via Adaptive Cross-attention Fusions

no code yet • 21 Apr 2024

Controllable text-to-image generation synthesizes visual text and objects in images under given conditions, a capability frequently applied to emoji and poster generation.