Image Generation

1982 papers with code • 85 benchmarks • 67 datasets

Image Generation (synthesis) is the task of generating new images that resemble those in an existing dataset.

  • Unconditional generation refers to sampling new images directly from the learned data distribution, i.e. $p(y)$, where $y$ denotes an image.
  • Conditional image generation (subtask) refers to sampling images conditioned on side information $x$, such as a class label or text prompt, i.e. $p(y|x)$.

In this section, you can find state-of-the-art leaderboards for unconditional generation. For conditional generation and other types of image generation, refer to the subtasks.
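To make the distinction concrete, here is a minimal sampling sketch using the Hugging Face diffusers library; the particular checkpoints (google/ddpm-cat-256, runwayml/stable-diffusion-v1-5) are illustrative choices, not part of the task definition.

```python
# pip install diffusers transformers torch
from diffusers import DDPMPipeline, StableDiffusionPipeline

# Unconditional generation: sample an image y ~ p(y).
# The pipeline receives no conditioning input at all.
uncond_pipe = DDPMPipeline.from_pretrained("google/ddpm-cat-256")
image = uncond_pipe(num_inference_steps=50).images[0]
image.save("unconditional_sample.png")

# Conditional generation: sample y ~ p(y|x), here with a text prompt as x.
cond_pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5"
)
image = cond_pipe("a photo of a cat").images[0]
image.save("conditional_sample.png")
```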

(Image credit: StyleGAN)

Libraries

Use these libraries to find Image Generation models and implementations

Latest papers with no code

Generating Counterfactual Trajectories with Latent Diffusion Models for Concept Discovery

no code yet • 16 Apr 2024

In the first step, CDCT uses a Latent Diffusion Model (LDM) to generate a counterfactual trajectory dataset.
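The excerpt does not spell out how the trajectory itself is constructed; as a hedged illustration of the general idea only (not CDCT's actual procedure), the sketch below decodes a straight-line path between a factual latent code and a counterfactual one.

```python
import torch

def counterfactual_trajectory(decoder, z_factual, z_counterfactual, n_steps=8):
    # Decode a straight-line path in latent space from a factual code to a
    # counterfactual one. `decoder` is any latent-to-image decoder (e.g. the
    # VAE decoder of an LDM). The linear interpolation is an illustrative
    # assumption, not the paper's method.
    images = []
    for t in torch.linspace(0.0, 1.0, n_steps):
        z_t = (1.0 - t) * z_factual + t * z_counterfactual
        images.append(decoder(z_t))
    return images
```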

OneActor: Consistent Character Generation via Cluster-Conditioned Guidance

no code yet • 16 Apr 2024

Comprehensive experiments show that our method outperforms a variety of baselines with satisfactory character consistency, superior prompt conformity, and high image quality.

Adversarial Identity Injection for Semantic Face Image Synthesis

no code yet • 16 Apr 2024

Among all the explored techniques, Semantic Image Synthesis (SIS) methods, which generate an image conditioned on a semantic segmentation mask, are the most promising, even though preserving the perceived identity of the input subject is not their main concern.
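As a generic illustration of mask conditioning only (not this paper's adversarial identity-injection method), the toy generator below one-hot encodes a segmentation mask and concatenates it channel-wise with spatial noise.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskConditionedGenerator(nn.Module):
    # Toy SIS-style generator: the one-hot mask is concatenated with noise
    # channels, so every output pixel is conditioned on its semantic class.
    def __init__(self, n_classes, noise_ch=16, out_ch=3):
        super().__init__()
        self.n_classes = n_classes
        self.noise_ch = noise_ch
        self.net = nn.Sequential(
            nn.Conv2d(n_classes + noise_ch, 64, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, out_ch, 3, padding=1),
            nn.Tanh(),
        )

    def forward(self, mask):  # mask: (B, H, W) integer class labels
        onehot = F.one_hot(mask, self.n_classes).permute(0, 3, 1, 2).float()
        noise = torch.randn(mask.size(0), self.noise_ch, *mask.shape[1:],
                            device=mask.device)
        return self.net(torch.cat([onehot, noise], dim=1))
```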

OmniSSR: Zero-shot Omnidirectional Image Super-Resolution using Stable Diffusion Model

no code yet • 16 Apr 2024

Omnidirectional images (ODIs) are commonly used in real-world visual tasks, and high-resolution ODIs help improve the performance of related visual tasks.

Watermark-embedded Adversarial Examples for Copyright Protection against Diffusion Models

no code yet • 15 Apr 2024

Diffusion Models (DMs) have shown remarkable capabilities in various image-generation tasks.

In-Context Translation: Towards Unifying Image Recognition, Processing, and Generation

no code yet • 15 Apr 2024

Secondly, it standardizes the training of different tasks into a general in-context learning framework, where "in-context" means the input comprises an example input-output pair of the target task and a query image.
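Given only that description, a minimal sketch of how such an in-context prompt might be assembled is below; the 2x2 grid layout and the blank target slot are assumptions, not necessarily the paper's exact format.

```python
import torch

def build_in_context_input(example_in, example_out, query):
    # Arrange (example input | example output) over (query | blank target)
    # as one 2x2 image grid. All images are (C, H, W) tensors of equal size.
    placeholder = torch.zeros_like(query)  # the slot the model should fill in
    top = torch.cat([example_in, example_out], dim=-1)   # side by side
    bottom = torch.cat([query, placeholder], dim=-1)
    return torch.cat([top, bottom], dim=-2)              # stack the rows
```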

Zero-shot detection of buildings in mobile LiDAR using Language Vision Model

no code yet • 15 Apr 2024

Moreover, constructing LVMs for point clouds is even more challenging due to the requirements for large amounts of data and training time.

Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model

no code yet • 15 Apr 2024

Ctrl-Adapter provides diverse capabilities including image control, video control, video control with sparse frames, multi-condition control, compatibility with different backbones, adaptation to unseen control conditions, and video editing.

EdgeRelight360: Text-Conditioned 360-Degree HDR Image Generation for Real-Time On-Device Video Portrait Relighting

no code yet • 15 Apr 2024

In this paper, we present EdgeRelight360, an approach for real-time video portrait relighting on mobile devices, utilizing text-conditioned generation of 360-degree high dynamic range image (HDRI) maps.

Diffscaler: Enhancing the Generative Prowess of Diffusion Transformers

no code yet • 15 Apr 2024

As these parameters are independent, a single diffusion model with these task-specific parameters can be used to perform multiple tasks simultaneously.
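The excerpt says only that the task-specific parameters are independent of one another; the FiLM-style sketch below illustrates that idea (the per-task scale/shift parameterization is an assumption, not necessarily Diffscaler's actual design).

```python
import torch
import torch.nn as nn

class TaskScaledLinear(nn.Module):
    # Shared linear weights plus independent per-task scale/shift vectors.
    # Indexing by task_id activates only that task's parameters, so one
    # backbone can serve several tasks without the tasks interfering.
    def __init__(self, dim, n_tasks):
        super().__init__()
        self.shared = nn.Linear(dim, dim)
        self.scales = nn.Parameter(torch.ones(n_tasks, dim))
        self.shifts = nn.Parameter(torch.zeros(n_tasks, dim))

    def forward(self, x, task_id):
        h = self.shared(x)
        return h * self.scales[task_id] + self.shifts[task_id]
```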