Image Generation

1982 papers with code • 85 benchmarks • 67 datasets

Image generation (synthesis) is the task of generating new images that resemble those in an existing dataset.

Unconditional generation refers to sampling images from the data distribution without any conditioning input, i.e. from $p(y)$.
Conditional image generation (subtask) refers to sampling images given a conditioning input $x$, such as a class label, i.e. from $p(y|x)$.
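
The difference between the two settings can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy DATASET, the string stand-ins for images, and the helper names sample_unconditional and sample_conditional are hypothetical. It also only resamples the empirical data rather than generating new images, so it shows what is being conditioned on, not how a generative model works.

```python
import random

# Hypothetical toy dataset of (label x, image y) pairs.
# Strings stand in for real image arrays.
DATASET = [
    ("cat", "cat_img_0"),
    ("cat", "cat_img_1"),
    ("dog", "dog_img_0"),
    ("dog", "dog_img_1"),
    ("dog", "dog_img_2"),
]


def sample_unconditional(rng):
    """Draw y ~ p(y): the label is ignored entirely."""
    _, image = rng.choice(DATASET)
    return image


def sample_conditional(label, rng):
    """Draw y ~ p(y | x = label): only images carrying that label qualify."""
    candidates = [image for x, image in DATASET if x == label]
    if not candidates:
        raise ValueError(f"no images with label {label!r}")
    return rng.choice(candidates)


rng = random.Random(0)  # fixed seed so repeated runs give the same draws
unconditional = [sample_unconditional(rng) for _ in range(1000)]
conditional = [sample_conditional("cat", rng) for _ in range(1000)]
```

Here the unconditional draws mix both classes, while every conditional draw comes from the requested class. A real generative model replaces the lookup with a learned sampler that produces images not present in the dataset.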

In this section, you can find state-of-the-art leaderboards for unconditional generation. For conditional generation and other types of image generation, refer to the subtasks.

(Image credit: StyleGAN)

Benchmarks

These leaderboards are used to track progress in Image Generation.

Dataset	Best Model
CIFAR-10	StyleSAN-XL
ImageNet 64x64	RIN
ImageNet 256x256	ViT-XL/2 with limited Interval Guidance
FFHQ 256 x 256	StyleSAN-XL
CelebA 64x64	DDPM-IP
LSUN Bedroom 256 x 256	Diffusion ProjectedGAN
ImageNet 32x32	StyleGAN-XL
STL-10	Diffusion ProjectedGAN
LSUN Churches 256 x 256	Projected GAN
ImageNet 512x512	EDM2-XXL
FFHQ 1024 x 1024	StyleSAN-XL
CelebA 256x256	Efficient-VDVAE
ImageNet 128x128	VDM++
CelebA-HQ 256x256	RDM
FFHQ-U	Alias-Free-R
MNIST	Locally Masked PixelCNN (8 orders)
CelebA-HQ 1024x1024	RDM
Binarized MNIST	CR-NVAE
LSUN Cat 256 x 256	Vision-aided GAN
CelebA-HQ 128x128	U-Net GAN
CIFAR-100	LeCAM (StyleGAN2 + ADA)
AFHQV2	Polarity-StyleGAN3
AFHQ Cat	Vision-aided GAN
LSUN Horse 256 x 256	Vision-aided GAN
CLEVR	Projected GAN
Cityscapes	Projected GAN
AFHQ Dog	Projected GAN
Fashion-MNIST	PAE
CelebA 128x128	U-Net GAN
AFHQ Wild	Vision-aided GAN
Places50	SinDiffusion
CUB 128 x 128	Projected GAN
Stanford Dogs	Projected GAN
Stanford Cars	Projected GAN
Pokemon 256x256	StyleGAN-XL
VizDoom	GAUDI
Replica	GAUDI
VLN-CE	GAUDI
ARKitScenes	GAUDI
CAT 256x256	StyleGAN2 + DA + RLC (Ours)
ADE-Indoor	Projected GAN
Stacked MNIST	VAEBM
CelebA-HQ 64x64	VAEBM
CIFAR-10 (20% data)	DiffAugment-CR-BigGAN
CIFAR-10 (10% data)	DiffAugment-StyleGAN2
LSUN Bedroom	StyleGAN
FFHQ 512 x 512	StyleSAN-XL
FFHQ 128 x 128	DDPM-IP
ObjectsRoom	GENESIS-V2
ShapeStacks	GENESIS-V2
MetFaces-U	Alias-Free-R
MetFaces	t-Stylegan3-ada (NVIDIA pre-trained)
Pokemon 1024x1024	StyleGAN-XL
Oxford 102 Flowers 256 x 256	Projected GAN
LSUN Car 512 x 384	Polarity-StyleGAN2
LSUN Bedroom 64 x 64	WGAN-GP + TT Update Rule
LSUN Bedroom 128 x 128	LadaGAN
RC-49	cDRE-F-cSP+RS
iNaturalist 2019	StyleGAN2 + NoisyTwins
Cityscapes-5K 256x512	SB-GAN
Cityscapes-25K 256x512	SB-GAN
Indian Celebs 256 x 256	MSG-StyleGAN
LSUN Car 256 x 256	StyleGAN2
Multi-dSprites	GENESIS
GQN	GENESIS
Landscapes 256 x 256	CIPS
Satellite-Buildings 256 x 256	CIPS
Satellite-Landscapes 256 x 256	CIPS
Oxford 102 Flowers 128x128	QSNGAN
25% ImageNet 128x128	LeCAM + DA
LLVIP	pix2pix
SDSS Galaxies	AstroDDPM
NASA Perseverance	Stylegan2-ada
1,078 People 3D Faces Collection Data	Sessiz çığlık
LSUN	BigGAN + gSR
CelebA-HQ 512x512	RDM
CelebA	PR-BigGAN - Recall
LSUN tower 64x64	DDPM-IP
FFHQ 64x64 - 4x upscaling	PFGM++
KMNIST	Spiking-Diffusion
EMNIST-Letters	Spiking-Diffusion
ImageNet 256x256 - 1 labeled data per class	DPT
ImageNet 256x256 - 2 labeled data per class	DPT
ImageNet 256x256 - 5 labeled data per class	DPT
ImageNet 256x256 - 1% labeled data	DPT

Libraries

Use these libraries to find Image Generation models and implementations.

Library	Papers	GitHub stars
open-mmlab/mmgeneration	9	1,798
faceonlive/ai-research	9	139
eriklindernoren/PyTorch-GAN	6	15,694
stability-ai/generative-models	5	22,179

See all 8 libraries.

Subtasks

Conditional Image Generation

3D-Aware Image Synthesis

Facial Inpainting

Layout-to-Image Generation

ROI-Based Image Generation

Image Generation from Scene Graphs

Pose-Guided Image Generation

User Constrained Thumbnail Generation

Handwritten Word Generation

Chinese Landscape Painting Generation

Person Reposing

Infinite Image Generation

Multi-Class One-Shot Image Synthesis

Single-Class Few-Shot Image Synthesis

Latest papers with no code

Generating Counterfactual Trajectories with Latent Diffusion Models for Concept Discovery

no code yet • 16 Apr 2024

In the first step, CDCT uses a Latent Diffusion Model (LDM) to generate a counterfactual trajectory dataset.

OneActor: Consistent Character Generation via Cluster-Conditioned Guidance

no code yet • 16 Apr 2024

Comprehensive experiments show that our method outperforms a variety of baselines with satisfactory character consistency, superior prompt conformity as well as high image quality.

Adversarial Identity Injection for Semantic Face Image Synthesis

no code yet • 16 Apr 2024

Among all the explored techniques, Semantic Image Synthesis (SIS) methods, whose goal is to generate an image conditioned on a semantic segmentation mask, are the most promising, even though preserving the perceived identity of the input subject is not their main concern.

OmniSSR: Zero-shot Omnidirectional Image Super-Resolution using Stable Diffusion Model

no code yet • 16 Apr 2024

Omnidirectional images (ODIs) are commonly used in real-world visual tasks, and high-resolution ODIs help improve the performance of related visual tasks.

Watermark-embedded Adversarial Examples for Copyright Protection against Diffusion Models

no code yet • 15 Apr 2024

Diffusion Models (DMs) have shown remarkable capabilities in various image-generation tasks.

In-Context Translation: Towards Unifying Image Recognition, Processing, and Generation

no code yet • 15 Apr 2024

Secondly, it standardizes the training of different tasks into a general in-context learning, where "in-context" means the input comprises an example input-output pair of the target task and a query image.

Zero-shot detection of buildings in mobile LiDAR using Language Vision Model

no code yet • 15 Apr 2024

Moreover, constructing LVMs for point clouds is even more challenging due to the requirements for large amounts of data and training time.

Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model

no code yet • 15 Apr 2024

Ctrl-Adapter provides diverse capabilities including image control, video control, video control with sparse frames, multi-condition control, compatibility with different backbones, adaptation to unseen control conditions, and video editing.

EdgeRelight360: Text-Conditioned 360-Degree HDR Image Generation for Real-Time On-Device Video Portrait Relighting

no code yet • 15 Apr 2024

In this paper, we present EdgeRelight360, an approach for real-time video portrait relighting on mobile devices, utilizing text-conditioned generation of 360-degree high dynamic range image (HDRI) maps.

Diffscaler: Enhancing the Generative Prowess of Diffusion Transformers

no code yet • 15 Apr 2024

As these parameters are independent, a single diffusion model with these task-specific parameters can be used to perform multiple tasks simultaneously.
