Image Generation
1922 papers with code • 85 benchmarks • 67 datasets
Image generation (synthesis) is the task of producing new images that match the distribution of a training dataset.
- Unconditional generation samples images from the learned data distribution alone, i.e. $p(x)$.
- Conditional image generation (a subtask) samples images given side information such as a class label, i.e. $p(x \mid y)$.
In this section, you can find state-of-the-art leaderboards for unconditional generation. For conditional generation and other types of image generation, refer to the subtasks.
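The distinction between the two settings can be sketched with a toy two-class Gaussian "dataset distribution" (purely illustrative; the means, prior, and function names below are hypothetical, not any specific model): conditional sampling draws from $p(x \mid y)$ for a chosen label, while unconditional sampling draws from the marginal $p(x) = \sum_y p(y)\,p(x \mid y)$ by first sampling a label.

```python
import numpy as np

rng = np.random.default_rng(0)
means = {0: -2.0, 1: 2.0}        # per-class means (hypothetical)
class_prior = {0: 0.5, 1: 0.5}   # p(y)

def sample_conditional(y, n=1):
    """Draw from p(x | y): the Gaussian component selected by the label."""
    return rng.normal(means[y], 1.0, size=n)

def sample_unconditional(n=1):
    """Draw from p(x) = sum_y p(y) p(x | y): sample a label first, then x."""
    labels = rng.choice(list(class_prior), p=list(class_prior.values()), size=n)
    return np.array([sample_conditional(int(y), 1)[0] for y in labels])

x_cond = sample_conditional(1, 1000)    # concentrated near the class-1 mean
x_uncond = sample_unconditional(1000)   # mixture over both classes
```

Real generative models replace the Gaussian components with a learned network, but the same split applies: a conditional model takes the label (or text, layout, etc.) as input, while an unconditional model only sees data.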
(Image credit: StyleGAN)
Libraries
Use these libraries to find Image Generation models and implementations.
Subtasks
- Image-to-Image Translation
- Image Inpainting
- Text-to-Image Generation
- Conditional Image Generation
- Face Generation
- Image Harmonization
- Pose Transfer
- 3D-Aware Image Synthesis
- Facial Inpainting
- Layout-to-Image Generation
- ROI-Based Image Generation
- Image Generation from Scene Graphs
- Pose-Guided Image Generation
- User Constrained Thumbnail Generation
- Handwritten Word Generation
- Chinese Landscape Painting Generation
- Person Reposing
- Infinite Image Generation
- Multi-Class One-Shot Image Synthesis
- Single-Class Few-Shot Image Synthesis
Latest papers
Attention Calibration for Disentangled Text-to-Image Personalization
However, an intriguing problem persists: Is it possible to capture multiple, novel concepts from one single reference image?
Ship in Sight: Diffusion Models for Ship-Image Super Resolution
In this context, our method explores in depth the problem of ship image super resolution, which is crucial for coastal and port surveillance.
DiffusionFace: Towards a Comprehensive Dataset for Diffusion-Based Face Forgery Analysis
The rapid progress in deep learning has given rise to hyper-realistic facial forgery methods, leading to concerns related to misinformation and security risks.
Self-Rectifying Diffusion Sampling with Perturbed-Attention Guidance
These techniques are often not applicable in unconditional generation or in various downstream tasks such as image restoration.
SDXS: Real-Time One-Step Latent Diffusion Models with Image Conditions
Recent advancements in diffusion models have positioned them at the forefront of image generation.
Multi-Scale Texture Loss for CT denoising with GANs
To capture highly complex and non-linear textural relationships during training, this work presents a loss function that leverages the intrinsic multi-scale nature of the Gray-Level Co-occurrence Matrix (GLCM).
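To make the GLCM idea concrete, here is a minimal NumPy sketch (an assumption on my part, not the paper's implementation): a GLCM counts how often gray levels $i$ and $j$ co-occur at a fixed pixel offset, and a "multi-scale" texture distance can compare GLCMs of two images at several downsampling factors. The function names and the L1 comparison are illustrative choices.

```python
import numpy as np

def glcm(img, levels=8, offset=(0, 1)):
    """Normalized gray-level co-occurrence matrix for pairs at `offset`."""
    q = np.minimum((img * levels).astype(int), levels - 1)  # quantize to [0, levels)
    dy, dx = offset
    a = q[max(0, -dy):q.shape[0] - max(0, dy), max(0, -dx):q.shape[1] - max(0, dx)]
    b = q[max(0, dy):, max(0, dx):][:a.shape[0], :a.shape[1]]
    m = np.zeros((levels, levels))
    np.add.at(m, (a.ravel(), b.ravel()), 1)  # histogram of (i, j) pairs
    return m / m.sum()                       # normalize to a joint distribution

def multiscale_glcm_l1(x, y, scales=(1, 2, 4)):
    """L1 distance between GLCMs of two images, summed over scales."""
    return sum(np.abs(glcm(x[::s, ::s]) - glcm(y[::s, ::s])).sum() for s in scales)
```

In a GAN denoising setup, a term like this would be added to the adversarial loss to penalize textural mismatch between denoised and clean CT slices; the actual weighting and scale choices are those of the paper, not this sketch.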
Long-CLIP: Unlocking the Long-Text Capability of CLIP
Contrastive Language-Image Pre-training (CLIP) has been the cornerstone for zero-shot classification, text-image retrieval, and text-image generation by aligning image and text modalities.
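The alignment that makes CLIP useful for zero-shot classification can be sketched in a few lines (toy embeddings, not the real CLIP encoders; the function names here are illustrative): both modalities are mapped into a shared space, and an image is classified by cosine similarity against one text embedding per class prompt.

```python
import numpy as np

def normalize(v):
    """Project embeddings onto the unit sphere (row-wise)."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def zero_shot_classify(image_emb, text_embs):
    """Index of the class prompt whose embedding best matches the image."""
    sims = normalize(image_emb) @ normalize(text_embs).T  # cosine similarities
    return int(np.argmax(sims))

# Toy example: 2-D embeddings, one text embedding per class prompt.
texts = np.array([[1.0, 0.0],   # e.g. "a photo of a cat"
                  [0.0, 1.0]])  # e.g. "a photo of a dog"
image = np.array([0.1, 0.9])    # image embedding closest to the second prompt
```

Long-CLIP's contribution concerns the text side (handling long prompts); the similarity-based inference step itself stays the same.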
Generative Active Learning for Image Synthesis Personalization
The primary challenge in conducting active learning on generative models lies in the open-ended nature of querying, which differs from the closed form of querying in discriminative models that typically target a single concept.
Open-Vocabulary Attention Maps with Token Optimization for Semantic Segmentation in Diffusion Models
This approach limits the generation of segmentation masks derived from word tokens not contained in the text prompt.
Diversity-aware Channel Pruning for StyleGAN Compression
Specifically, by assessing channel importance based on their sensitivities to latent vector perturbations, our method enhances the diversity of samples in the compressed model.