GLIDE is a text-guided diffusion model for photorealistic image generation and editing. It applies guided diffusion to text-conditional image synthesis and handles free-form prompts: a transformer text encoder conditions the diffusion model on natural-language descriptions. The paper compares two guidance strategies, CLIP guidance and classifier-free guidance, and finds that human evaluators prefer classifier-free guidance for both photorealism and caption similarity. Beyond zero-shot generation, the model is fine-tuned for image inpainting, which enables text-driven editing and iterative refinement of samples toward more complex prompts.
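The guidance step itself is compact. Below is a minimal PyTorch sketch of classifier-free guidance as described in the paper: the noise prediction is extrapolated from the unconditional estimate toward the caption-conditioned one. `ToyEpsModel`, `guided_eps`, and the embedding shapes are illustrative placeholders under assumed interfaces, not the released GLIDE code.

```python
import torch
import torch.nn as nn

class ToyEpsModel(nn.Module):
    """Stand-in for GLIDE's text-conditional diffusion U-Net (hypothetical)."""
    def __init__(self, channels=3, emb_dim=16):
        super().__init__()
        self.proj = nn.Linear(emb_dim, channels)
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x_t, t, text_emb):
        # Inject the caption embedding as a per-channel bias; the real model
        # attends over token embeddings and uses the timestep t as well.
        bias = self.proj(text_emb)[:, :, None, None]
        return self.conv(x_t + bias)

def guided_eps(model, x_t, t, text_emb, null_emb, scale=3.0):
    """Classifier-free guidance as used in GLIDE:
    eps = eps(x_t | empty) + s * (eps(x_t | caption) - eps(x_t | empty)).
    """
    eps_cond = model(x_t, t, text_emb)    # conditioned on the caption
    eps_uncond = model(x_t, t, null_emb)  # conditioned on the empty caption
    return eps_uncond + scale * (eps_cond - eps_uncond)

model = ToyEpsModel()
x_t = torch.randn(1, 3, 32, 32)   # noisy image at diffusion step t
text_emb = torch.randn(1, 16)     # caption embedding (placeholder)
null_emb = torch.zeros(1, 16)     # embedding of the empty caption
eps = guided_eps(model, x_t, torch.tensor([500]), text_emb, null_emb)
print(eps.shape)  # torch.Size([1, 3, 32, 32])
```

A guidance scale above 1 trades sample diversity for fidelity to the caption; the sketch's default of 3.0 is only an illustrative value.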
Source: GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models
| Task | Papers | Share |
|---|---|---|
| Image Generation | 7 | 35.00% |
| Text-to-Image Generation | 3 | 15.00% |
| Text to image generation | 2 | 10.00% |
| Semantic Segmentation | 1 | 5.00% |
| Style Transfer | 1 | 5.00% |
| Zero-Shot Learning | 1 | 5.00% |
| Fake Image Detection | 1 | 5.00% |
| Denoising | 1 | 5.00% |
| Text-Guided Image Editing | 1 | 5.00% |