GLIDE is a text-guided diffusion model for photorealistic image generation and editing. It applies guided diffusion to text-conditional image synthesis and handles free-form prompts: a transformer text encoder conditions the diffusion model on natural-language descriptions. The paper compares two guidance strategies, CLIP guidance and classifier-free guidance, and finds that human evaluators prefer classifier-free guidance for both photorealism and caption similarity. Beyond zero-shot generation, the model is fine-tuned for image inpainting, which enables text-driven editing and iterative refinement of samples toward more complex prompts.
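The guidance step itself is compact. Below is a minimal PyTorch sketch of classifier-free guidance as described in the paper: the noise prediction is extrapolated from the unconditional estimate toward the caption-conditioned one. `ToyEpsModel`, `guided_eps`, and the embedding shapes are illustrative placeholders under assumed interfaces, not the released GLIDE code.

```python
import torch
import torch.nn as nn

class ToyEpsModel(nn.Module):
    """Stand-in for GLIDE's text-conditional diffusion U-Net (hypothetical)."""
    def __init__(self, channels=3, emb_dim=16):
        super().__init__()
        self.proj = nn.Linear(emb_dim, channels)
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x_t, t, text_emb):
        # Inject the caption embedding as a per-channel bias; the real model
        # attends over token embeddings and uses the timestep t as well.
        bias = self.proj(text_emb)[:, :, None, None]
        return self.conv(x_t + bias)

def guided_eps(model, x_t, t, text_emb, null_emb, scale=3.0):
    """Classifier-free guidance as used in GLIDE:
    eps = eps(x_t | empty) + s * (eps(x_t | caption) - eps(x_t | empty)).
    """
    eps_cond = model(x_t, t, text_emb)    # conditioned on the caption
    eps_uncond = model(x_t, t, null_emb)  # conditioned on the empty caption
    return eps_uncond + scale * (eps_cond - eps_uncond)

model = ToyEpsModel()
x_t = torch.randn(1, 3, 32, 32)   # noisy image at diffusion step t
text_emb = torch.randn(1, 16)     # caption embedding (placeholder)
null_emb = torch.zeros(1, 16)     # embedding of the empty caption
eps = guided_eps(model, x_t, torch.tensor([500]), text_emb, null_emb)
print(eps.shape)  # torch.Size([1, 3, 32, 32])
```

A guidance scale above 1 trades sample diversity for fidelity to the caption; the sketch's default of 3.0 is only an illustrative value.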
Source: GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models
| Task | Papers | Share |
|---|---|---|
| Image Generation | 7 | 35.00% |
| Text-to-Image Generation | 3 | 15.00% |
| Text to image generation | 2 | 10.00% |
| Semantic Segmentation | 1 | 5.00% |
| Style Transfer | 1 | 5.00% |
| Zero-Shot Learning | 1 | 5.00% |
| Fake Image Detection | 1 | 5.00% |
| Denoising | 1 | 5.00% |
| Text-Guided Image Editing | 1 | 5.00% |