GLIDE is a text-guided diffusion model for photorealistic image generation. It applies guided diffusion to text-conditional image synthesis and handles free-form prompts by conditioning the diffusion model on natural-language descriptions through a text encoder. Beyond zero-shot generation, the model is fine-tuned for image inpainting, which adds editing capabilities and allows samples to be iteratively refined to match more complex prompts.
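GLIDE's guided sampling relies on classifier-free guidance: at each denoising step, the noise prediction conditioned on the text prompt is extrapolated away from the unconditional prediction. A minimal numpy sketch of that combination step (the function name and toy arrays are illustrative, not from the GLIDE codebase):

```python
import numpy as np

def classifier_free_guidance(eps_cond, eps_uncond, guidance_scale):
    """Combine conditional and unconditional noise predictions.

    guidance_scale > 1 pushes samples toward the text condition,
    trading diversity for prompt fidelity; scale = 1 recovers the
    plain conditional prediction.
    """
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

# Toy arrays standing in for a denoising network's two outputs
# (running the network with and without the text embedding).
eps_uncond = np.zeros((4, 4))
eps_cond = np.ones((4, 4))

guided = classifier_free_guidance(eps_cond, eps_uncond, guidance_scale=3.0)
```

In the paper this guided prediction replaces the raw conditional one inside the standard diffusion sampling loop; the rest of the sampler is unchanged.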
Source: GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models
Task | Papers | Share |
---|---|---|
Image Generation | 9 | 39.13% |
Text-to-Image Generation | 3 | 13.04% |
Denoising | 2 | 8.70% |
Trajectory Planning | 1 | 4.35% |
Depth Estimation | 1 | 4.35% |
Semantic Segmentation | 1 | 4.35% |
Style Transfer | 1 | 4.35% |
Zero-Shot Learning | 1 | 4.35% |
Fake Image Detection | 1 | 4.35% |