DALL·E 2 is a generative text-to-image model made up of two main components: a prior that generates a CLIP image embedding given a text caption, and a decoder that generates an image conditioned on the image embedding.
Source: Hierarchical Text-Conditional Image Generation with CLIP Latents
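The two-stage pipeline can be sketched as follows. This is a minimal, illustrative PyTorch sketch, not the paper's actual architecture: the `Prior` and `Decoder` modules, the embedding width, and the image size are all placeholder assumptions (the real prior is a diffusion or autoregressive model, and the real decoder is a diffusion model with upsamplers).

```python
# Minimal sketch of the DALL·E 2 two-stage pipeline.
# `Prior`, `Decoder`, CLIP_DIM, and the image size are illustrative
# placeholders, not the architecture from the paper.
import torch
import torch.nn as nn

CLIP_DIM = 512  # assumed CLIP embedding width


class Prior(nn.Module):
    """Maps a CLIP text embedding to a predicted CLIP image embedding."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(CLIP_DIM, CLIP_DIM),
            nn.GELU(),
            nn.Linear(CLIP_DIM, CLIP_DIM),
        )

    def forward(self, text_emb):
        return self.net(text_emb)


class Decoder(nn.Module):
    """Generates an image conditioned on a CLIP image embedding."""

    def __init__(self, image_size=64):
        super().__init__()
        self.image_size = image_size
        self.net = nn.Linear(CLIP_DIM, 3 * image_size * image_size)

    def forward(self, img_emb):
        out = self.net(img_emb)
        return out.view(-1, 3, self.image_size, self.image_size)


def generate(text_emb, prior, decoder):
    # Stage 1: the prior predicts an image embedding from the caption embedding.
    img_emb = prior(text_emb)
    # Stage 2: the decoder renders an image conditioned on that embedding.
    return decoder(img_emb)


text_emb = torch.randn(1, CLIP_DIM)  # stand-in for CLIP's text encoder output
image = generate(text_emb, Prior(), Decoder())
print(image.shape)  # torch.Size([1, 3, 64, 64])
```

The key design choice this sketch captures is the decoupling: the prior only has to model the distribution of CLIP image embeddings given a caption, while the decoder inverts the CLIP image encoder, which is what lets DALL·E 2 produce diverse images that share the same semantics.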
| Task | Papers | Share |
|---|---|---|
| Multimodal Interaction | 1 | 14.29% |
| Conditional Image Generation | 1 | 14.29% |
| Decoder | 1 | 14.29% |
| Diversity | 1 | 14.29% |
| Image Generation | 1 | 14.29% |
| Text-to-Image Generation | 1 | 14.29% |
| Zero-Shot Text-to-Image Generation | 1 | 14.29% |