Image Generation Models

DALL·E 2 is a generative text-to-image model made up of two main components: a prior that generates a CLIP image embedding given a text caption, and a decoder that generates an image conditioned on the image embedding.

Source: Hierarchical Text-Conditional Image Generation with CLIP Latents


Paper Code Results Date Stars


Component Type
🤖 No Components Found You can add them if they exist; e.g. Mask R-CNN uses RoIAlign