Zero-Shot Text-to-Image Generation

11 papers with code • 0 benchmarks • 0 datasets


Most implemented papers

Zero-Shot Text-to-Image Generation

openai/DALL-E 24 Feb 2021

Text-to-image generation has traditionally focused on finding better modeling assumptions for training on a fixed dataset.

Hierarchical Text-Conditional Image Generation with CLIP Latents

lucidrains/DALLE2-pytorch 13 Apr 2022

Contrastive models like CLIP have been shown to learn robust representations of images that capture both semantics and style.

CogView: Mastering Text-to-Image Generation via Transformers

THUDM/CogView NeurIPS 2021

Text-to-image generation in the general domain has long been an open problem: it requires both a powerful generative model and cross-modal understanding.

LAFITE: Towards Language-Free Training for Text-to-Image Generation

drboog/Lafite 27 Nov 2021

One of the major challenges in training text-to-image generation models is the need for a large number of high-quality image-text pairs.

GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models

openai/glide-text2im 20 Dec 2021

Diffusion models have recently been shown to generate high-quality synthetic images, especially when paired with a guidance technique to trade off diversity for fidelity.
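One such guidance technique, classifier-free guidance (which GLIDE evaluates alongside CLIP guidance), can be sketched in a few lines. The lists below are toy stand-ins for a diffusion model's unconditional and text-conditional noise predictions; the function itself is the standard extrapolation rule, not GLIDE's exact implementation:

```python
def classifier_free_guidance(eps_uncond, eps_cond, scale):
    """Extrapolate from the unconditional noise prediction toward the
    text-conditional one. Larger scales trade sample diversity for
    fidelity to the prompt."""
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]

# Toy noise predictions (stand-ins for a diffusion model's outputs).
eps_uncond = [0.1, -0.2, 0.3]
eps_cond = [0.2, -0.1, 0.5]

# scale = 1 recovers the conditional prediction; scale > 1 amplifies
# the direction implied by the text condition.
guided = classifier_free_guidance(eps_uncond, eps_cond, 3.0)
print(guided)
```

At `scale = 0` the output is the unconditional prediction, so the scale interpolates (and, above 1, extrapolates) between ignoring and over-weighting the text condition.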

L-Verse: Bidirectional Generation Between Image and Text

tgisaturday/L-Verse CVPR 2022

Unlike other models, L-Verse's bidirectional auto-regressive transformer (BiART) can distinguish between an image (or text) as a conditional reference and as a generation target.

Blended Diffusion for Text-driven Editing of Natural Images

omriav/blended-diffusion CVPR 2022

Natural language offers a highly intuitive interface for image editing.

FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimization

gnobitab/fusedream 2 Dec 2021

We approach text-to-image generation by combining the power of the pre-trained CLIP representation with an off-the-shelf image generator (GAN), optimizing in the GAN's latent space to find images that maximize the CLIP score for the given input text.
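The core loop of this kind of CLIP+GAN approach is score maximization over a latent vector. The sketch below is self-contained: `clip_score` is a quadratic stand-in for the real CLIP similarity between the prompt and the GAN's output image, and the finite-difference gradient ascent stands in for autodiff through CLIP and the generator:

```python
import random

TARGET = [0.5, -1.0, 2.0]  # hypothetical optimum of the mock score

def clip_score(latent):
    """Stand-in for CLIP(text, G(z)): a smooth score peaked at TARGET.
    In the real method this is CLIP similarity with the generated image."""
    return -sum((z - t) ** 2 for z, t in zip(latent, TARGET))

def optimize_latent(dim=3, steps=200, lr=0.1, eps=1e-4):
    """Gradient ascent on clip_score over the latent z, using
    finite differences instead of autodiff for self-containment."""
    z = [random.gauss(0, 1) for _ in range(dim)]
    for _ in range(steps):
        base = clip_score(z)
        grad = []
        for i in range(dim):
            zp = list(z)
            zp[i] += eps
            grad.append((clip_score(zp) - base) / eps)
        z = [zi + lr * g for zi, g in zip(z, grad)]
    return z

z_star = optimize_latent()
print(z_star)  # converges near TARGET
```

Because no generator or CLIP weights are updated, this style of method is training-free: all the work happens at inference time, in latent space.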

Blended Latent Diffusion

omriav/blended-latent-diffusion 6 Jun 2022

Our solution leverages a recent text-to-image Latent Diffusion Model (LDM), which speeds up diffusion by operating in a lower-dimensional latent space.
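A rough back-of-the-envelope shows where the speedup comes from: the denoiser operates on a compressed latent rather than raw pixels. The shapes below assume a typical f=8 autoencoder with 4 latent channels (common in LDMs; actual configurations vary):

```python
# Elements the denoising network must process per step, pixel vs latent space.
pixel_elems = 512 * 512 * 3   # raw RGB image
latent_elems = 64 * 64 * 4    # f=8 downsampling, 4 latent channels

print(pixel_elems // latent_elems)  # → 48 (48x fewer elements per step)
```

Since the per-step cost of the network scales with its input resolution, running the diffusion in this smaller space makes each denoising step substantially cheaper.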

Shifted Diffusion for Text-to-image Generation

drboog/Shifted_Diffusion CVPR 2023

Unlike the baseline diffusion model used in DALL-E 2, our method seamlessly encodes prior knowledge of the pre-trained CLIP model into its diffusion process by designing a new initialization distribution and a new transition step for the diffusion.