Personalized Image Generation

31 papers with code • 1 benchmark • 1 dataset

Personalized image generation uses one or more images containing the same subject or style, together with a text prompt, to generate images that preserve that subject while matching the textual description. The task includes finetuning-based methods (e.g., DreamBooth, Textual Inversion) as well as encoder-based methods (e.g., E4T, ELITE, and IP-Adapter); a minimal inference sketch contrasting how their outputs are consumed follows.
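
Below is a minimal inference-time sketch using the Hugging Face diffusers library; the DreamBooth checkpoint path and the sd-concepts-library/cat-toy embedding are placeholder choices, not part of this task page. A DreamBooth run yields a fully tuned checkpoint, while Textual Inversion yields a single learned token embedding loaded onto a frozen base model.

```python
import torch
from diffusers import StableDiffusionPipeline

# DreamBooth: the entire checkpoint was fine-tuned on the subject, so the
# personalized model loads like any other pipeline (path is a placeholder).
pipe = StableDiffusionPipeline.from_pretrained(
    "path/to/dreambooth-output", torch_dtype=torch.float16
).to("cuda")
image = pipe("a photo of sks dog at the beach").images[0]

# Textual Inversion: the base model stays stock; only a learned pseudo-word
# embedding is loaded on top and then used directly in the prompt.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipe.load_textual_inversion("sd-concepts-library/cat-toy")
image = pipe("a <cat-toy> sitting on a bench").images[0]
```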


Most implemented papers

DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation

PaddlePaddle/PaddleNLP CVPR 2023

Once the subject is embedded in the output domain of the model, the unique identifier can be used to synthesize novel photorealistic images of the subject contextualized in different scenes.
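
The objective behind this is a standard denoising loss on a few instance images captioned with a rare identifier token, plus a class-specific prior-preservation term. A hedged sketch, assuming the conventions of a typical diffusers-style training loop (the unet, noise, latent, and embedding tensors come from that loop; all names are illustrative):

```python
import torch.nn.functional as F

def dreambooth_loss(unet, noisy_latents, timesteps, text_embeds, noise,
                    prior_loss_weight=1.0):
    # The batch stacks instance examples ("a photo of sks dog") on top of
    # class-prior examples ("a photo of a dog") generated by the base model.
    pred = unet(noisy_latents, timesteps, encoder_hidden_states=text_embeds).sample
    pred_inst, pred_prior = pred.chunk(2)
    target_inst, target_prior = noise.chunk(2)
    instance_loss = F.mse_loss(pred_inst.float(), target_inst.float())
    prior_loss = F.mse_loss(pred_prior.float(), target_prior.float())
    # Reconstruct the subject while preserving the prior over its class.
    return instance_loss + prior_loss_weight * prior_loss
```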

An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion

rinongal/textual_inversion 2 Aug 2022

Yet, it is unclear how such freedom can be exercised to generate images of specific unique concepts, modify their appearance, or compose them in new roles and novel scenes.
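
Textual Inversion answers this by optimizing a single new token embedding against the ordinary denoising objective while the rest of the model stays frozen. A rough sketch following the structure of common open-source implementations; tokenizer, text_encoder, dataloader, and diffusion_denoising_loss are assumed to exist in the surrounding training script:

```python
import torch

placeholder = "<my-concept>"                      # new pseudo-word
tokenizer.add_tokens(placeholder)
text_encoder.resize_token_embeddings(len(tokenizer))
token_id = tokenizer.convert_tokens_to_ids(placeholder)

embedding = text_encoder.get_input_embeddings()   # nn.Embedding
orig_weights = embedding.weight.detach().clone()

optimizer = torch.optim.AdamW(embedding.parameters(), lr=5e-4)
for batch in dataloader:                          # prompts contain the placeholder
    loss = diffusion_denoising_loss(batch)        # standard epsilon-prediction MSE
    loss.backward()
    optimizer.step(); optimizer.zero_grad()
    with torch.no_grad():                         # keep every other token frozen
        mask = torch.ones(len(tokenizer), dtype=torch.bool)
        mask[token_id] = False
        embedding.weight[mask] = orig_weights[mask]
```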

IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models

tencent-ailab/ip-adapter 13 Aug 2023

Despite the simplicity of our method, an IP-Adapter with only 22M parameters can achieve performance comparable to, or even better than, a fully fine-tuned image prompt model.
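
Because the adapter is encoder-based, no per-subject tuning is needed at inference. A minimal usage sketch with the diffusers integration of IP-Adapter; the reference-image URL is a placeholder:

```python
import torch
from diffusers import AutoPipelineForText2Image
from diffusers.utils import load_image

pipe = AutoPipelineForText2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipe.load_ip_adapter("h94/IP-Adapter", subfolder="models",
                     weight_name="ip-adapter_sd15.bin")
pipe.set_ip_adapter_scale(0.6)  # balance image prompt vs. text prompt

reference = load_image("https://example.com/subject.png")  # placeholder URL
image = pipe(prompt="a toy robot on a beach",
             ip_adapter_image=reference).images[0]
```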

Personalized Image Generation for Color Vision Deficiency Population

jiangshuyi0v0/cvd-gan ICCV 2023

Approximately 350 million people, around 8% of the population, suffer from color vision deficiency (CVD).

FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention

mit-han-lab/fastcomposer 17 May 2023

FastComposer proposes delayed subject conditioning in the denoising step to maintain both identity and editability in subject-driven image generation.
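
Concretely, the first portion of the denoising trajectory is conditioned on text only (which preserves layout and editability), and the subject-augmented embeddings take over afterwards (which preserves identity). A simplified sketch of the idea, not FastComposer's actual code; the cutover ratio alpha and all names are illustrative:

```python
def denoise_with_delayed_conditioning(scheduler, unet, latents,
                                      text_embeds, subject_embeds, alpha=0.4):
    timesteps = scheduler.timesteps
    switch_at = int(alpha * len(timesteps))
    for i, t in enumerate(timesteps):
        # Text-only conditioning early, subject-augmented conditioning late.
        cond = text_embeds if i < switch_at else subject_embeds
        noise_pred = unet(latents, t, encoder_hidden_states=cond).sample
        latents = scheduler.step(noise_pred, t, latents).prev_sample
    return latents
```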

BLIP-Diffusion: Pre-trained Subject Representation for Controllable Text-to-Image Generation and Editing

salesforce/lavis NeurIPS 2023

Then we design a subject representation learning task which enables a diffusion model to leverage such visual representation and generate new subject renditions.

Subject-Diffusion: Open Domain Personalized Text-to-Image Generation without Test-time Fine-tuning

OPPO-Mente-Lab/Subject-Diffusion 21 Jul 2023

In this paper, we propose Subject-Diffusion, a novel open-domain personalized image generation model that, in addition to not requiring test-time fine-tuning, requires only a single reference image to support personalized single- or multi-subject generation in any domain.

FaceChain: A Playground for Human-centric Artificial Intelligence Generated Content

modelscope/facechain 28 Aug 2023

In this paper, we present FaceChain, a personalized portrait generation framework that combines a series of customized image-generation models with a rich set of face-related perceptual understanding models (e.g., face detection, deep face embedding extraction, and facial attribute recognition) to tackle the aforementioned challenges and generate truthful personalized portraits from only a handful of portrait images.
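
At a high level, such a framework chains perception models around the generator. The outline below is purely illustrative pseudocode; none of the function names correspond to FaceChain's actual API:

```python
def generate_portrait(portrait_images, style_prompt):
    faces = [detect_face(img) for img in portrait_images]     # face detection
    identity = extract_face_embedding(faces)                  # deep face embedding
    attributes = recognize_attributes(faces)                  # e.g. pose, expression
    model = finetune_generator(faces, attributes)             # customized generation model
    candidates = model.generate(style_prompt)
    return rank_by_identity_similarity(candidates, identity)  # keep truthful results
```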

When StyleGAN Meets Stable Diffusion: a $\mathscr{W}_+$ Adapter for Personalized Image Generation

csxmli2016/w-plus-adapter 29 Nov 2023

Text-to-image diffusion models have remarkably excelled in producing diverse, high-quality, and photo-realistic images.

Generative Multimodal Models are In-Context Learners

baaivision/emu CVPR 2024

The human ability to easily solve multimodal tasks in context (i.e., with only a few demonstrations or simple instructions) is what current multimodal systems have largely struggled to imitate.