no code implementations • 27 Feb 2025 • Edo Kadosh, Nir Goren, Or Patashnik, Daniel Garibi, Daniel Cohen-Or
This process has an inherent tradeoff between reconstruction and editability, limiting the editing of challenging images such as highly detailed ones.
no code implementations • 20 Feb 2025 • Rameen Abdal, Or Patashnik, Ivan Skorokhodov, Willi Menapace, Aliaksandr Siarohin, Sergey Tulyakov, Daniel Cohen-Or, Kfir Aberman
In this paper, we introduce Set-and-Sequence, a novel framework for personalizing Diffusion Transformer (DiT)-based generative video models with dynamic concepts.
no code implementations • 2 Jan 2025 • Or Patashnik, Rinon Gal, Daniil Ostashev, Sergey Tulyakov, Kfir Aberman, Daniel Cohen-Or
In this work, we introduce Nested Attention, a novel mechanism that injects a rich and expressive image representation into the model's existing cross-attention layers.
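A minimal sketch of the general idea, using a hypothetical module in which image-derived keys and values are simply appended to the text keys and values of an existing cross-attention layer (the paper's nested mechanism is more involved):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class InjectedCrossAttention(nn.Module):
    """Hypothetical sketch: image-derived keys/values are appended to the
    text keys/values so identity features flow through the existing
    cross-attention layer; the actual nested mechanism differs in detail."""

    def __init__(self, dim: int, image_dim: int):
        super().__init__()
        self.to_q = nn.Linear(dim, dim, bias=False)
        self.to_k_text = nn.Linear(dim, dim, bias=False)
        self.to_v_text = nn.Linear(dim, dim, bias=False)
        # Extra projections for the injected image representation.
        self.to_k_img = nn.Linear(image_dim, dim, bias=False)
        self.to_v_img = nn.Linear(image_dim, dim, bias=False)

    def forward(self, x, text_emb, img_emb):
        q = self.to_q(x)                                  # (B, N, D)
        k = torch.cat([self.to_k_text(text_emb),
                       self.to_k_img(img_emb)], dim=1)    # (B, L+M, D)
        v = torch.cat([self.to_v_text(text_emb),
                       self.to_v_img(img_emb)], dim=1)
        return F.scaled_dot_product_attention(q, k, v)
```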
no code implementations • 2 Jan 2025 • Gaurav Parmar, Or Patashnik, Kuan-Chieh Wang, Daniil Ostashev, Srinivasa Narasimhan, Jun-Yan Zhu, Daniel Cohen-Or, Kfir Aberman
A key challenge in this task is to preserve the identity of the objects depicted in the input visual prompts, while also generating diverse compositions across different images.
no code implementations • 12 Dec 2024 • Guocheng Qian, Kuan-Chieh Wang, Or Patashnik, Negin Heravi, Daniil Ostashev, Sergey Tulyakov, Daniel Cohen-Or, Kfir Aberman
Our approach uses a few-to-many identity reconstruction training paradigm, where a limited set of input images is used to reconstruct multiple target images of the same individual in various poses and expressions.
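A toy illustration of the few-to-many setup; every module here is a hypothetical stand-in, and only the K-inputs-to-N-targets structure reflects the described paradigm:

```python
import torch
import torch.nn as nn

class ToyIDModel(nn.Module):
    """Stand-in for an identity-conditioned generator (hypothetical)."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.encoder = nn.Linear(dim, dim)
        self.decoder = nn.Linear(dim, dim)

    def forward(self, inputs, num_targets):
        # Pool the few inputs into one identity code, then decode one
        # prediction per target image.
        identity = self.encoder(inputs).mean(dim=0, keepdim=True)
        return self.decoder(identity.expand(num_targets, -1))

# Few-to-many: K=3 input views must reconstruct N=8 targets of the
# same person in other poses and expressions.
model = ToyIDModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
inputs, targets = torch.randn(3, 64), torch.randn(8, 64)
loss = nn.functional.mse_loss(model(inputs, len(targets)), targets)
opt.zero_grad(); loss.backward(); opt.step()
```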
no code implementations • 3 Dec 2024 • Yiftach Edelstein, Or Patashnik, Dana Cohen-Bar, Lihi Zelnik-Manor
In this work, we bridge the quality gap between methods that directly generate 3D representations and ones that reconstruct 3D objects from multi-view images.
1 code implementation • 21 Nov 2024 • Omri Avrahami, Or Patashnik, Ohad Fried, Egor Nemchinov, Kfir Aberman, Dani Lischinski, Daniel Cohen-Or
The main challenge is that, unlike UNet-based models, DiT lacks a coarse-to-fine synthesis structure, making it unclear in which layers to perform the injection.
no code implementations • 23 Sep 2024 • Yehonathan Litman, Or Patashnik, Kangle Deng, Aviral Agrawal, Rushikesh Zawar, Fernando de la Torre, Shubham Tulsiani
This model is trained on albedo, material, and relit image data derived from BlenderVault, a curated dataset of approximately 12K artist-designed synthetic Blender objects.
no code implementations • 4 Apr 2024 • Rinon Gal, Or Lichter, Elad Richardson, Or Patashnik, Amit H. Bermano, Gal Chechik, Daniel Cohen-Or
In this work, we explore the potential of using such shortcut mechanisms to guide the personalization of text-to-image models to specific facial identities.
no code implementations • 25 Mar 2024 • Omer Dahary, Or Patashnik, Kfir Aberman, Daniel Cohen-Or
Text-to-image diffusion models have an unprecedented ability to generate diverse and high-quality images.
1 code implementation • 21 Mar 2024 • Daniel Garibi, Or Patashnik, Andrey Voynov, Hadar Averbuch-Elor, Daniel Cohen-Or
However, applying these methods to real images necessitates the inversion of the images into the domain of the pretrained diffusion model.
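A common baseline for this inversion is deterministic DDIM inversion, which runs the sampler in reverse to map a real image to a latent the model can reconstruct; a minimal sketch, with `eps_model` as an assumed noise-prediction callable:

```python
import torch

@torch.no_grad()
def ddim_invert(x0, eps_model, alphas_cumprod, num_steps=50):
    """Deterministic DDIM inversion sketch: re-noise a real image x0
    step by step so that sampling from the result reproduces it.
    `eps_model(x, t)` is a placeholder noise predictor."""
    x = x0
    timesteps = torch.linspace(0, len(alphas_cumprod) - 1, num_steps).long()
    for t_cur, t_next in zip(timesteps[:-1], timesteps[1:]):
        a_cur, a_next = alphas_cumprod[t_cur], alphas_cumprod[t_next]
        eps = eps_model(x, t_cur)
        # Predict the clean image, then re-noise it to the next level.
        x0_pred = (x - (1 - a_cur).sqrt() * eps) / a_cur.sqrt()
        x = a_next.sqrt() * x0_pred + (1 - a_next).sqrt() * eps
    return x  # approximate x_T; editing starts sampling from here
```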
no code implementations • 22 Feb 2024 • Or Patashnik, Rinon Gal, Daniel Cohen-Or, Jun-Yan Zhu, Fernando de la Torre
In this work, we focus on spatial control-based geometric manipulations and introduce a method to consolidate the editing process across various views.
no code implementations • CVPR 2024 • Mehdi Safaee, Aryan Mikaeili, Or Patashnik, Daniel Cohen-Or, Ali Mahdavi-Amiri
This paper addresses the challenge of learning a local visual pattern of an object from one image, and generating images depicting objects with that pattern.
no code implementations • 6 Nov 2023 • Yuval Alaluf, Daniel Garibi, Or Patashnik, Hadar Averbuch-Elor, Daniel Cohen-Or
Recent advancements in text-to-image generative models have demonstrated a remarkable ability to capture a deep semantic understanding of images.
no code implementations • 26 Oct 2023 • Oren Katzir, Or Patashnik, Daniel Cohen-Or, Dani Lischinski
Score Distillation Sampling (SDS) has emerged as the de facto approach for text-to-content generation in non-image domains.
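For reference, the standard SDS gradient introduced in DreamFusion (Poole et al., 2022), where $g(\theta)$ renders the content and $\hat{\epsilon}_\phi$ is the pretrained denoiser:

```latex
\nabla_\theta \mathcal{L}_{\mathrm{SDS}}(\theta)
  = \mathbb{E}_{t,\epsilon}\!\left[
      w(t)\,\bigl(\hat{\epsilon}_\phi(x_t;\, y,\, t) - \epsilon\bigr)
      \tfrac{\partial x}{\partial \theta}
    \right],
\qquad x = g(\theta),\quad
x_t = \sqrt{\bar{\alpha}_t}\,x + \sqrt{1-\bar{\alpha}_t}\,\epsilon .
```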
1 code implementation • ICCV 2023 • Or Patashnik, Daniel Garibi, Idan Azuri, Hadar Averbuch-Elor, Daniel Cohen-Or
In this paper, we present a technique to generate a collection of images that depicts variations in the shape of a specific object, enabling an object-level shape exploration process.
2 code implementations • CVPR 2023 • Gal Metzer, Elad Richardson, Or Patashnik, Raja Giryes, Daniel Cohen-Or
This unique combination of text and shape guidance allows for increased control over the generation process.
Ranked #3 on Text to 3D on T$^3$Bench
8 code implementations • 2 Aug 2022 • Rinon Gal, Yuval Alaluf, Yuval Atzmon, Or Patashnik, Amit H. Bermano, Gal Chechik, Daniel Cohen-Or
Yet, it is unclear how such freedom can be exercised to generate images of specific unique concepts, modify their appearance, or compose them in new roles and novel scenes.
Ranked #7 on Personalized Image Generation on DreamBooth
no code implementations • 28 Feb 2022 • Amit H. Bermano, Rinon Gal, Yuval Alaluf, Ron Mokady, Yotam Nitzan, Omer Tov, Or Patashnik, Daniel Cohen-Or
Of these, StyleGAN offers a fascinating case study, owing to its remarkable visual quality and an ability to support a large array of downstream tasks.
no code implementations • 6 Feb 2022 • Xianxu Hou, Linlin Shen, Or Patashnik, Daniel Cohen-Or, Hui Huang
In this paper, we build on the StyleGAN generator, and present a method that explicitly encourages face manipulation to focus on the intended regions by incorporating learned attention maps.
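A one-line sketch of the underlying idea, attention-gated feature blending (names are placeholders; the paper's architecture is more elaborate):

```python
import torch

def attention_blend(feat_orig: torch.Tensor,
                    feat_edit: torch.Tensor,
                    attn_map: torch.Tensor) -> torch.Tensor:
    """Confine an edit to the intended region: a learned attention map in
    [0, 1] gates where edited features replace the original ones."""
    return attn_map * feat_edit + (1 - attn_map) * feat_orig
```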
1 code implementation • 31 Jan 2022 • Yuval Alaluf, Or Patashnik, Zongze Wu, Asif Zamir, Eli Shechtman, Dani Lischinski, Daniel Cohen-Or
In particular, we demonstrate that while StyleGAN3 can be trained on unaligned data, one can still use aligned data for training, without hindering the ability to generate unaligned imagery.
3 code implementations • 2 Aug 2021 • Rinon Gal, Or Patashnik, Haggai Maron, Gal Chechik, Daniel Cohen-Or
Can a generative model be trained to produce images from a specific domain, guided by a text prompt only, without seeing any image?
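The mechanism behind this zero-shot domain adaptation is a CLIP-space directional loss; a minimal sketch, assuming a `clip` object exposing `encode_image` / `encode_text`:

```python
import torch.nn.functional as F

def clip_directional_loss(clip, img_frozen, img_trained, text_src, text_tgt):
    """The image-space direction between the frozen and trained generators'
    outputs should align with the text-space direction between the source
    and target domain prompts (sketch; encoders are assumed)."""
    d_img = clip.encode_image(img_trained) - clip.encode_image(img_frozen)
    d_txt = clip.encode_text(text_tgt) - clip.encode_text(text_src)
    return 1 - F.cosine_similarity(d_img, d_txt, dim=-1).mean()
```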
1 code implementation • 15 Jul 2021 • Omer Kafri, Or Patashnik, Yuval Alaluf, Daniel Cohen-Or
Inserting the resulting style code into a pre-trained StyleGAN generator results in a single harmonized image in which each semantic region is controlled by one of the input latent codes.
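A toy sketch of fusing several latent codes into one style code via learned per-layer mixing weights; this is a hypothetical simplification, whereas the paper learns a fusion network that assigns each semantic region to one input code:

```python
import torch
import torch.nn as nn

class LatentFuser(nn.Module):
    """Toy fusion of K latent codes into a single StyleGAN-style code
    using a learned per-layer mixture (hypothetical simplification)."""
    def __init__(self, num_inputs: int, num_layers: int = 18):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(num_layers, num_inputs))

    def forward(self, codes):                   # codes: (K, num_layers, 512)
        weights = self.logits.softmax(dim=-1)   # (num_layers, K)
        return torch.einsum('lk,kld->ld', weights, codes)
```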
2 code implementations • ICCV 2021 • Yuval Alaluf, Or Patashnik, Daniel Cohen-Or
Instead of directly predicting the latent code of a given real image using a single pass, the encoder is tasked with predicting a residual with respect to the current estimate of the inverted latent code in a self-correcting manner.
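A minimal sketch of this iterative, self-correcting inversion loop, with `encoder` and `generator` as assumed callables:

```python
import torch

@torch.no_grad()
def iterative_invert(encoder, generator, image, w_avg, num_iters=5):
    """Start from the average latent, then repeatedly feed the encoder the
    target image alongside the current reconstruction so it predicts a
    residual latent update (assumed interfaces; sketch only)."""
    w = w_avg.clone()
    recon = generator(w)
    for _ in range(num_iters):
        delta = encoder(torch.cat([image, recon], dim=1))  # channel concat
        w = w + delta                                      # residual update
        recon = generator(w)
    return w, recon
```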
5 code implementations • ICCV 2021 • Or Patashnik, Zongze Wu, Eli Shechtman, Daniel Cohen-Or, Dani Lischinski
Inspired by the ability of StyleGAN to generate highly realistic images in a variety of domains, much recent work has focused on understanding how to use the latent spaces of StyleGAN to manipulate generated and real images.
8 code implementations • 4 Feb 2021 • Omer Tov, Yuval Alaluf, Yotam Nitzan, Or Patashnik, Daniel Cohen-Or
We then suggest two principles for designing encoders in a manner that allows one to control the proximity of the inversions to regions that StyleGAN was originally trained on.
2 code implementations • 4 Feb 2021 • Yuval Alaluf, Or Patashnik, Daniel Cohen-Or
In this formulation, our method approaches the continuous aging process as a regression task between the input age and desired target age, providing fine-grained control over the generated image.
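A sketch of the regression-style objective, assuming a pretrained `age_predictor` that returns ages in years:

```python
import torch.nn.functional as F

def aging_loss(age_predictor, generated, target_age):
    """Penalize the squared error between the age a pretrained estimator
    assigns to the generated face and the requested target age
    (assumed predictor; a sketch of the regression formulation)."""
    return F.mse_loss(age_predictor(generated), target_age)
```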
1 code implementation • 5 Oct 2020 • Or Patashnik, Dov Danon, Hao Zhang, Daniel Cohen-Or
State-of-the-art image-to-image translation methods tend to struggle in an imbalanced domain setting, where one image domain lacks richness and diversity.
10 code implementations • CVPR 2021 • Elad Richardson, Yuval Alaluf, Or Patashnik, Yotam Nitzan, Yaniv Azar, Stav Shapiro, Daniel Cohen-Or
We present a generic image-to-image translation framework, pixel2style2pixel (pSp).