Search Results for author: Amit H. Bermano

Found 26 papers, 14 papers with code

Breathing Life Into Sketches Using Text-to-Video Priors

no code implementations · 21 Nov 2023 · Rinon Gal, Yael Vinker, Yuval Alaluf, Amit H. Bermano, Daniel Cohen-Or, Ariel Shamir, Gal Chechik

A sketch is one of the most intuitive and versatile tools humans use to convey their ideas visually.

MAS: Multi-view Ancestral Sampling for 3D motion generation using 2D diffusion

no code implementations · 23 Oct 2023 · Roy Kapon, Guy Tevet, Daniel Cohen-Or, Amit H. Bermano

We introduce Multi-view Ancestral Sampling (MAS), a method for 3D motion generation, using 2D diffusion models that were trained on motions obtained from in-the-wild videos.


State of the Art on Diffusion Models for Visual Computing

no code implementations · 11 Oct 2023 · Ryan Po, Wang Yifan, Vladislav Golyanik, Kfir Aberman, Jonathan T. Barron, Amit H. Bermano, Eric Ryan Chan, Tali Dekel, Aleksander Holynski, Angjoo Kanazawa, C. Karen Liu, Lingjie Liu, Ben Mildenhall, Matthias Nießner, Björn Ommer, Christian Theobalt, Peter Wonka, Gordon Wetzstein

The field of visual computing is rapidly advancing due to the emergence of generative artificial intelligence (AI), which unlocks unprecedented capabilities for the generation, editing, and reconstruction of images, videos, and 3D scenes.

OMG-ATTACK: Self-Supervised On-Manifold Generation of Transferable Evasion Attacks

no code implementations · 5 Oct 2023 · Ofir Bar Tal, Adi Haviv, Amit H. Bermano

Evasion Attacks (EA) are used to test the robustness of trained neural networks by distorting input data to mislead the model into incorrect classifications.

Representation Learning

Performance Conditioning for Diffusion-Based Multi-Instrument Music Synthesis

no code implementations · 21 Sep 2023 · Ben Maman, Johannes Zeitler, Meinard Müller, Amit H. Bermano

Building on state-of-the-art diffusion-based music generative models, we introduce performance conditioning - a simple tool that instructs the generative model to synthesize music with the style and timbre of specific instruments taken from specific performances.

FAD, Information Retrieval +2

Domain-Agnostic Tuning-Encoder for Fast Personalization of Text-To-Image Models

no code implementations · 13 Jul 2023 · Moab Arar, Rinon Gal, Yuval Atzmon, Gal Chechik, Daniel Cohen-Or, Ariel Shamir, Amit H. Bermano

Text-to-image (T2I) personalization allows users to guide the creative image generation process by combining their own visual concepts in natural language prompts.

Image Generation

Neural Projection Mapping Using Reflectance Fields

no code implementations · 11 Jun 2023 · Yotam Erel, Daisuke Iwai, Amit H. Bermano

We introduce a high-resolution, spatially adaptive light source, or projector, into a neural reflectance field, which allows us to both calibrate the projector and perform photo-realistic light editing.

Scene Understanding

Human Motion Diffusion as a Generative Prior

2 code implementations · 2 Mar 2023 · Yonatan Shafir, Guy Tevet, Roy Kapon, Amit H. Bermano

We evaluate the composition methods using an off-the-shelf motion diffusion model, and further compare the results to dedicated models trained for these specific tasks.

Denoising, Motion Synthesis

Encoder-based Domain Tuning for Fast Personalization of Text-to-Image Models

no code implementations · 23 Feb 2023 · Rinon Gal, Moab Arar, Yuval Atzmon, Amit H. Bermano, Gal Chechik, Daniel Cohen-Or

Specifically, we employ two components: First, an encoder that takes as input a single image of a target concept from a given domain, e.g., a specific face, and learns to map it to a word embedding representing the concept.

Novel Concepts

Single Motion Diffusion

1 code implementation · 12 Feb 2023 · Sigal Raab, Inbal Leibovitch, Guy Tevet, Moab Arar, Amit H. Bermano, Daniel Cohen-Or

We harness the power of diffusion models and present a denoising network explicitly designed for the task of learning from a single input motion.

Denoising, Style Transfer

OReX: Object Reconstruction from Planar Cross-sections Using Neural Fields

1 code implementation · CVPR 2023 · Haim Sawdayee, Amir Vaxman, Amit H. Bermano

A modest neural network is trained on the input planes to return an inside/outside estimate for a given 3D coordinate, yielding a powerful prior that induces smoothness and self-similarities.

3D Shape Reconstruction, Object Reconstruction
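The snippet above describes the core idea: a small network maps a 3D coordinate to an inside/outside estimate. As a heavily simplified toy illustration (not the paper's actual architecture, data, or training procedure), the sketch below trains a one-hidden-layer numpy MLP to predict occupancy of the unit sphere; all sizes, the learning rate, and the analytic "shape" are illustrative assumptions.

```python
import numpy as np

# Toy stand-in for an inside/outside (occupancy) network: the "shape" is the
# unit sphere, and a single-hidden-layer MLP is trained with manual backprop
# to map a 3D query point to an occupancy probability.
rng = np.random.default_rng(0)

def sample_batch(n=256):
    x = rng.uniform(-1.5, 1.5, size=(n, 3))              # random query points
    y = (np.linalg.norm(x, axis=1) < 1.0).astype(float)  # 1 = inside the sphere
    return x, y

H = 64  # hidden width (illustrative)
W1 = rng.normal(0, 0.5, (3, H)); b1 = np.zeros(H)
W2 = rng.normal(0, 0.5, (H, 1)); b2 = np.zeros(1)

def forward(x):
    h = np.tanh(x @ W1 + b1)
    p = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))  # occupancy probability in (0, 1)
    return h, p.ravel()

lr = 0.5
for step in range(2000):
    x, y = sample_batch()
    h, p = forward(x)
    # gradient of mean binary cross-entropy w.r.t. the logit
    dlogit = (p - y)[:, None] / len(x)
    dW2 = h.T @ dlogit; db2 = dlogit.sum(0)
    dh = (dlogit @ W2.T) * (1.0 - h ** 2)     # backprop through tanh
    dW1 = x.T @ dh; db1 = dh.sum(0)
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

# The origin should score as inside, a far corner point as outside.
_, p_in = forward(np.array([[0.0, 0.0, 0.0]]))
_, p_out = forward(np.array([[1.4, 1.4, 1.4]]))
print(p_in[0], p_out[0])
```

In the actual setting, such a field would be queried densely and a surface extracted at the 0.5 level set; here the point is only that a coordinate-to-occupancy network is a smooth prior over the volume.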

Human Motion Diffusion Model

1 code implementation · 29 Sep 2022 · Guy Tevet, Sigal Raab, Brian Gordon, Yonatan Shafir, Daniel Cohen-Or, Amit H. Bermano

In this paper, we introduce Motion Diffusion Model (MDM), a carefully adapted classifier-free diffusion-based generative model for the human motion domain.

Motion Synthesis

An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion

6 code implementations · 2 Aug 2022 · Rinon Gal, Yuval Alaluf, Yuval Atzmon, Or Patashnik, Amit H. Bermano, Gal Chechik, Daniel Cohen-Or

Yet, it is unclear how such freedom can be exercised to generate images of specific unique concepts, modify their appearance, or compose them in new roles and novel scenes.

Text-to-Image Generation

Unaligned Supervision For Automatic Music Transcription in The Wild

1 code implementation · 28 Apr 2022 · Ben Maman, Amit H. Bermano

In order to overcome data collection barriers, previous AMT approaches attempt to employ musical scores in the form of a digitized version of the same song or piece.

Information Retrieval, Music Information Retrieval +2

MotionCLIP: Exposing Human Motion Generation to CLIP Space

1 code implementation · 15 Mar 2022 · Guy Tevet, Brian Gordon, Amir Hertz, Amit H. Bermano, Daniel Cohen-Or

MotionCLIP gains its unique power by aligning its latent space with that of the Contrastive Language-Image Pre-training (CLIP) model.

Disentanglement, Motion Interpolation
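The snippet above says the latent space is aligned with CLIP's. A minimal way to picture such alignment is a cosine-similarity loss pulling a motion latent toward the CLIP embedding of the matching text. The toy sketch below uses random stand-in vectors (no real CLIP model or motion encoder is involved) and a hand-written update purely for illustration.

```python
import numpy as np

# Toy alignment sketch: treat 1 - cosine similarity as the alignment loss
# between a motion-encoder latent and a CLIP text embedding. Both vectors
# here are random stand-ins; dimensions are illustrative.
rng = np.random.default_rng(1)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

clip_text = rng.normal(size=512)      # stand-in for a CLIP text embedding
motion_latent = rng.normal(size=512)  # stand-in for a motion-encoder output

loss_before = 1.0 - cosine(motion_latent, clip_text)

# One illustrative update: blend the latent toward its text target, as a
# gradient step on the cosine loss would do directionally.
motion_latent = 0.5 * motion_latent + 0.5 * clip_text
loss_after = 1.0 - cosine(motion_latent, clip_text)

print(loss_before, loss_after)
```

In training, this kind of loss is applied across a dataset of (motion, text) pairs, so that semantically related motions land near the corresponding CLIP text embeddings.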

State-of-the-Art in the Architecture, Methods and Applications of StyleGAN

no code implementations · 28 Feb 2022 · Amit H. Bermano, Rinon Gal, Yuval Alaluf, Ron Mokady, Yotam Nitzan, Omer Tov, Or Patashnik, Daniel Cohen-Or

Of these, StyleGAN offers a fascinating case study, owing to its remarkable visual quality and an ability to support a large array of downstream tasks.

Image Generation

Self-Conditioned Generative Adversarial Networks for Image Editing

1 code implementation · 8 Feb 2022 · Yunzhe Liu, Rinon Gal, Amit H. Bermano, Baoquan Chen, Daniel Cohen-Or

We compare our models to a wide range of latent editing methods, and show that by alleviating the bias they achieve finer semantic control and better identity preservation through a wider range of transformations.


Stitch it in Time: GAN-Based Facial Editing of Real Videos

1 code implementation · 20 Jan 2022 · Rotem Tzaban, Ron Mokady, Rinon Gal, Amit H. Bermano, Daniel Cohen-Or

The ability of Generative Adversarial Networks to encode rich semantics within their latent space has been widely adopted for facial image editing.

Facial Editing

Leveraging in-domain supervision for unsupervised image-to-image translation tasks via multi-stream generators

no code implementations · 30 Dec 2021 · Dvir Yerushalmi, Dov Danon, Amit H. Bermano

In addition, we propose training a semantic segmentation network alongside the translation task and leveraging its output as a loss term that improves robustness.

Segmentation, Semantic Segmentation +2

ClipCap: CLIP Prefix for Image Captioning

4 code implementations · 18 Nov 2021 · Ron Mokady, Amir Hertz, Amit H. Bermano

Image captioning is a fundamental task in vision-language understanding, where the model predicts an informative textual caption for a given input image.

Image Captioning, Language Modelling
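The "CLIP prefix" in the title refers to mapping a CLIP image embedding into a short sequence of vectors in a language model's embedding space, which is prepended to the caption tokens. The sketch below shows only the shape mechanics of that idea with random stand-in weights; all dimensions are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

# Toy shape-level sketch of a CLIP-prefix pipeline: a CLIP image embedding
# is mapped by a linear network to prefix_len vectors in the language
# model's embedding space, then prepended to caption token embeddings.
rng = np.random.default_rng(0)

clip_dim, lm_dim, prefix_len = 512, 768, 10  # illustrative sizes

image_emb = rng.normal(size=clip_dim)        # stand-in CLIP image embedding

# Mapping network (here just a random linear map): clip_dim -> prefix_len * lm_dim
W = rng.normal(0, 0.02, size=(clip_dim, prefix_len * lm_dim))
prefix = (image_emb @ W).reshape(prefix_len, lm_dim)

# Stand-in caption token embeddings (e.g., from the LM's embedding table).
caption_tokens = rng.normal(size=(7, lm_dim))

# The language model would then attend over [prefix; caption tokens].
lm_input = np.concatenate([prefix, caption_tokens], axis=0)
print(lm_input.shape)  # (17, 768)
```

In practice the mapping network is trained so the language model, conditioned on the prefix, reproduces the ground-truth caption; only the shapes are shown here.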

JOKR: Joint Keypoint Representation for Unsupervised Cross-Domain Motion Retargeting

1 code implementation · 17 Jun 2021 · Ron Mokady, Rotem Tzaban, Sagie Benaim, Amit H. Bermano, Daniel Cohen-Or

To alleviate this problem, we introduce JOKR - a JOint Keypoint Representation that captures the motion common to both the source and target videos, without requiring any object prior or data collection.

Disentanglement, Motion Retargeting

Pivotal Tuning for Latent-based Editing of Real Images

3 code implementations · 10 Jun 2021 · Daniel Roich, Ron Mokady, Amit H. Bermano, Daniel Cohen-Or

The key idea is pivotal tuning - a brief training process that preserves the editing quality of an in-domain latent region, while changing its portrayed identity and appearance.

Facial Editing, Image Manipulation
