Search Results for author: Hadar Averbuch-Elor

Found 33 papers, 20 papers with code

ReNoise: Real Image Inversion Through Iterative Noising

no code implementations 21 Mar 2024 Daniel Garibi, Or Patashnik, Andrey Voynov, Hadar Averbuch-Elor, Daniel Cohen-Or

However, applying these methods to real images necessitates the inversion of the images into the domain of the pretrained diffusion model.

Denoising, Image Manipulation

ICC: Quantifying Image Caption Concreteness for Multimodal Dataset Curation

no code implementations 2 Mar 2024 Moran Yanuka, Morris Alper, Hadar Averbuch-Elor, Raja Giryes

Web-scale training on paired text-image data is becoming increasingly central to multimodal learning, but is challenged by the highly noisy nature of datasets in the wild.

Sentence

Mitigating Open-Vocabulary Caption Hallucinations

1 code implementation 6 Dec 2023 Assaf Ben-Kish, Moran Yanuka, Morris Alper, Raja Giryes, Hadar Averbuch-Elor

While recent years have seen rapid progress in image-conditioned text generation, image captioning still suffers from the fundamental issue of hallucinations, namely, the generation of spurious details that cannot be inferred from the given image.

Hallucination, Image Captioning +2

SPiC-E: Structural Priors in 3D Diffusion Models using Cross-Entity Attention

no code implementations 29 Nov 2023 Etai Sella, Gal Fiebelman, Noam Atia, Hadar Averbuch-Elor

We are witnessing rapid progress in automatically generating and manipulating 3D assets due to the availability of pretrained text-image diffusion models.

Denoising

Cross-Image Attention for Zero-Shot Appearance Transfer

no code implementations 6 Nov 2023 Yuval Alaluf, Daniel Garibi, Or Patashnik, Hadar Averbuch-Elor, Daniel Cohen-Or

Recent advancements in text-to-image generative models have demonstrated a remarkable ability to capture a deep semantic understanding of images.

Denoising

Kiki or Bouba? Sound Symbolism in Vision-and-Language Models

no code implementations NeurIPS 2023 Morris Alper, Hadar Averbuch-Elor

Although the mapping between sound and meaning in human language is assumed to be largely arbitrary, research in cognitive science has shown that there are non-trivial correlations between particular sounds and meanings across languages and demographic groups, a phenomenon known as sound symbolism.

Knowledge Probing

A Joint Study of Phrase Grounding and Task Performance in Vision and Language Models

1 code implementation 6 Sep 2023 Noriyuki Kojima, Hadar Averbuch-Elor, Yoav Artzi

Key to tasks that require reasoning about natural language in visual contexts is grounding words and phrases to image regions.

Phrase Grounding

Doppelgangers: Learning to Disambiguate Images of Similar Structures

1 code implementation ICCV 2023 Ruojin Cai, Joseph Tung, Qianqian Wang, Hadar Averbuch-Elor, Bharath Hariharan, Noah Snavely

Our evaluation shows that our method can distinguish illusory matches in difficult cases, and can be integrated into SfM pipelines to produce correct, disambiguated 3D reconstructions.

3D Reconstruction, Binary Classification

Neural Scene Chronology

1 code implementation CVPR 2023 Haotong Lin, Qianqian Wang, Ruojin Cai, Sida Peng, Hadar Averbuch-Elor, Xiaowei Zhou, Noah Snavely

Specifically, we represent the scene as a space-time radiance field with a per-image illumination embedding, where temporally-varying scene changes are encoded using a set of learned step functions.

Learning Human-Human Interactions in Images from Weak Textual Supervision

no code implementations ICCV 2023 Morris Alper, Hadar Averbuch-Elor

We show that the pseudo-labels produced by this procedure can be used to train a captioning model to effectively understand human-human interactions in images, as measured by a variety of metrics that measure textual and semantic faithfulness and factual groundedness of our predictions.

Image Captioning, Knowledge Distillation +2

Is BERT Blind? Exploring the Effect of Vision-and-Language Pretraining on Visual Language Understanding

1 code implementation CVPR 2023 Morris Alper, Michael Fiman, Hadar Averbuch-Elor

We show that SOTA multimodally trained text encoders outperform unimodally trained text encoders on the VLU tasks while being underperformed by them on the NLU tasks, lending new context to previously mixed results regarding the NLU capabilities of multimodal models.

Knowledge Probing, Language Modelling +2

Vox-E: Text-guided Voxel Editing of 3D Objects

1 code implementation ICCV 2023 Etai Sella, Gal Fiebelman, Peter Hedman, Hadar Averbuch-Elor

Our method takes oriented 2D images of a 3D object as input and learns a grid-based volumetric representation of it.

3D Object Editing, Text to 3D

Localizing Object-level Shape Variations with Text-to-Image Diffusion Models

1 code implementation ICCV 2023 Or Patashnik, Daniel Garibi, Idan Azuri, Hadar Averbuch-Elor, Daniel Cohen-Or

In this paper, we present a technique to generate a collection of images that depicts variations in the shape of a specific object, enabling an object-level shape exploration process.

Denoising, Object +1

Neural 3D Reconstruction in the Wild

1 code implementation 25 May 2022 Jiaming Sun, Xi Chen, Qianqian Wang, Zhengqi Li, Hadar Averbuch-Elor, Xiaowei Zhou, Noah Snavely

We are witnessing an explosion of neural implicit representations in computer vision and graphics.

3D Reconstruction, Surface Reconstruction

To SMOTE, or not to SMOTE?

1 code implementation 21 Jan 2022 Yotam Elor, Hadar Averbuch-Elor

Balancing the data before training a classifier is a popular technique to address the challenges of imbalanced binary classification in tabular data.

Binary Classification

Who's Waldo? Linking People Across Text and Images

1 code implementation ICCV 2021 Claire Yuqing Cui, Apoorv Khandelwal, Yoav Artzi, Noah Snavely, Hadar Averbuch-Elor

We present a task and benchmark dataset for person-centric visual grounding, the problem of linking between people named in a caption and people pictured in an image.

Ranked #1 on Person-centric Visual Grounding on Who’s Waldo (using extra training data)

Person-centric Visual Grounding

Towers of Babel: Combining Images, Language, and 3D Geometry for Learning Multimodal Vision

1 code implementation ICCV 2021 Xiaoshi Wu, Hadar Averbuch-Elor, Jin Sun, Noah Snavely

The abundance and richness of Internet photos of landmarks and cities has led to significant progress in 3D vision over the past two decades, including automated 3D reconstructions of the world's landmarks from tourist photos.

Descriptive Image Captioning +1

Extreme Rotation Estimation using Dense Correlation Volumes

1 code implementation CVPR 2021 Ruojin Cai, Bharath Hariharan, Noah Snavely, Hadar Averbuch-Elor

We present a technique for estimating the relative 3D rotation of an RGB image pair in an extreme setting, where the images have little or no overlap.

Feature Correlation

Learning Multimodal Affinities for Textual Editing in Images

no code implementations 18 Mar 2021 Or Perel, Oron Anschel, Omri Ben-Eliezer, Shai Mazor, Hadar Averbuch-Elor

Nowadays, as cameras are rapidly adopted in our daily routine, images of documents are becoming both abundant and prevalent.

An Ethical Highlighter for People-Centric Dataset Creation

no code implementations 27 Nov 2020 Margot Hanley, Apoorv Khandelwal, Hadar Averbuch-Elor, Noah Snavely, Helen Nissenbaum

Important ethical concerns arising from computer vision datasets of people have been receiving significant attention, and a number of datasets have been withdrawn as a result.

Hidden Footprints: Learning Contextual Walkability from 3D Human Trails

no code implementations ECCV 2020 Jin Sun, Hadar Averbuch-Elor, Qianqian Wang, Noah Snavely

Predicting where people can walk in a scene is important for many tasks, including autonomous driving systems and human behavior analysis.

Autonomous Driving valid

Co-occurrence Based Texture Synthesis

1 code implementation 17 May 2020 Anna Darzi, Itai Lang, Ashutosh Taklikar, Hadar Averbuch-Elor, Shai Avidan

As image generation techniques mature, there is a growing interest in explainable representations that are easy to understand and intuitive to manipulate.

Generative Adversarial Network, Image Generation +3

ScrabbleGAN: Semi-Supervised Varying Length Handwritten Text Generation

3 code implementations CVPR 2020 Sharon Fogel, Hadar Averbuch-Elor, Sarel Cohen, Shai Mazor, Roee Litman

This is especially true for handwritten text recognition (HTR), where each author has a unique style, unlike printed text, where the variation is smaller by design.

Domain Adaptation, Handwriting generation +5

READ: Recursive Autoencoders for Document Layout Generation

no code implementations 1 Sep 2019 Akshay Gadi Patil, Omri Ben-Eliezer, Or Perel, Hadar Averbuch-Elor

Creating large varieties of plausible document layouts can be a tedious task, requiring numerous constraints to be satisfied, including local ones relating different semantic elements and global constraints on the general appearance and spacing.

Implicit Pairs for Boosting Unpaired Image-to-Image Translation

no code implementations 15 Apr 2019 Yiftach Ginger, Dov Danon, Hadar Averbuch-Elor, Daniel Cohen-Or

As a result, in recent years more attention has been given to techniques that learn the mapping from unpaired sets.

Image-to-Image Translation, Translation

Clustering-driven Deep Embedding with Pairwise Constraints

1 code implementation 22 Mar 2018 Sharon Fogel, Hadar Averbuch-Elor, Jacov Goldberger, Daniel Cohen-Or

In this paper, we depart from centroid-based models and suggest a new framework, called Clustering-driven deep embedding with PAirwise Constraints (CPAC), for non-parametric clustering using a neural network.

Clustering

Co-segmentation for Space-Time Co-located Collections

no code implementations 31 Jan 2017 Hadar Averbuch-Elor, Johannes Kopf, Tamir Hazan, Daniel Cohen-Or

Thus, to disambiguate what the common foreground object is, we introduce a weakly-supervised technique, where we assume only a small seed, given in the form of a single segmented image.

Object Segmentation

Border-Peeling Clustering

1 code implementation 14 Dec 2016 Hadar Averbuch-Elor, Nadav Bar, Daniel Cohen-Or

In this paper, we present a novel non-parametric clustering technique.

Clustering

Spherical Embedding of Inlier Silhouette Dissimilarities

no code implementations CVPR 2015 Etai Littwin, Hadar Averbuch-Elor, Daniel Cohen-Or

In this paper, we introduce a spherical embedding technique to position a given set of silhouettes of an object as observed from a set of cameras arbitrarily positioned around the object.

Position
