Search Results for author: Assaf Arbelle

Found 24 papers, 15 papers with code

Teaching VLMs to Localize Specific Objects from In-context Examples

1 code implementation • 20 Nov 2024 • Sivan Doveh, Nimrod Shabtay, Wei Lin, Eli Schwartz, Hilde Kuehne, Raja Giryes, Rogerio Feris, Leonid Karlinsky, James Glass, Assaf Arbelle, Shimon Ullman, M. Jehanzeb Mirza

In this work, we focus on the task of few-shot personalized localization, where a model is given a small set of annotated images (in-context examples) -- each with a category label and bounding box -- and is tasked with localizing the same object type in a query image.

Object • Question Answering • +3
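
A minimal sketch of how such in-context examples might be serialized into a VLM prompt. The message format, image-token placeholders, and helper names below are illustrative assumptions, not the paper's actual interface:

```python
# Hypothetical sketch: serializing annotated in-context examples into a
# text prompt for a VLM. Format and helpers are illustrative only.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Example:
    image_ref: str                  # placeholder for an image token/handle
    label: str                      # category label, e.g. "striped mug"
    box: Tuple[int, int, int, int]  # (x0, y0, x1, y1) in pixels

def build_prompt(shots: List[Example], query_image_ref: str) -> str:
    """Interleave annotated examples, then ask for a box on the query."""
    parts = []
    for ex in shots:
        parts.append(
            f"<image:{ex.image_ref}> The {ex.label} is at "
            f"[{ex.box[0]}, {ex.box[1]}, {ex.box[2]}, {ex.box[3]}]."
        )
    parts.append(f"<image:{query_image_ref}> Where is the {shots[0].label}? "
                 "Answer with a bounding box [x0, y0, x1, y1].")
    return "\n".join(parts)

shots = [Example("img_0", "striped mug", (34, 50, 120, 160)),
         Example("img_1", "striped mug", (200, 80, 310, 210))]
print(build_prompt(shots, "query_img"))
```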

LiveXiv -- A Multi-Modal Live Benchmark Based on Arxiv Papers Content

1 code implementation • 14 Oct 2024 • Nimrod Shabtay, Felipe Maia Polo, Sivan Doveh, Wei Lin, M. Jehanzeb Mirza, Leshem Choshen, Mikhail Yurochkin, Yuekai Sun, Assaf Arbelle, Leonid Karlinsky, Raja Giryes

Moreover, we introduce an efficient evaluation approach that estimates the performance of all models on the evolving benchmark using evaluations of only a subset of models.

Visual Question Answering (VQA) • World Knowledge
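
The snippet below is a toy illustration of this idea, not the paper's estimator: a handful of "anchor" models are re-evaluated on the new benchmark version, a simple linear map from old scores to new scores is fit on those anchors, and every other model's new score is estimated without re-running it:

```python
# Toy illustration (not the paper's method): estimate all models' scores
# on a new benchmark version from a few re-evaluated anchor models.
import numpy as np

rng = np.random.default_rng(0)
old_scores = rng.uniform(0.3, 0.8, size=20)                 # all 20 models, old version
true_new = 0.9 * old_scores - 0.05 + rng.normal(0, 0.01, 20)  # unknown ground truth
anchors = np.arange(5)                                       # only these are re-run

# Fit new = a * old + b on the anchor models only.
a, b = np.polyfit(old_scores[anchors], true_new[anchors], deg=1)
estimated_new = a * old_scores + b

print("mean abs. estimation error:",
      np.abs(estimated_new - true_new).mean())
```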

Augmenting In-Context-Learning in LLMs via Automatic Data Labeling and Refinement

no code implementations • 14 Oct 2024 • Joseph Shtok, Amit Alfassy, Foad Abo Dahood, Eliyahu Schwartz, Sivan Doveh, Assaf Arbelle

In this work, we propose Automatic Data Labeling and Refinement (ADLR), a method to automatically generate and filter demonstrations which include the above intermediate steps, starting from a small seed of manually crafted examples.

In-Context Learning • Mathematical Reasoning
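
A minimal sketch of a generate-then-filter loop in this spirit; `generate_candidates` and `is_correct` are hypothetical placeholders for an LLM call and an automatic correctness check, not the paper's implementation:

```python
# Sketch of a generate-then-filter demonstration-growing loop.
# The two callables are hypothetical stand-ins, not ADLR's code.
from typing import Callable, List

def grow_demonstrations(seed: List[str],
                        generate_candidates: Callable[[List[str]], List[str]],
                        is_correct: Callable[[str], bool],
                        rounds: int = 3) -> List[str]:
    """Start from manually crafted demos; repeatedly generate new ones
    conditioned on the current pool and keep only those passing the filter."""
    pool = list(seed)
    for _ in range(rounds):
        candidates = generate_candidates(pool)
        pool.extend(c for c in candidates if is_correct(c) and c not in pool)
    return pool

# Trivial usage with toy stand-ins for the LLM and the checker:
demos = grow_demonstrations(
    seed=["Q: 2+2? Steps: 2+2=4. A: 4"],
    generate_candidates=lambda pool: [pool[-1].replace("2+2", "3+3").replace("4", "6")],
    is_correct=lambda d: "A:" in d,
)
print(demos)
```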

Multimodal Task Vectors Enable Many-Shot Multimodal In-Context Learning

1 code implementation • 21 Jun 2024 • Brandon Huang, Chancharik Mitra, Assaf Arbelle, Leonid Karlinsky, Trevor Darrell, Roei Herzig

The recent success of interleaved Large Multimodal Models (LMMs) in few-shot learning suggests that in-context learning (ICL) with many examples can be promising for learning new tasks.

Few-Shot Learning • In-Context Learning
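
The sketch below illustrates the general activation-patching idea behind task vectors under toy assumptions (random features, a single encoder layer); it is not the paper's procedure: average a layer's hidden activations over many-shot prompts into a compact vector, then inject that vector when running a bare query:

```python
# Toy activation-patching sketch of the task-vector idea.
import torch
import torch.nn as nn

layer = nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)

# 1) Collect activations from many-shot prompts (random stand-ins here).
with torch.no_grad():
    many_shot = torch.randn(32, 100, 64)       # 32 prompts, 100 tokens each
    acts = layer(many_shot)                    # (32, 100, 64)
    task_vector = acts[:, -1, :].mean(dim=0)   # mean last-token activation

# 2) At inference, inject the compressed "task" into the query's state.
def patched_forward(x: torch.Tensor) -> torch.Tensor:
    with torch.no_grad():
        h = layer(x)
        h[:, -1, :] = h[:, -1, :] + task_vector
    return h

out = patched_forward(torch.randn(1, 10, 64))
print(out.shape)  # torch.Size([1, 10, 64])
```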

Towards Multimodal In-Context Learning for Vision & Language Models

no code implementations • 19 Mar 2024 • Sivan Doveh, Shaked Perek, M. Jehanzeb Mirza, Wei Lin, Amit Alfassy, Assaf Arbelle, Shimon Ullman, Leonid Karlinsky

State-of-the-art Vision-Language Models (VLMs) ground the vision and the language modality primarily via projecting the vision tokens from the encoder to language-like tokens, which are directly fed to the Large Language Model (LLM) decoder.

Image Captioning • In-Context Learning • +2
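
A minimal sketch of this projection step, with illustrative dimensions (a single linear layer here; in practice the projector is often an MLP): vision features are mapped to the LLM's embedding width and concatenated with text embeddings before entering the decoder:

```python
# Minimal sketch of vision-to-language token projection in a VLM.
# Dimensions are illustrative, not any specific model's.
import torch
import torch.nn as nn

vision_dim, llm_dim = 1024, 4096
projector = nn.Linear(vision_dim, llm_dim)        # often an MLP in practice

vision_tokens = torch.randn(1, 256, vision_dim)   # e.g. ViT patch features
text_embeds = torch.randn(1, 32, llm_dim)         # embedded text tokens

language_like = projector(vision_tokens)          # (1, 256, 4096)
llm_input = torch.cat([language_like, text_embeds], dim=1)
print(llm_input.shape)  # torch.Size([1, 288, 4096]) -> fed to the LLM decoder
```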

Incorporating Structured Representations into Pretrained Vision & Language Models Using Scene Graphs

no code implementations • 10 May 2023 • Roei Herzig, Alon Mendelson, Leonid Karlinsky, Assaf Arbelle, Rogerio Feris, Trevor Darrell, Amir Globerson

For the visual side, we incorporate a special "SG Component" in the image transformer trained to predict SG information, while for the textual side, we utilize SGs to generate fine-grained captions that highlight different compositional aspects of the scene.

Scene Understanding • Visual Reasoning
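
One hedged way to picture the visual side: an auxiliary head on the image transformer is trained to predict scene-graph information (simplified below to image-level relation labels), and its loss is added to the usual VL objective. Names and shapes are illustrative, not the paper's component:

```python
# Illustrative auxiliary SG-prediction head on visual tokens.
import torch
import torch.nn as nn

class SGHead(nn.Module):
    def __init__(self, dim: int = 768, num_relations: int = 50):
        super().__init__()
        self.classify = nn.Linear(dim, num_relations)

    def forward(self, visual_tokens: torch.Tensor) -> torch.Tensor:
        # Pool the tokens, then predict relation logits for the image.
        return self.classify(visual_tokens.mean(dim=1))

visual_tokens = torch.randn(4, 196, 768)   # batch of ViT patch features
sg_logits = SGHead()(visual_tokens)        # (4, 50)
sg_targets = torch.randint(0, 50, (4,))
aux_loss = nn.functional.cross_entropy(sg_logits, sg_targets)
# total_loss = vl_loss + lambda_sg * aux_loss   (joint training)
print(aux_loss.item())
```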

MAEDAY: MAE for few and zero shot AnomalY-Detection

1 code implementation • 25 Nov 2022 • Eli Schwartz, Assaf Arbelle, Leonid Karlinsky, Sivan Harary, Florian Scheidegger, Sivan Doveh, Raja Giryes

We propose using a Masked Auto-Encoder (MAE), a transformer model trained in a self-supervised manner on image inpainting, for anomaly detection (AD).

Anomaly Detection • Image Inpainting • +4
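
A sketch of the scoring idea under stated assumptions: a pretrained MAE reconstructs masked patches, and regions it fails to reconstruct are flagged as anomalous. `mae_model` below is a stand-in for any masked auto-encoder, not the paper's checkpoint:

```python
# Reconstruction-error anomaly scoring with a masked auto-encoder.
# `mae_model` is assumed to mask and reconstruct internally.
import torch

def anomaly_score(image: torch.Tensor, mae_model, num_trials: int = 4) -> torch.Tensor:
    """Average per-pixel reconstruction error over several random maskings."""
    errors = []
    with torch.no_grad():
        for _ in range(num_trials):
            recon = mae_model(image)          # reconstruct (masked) patches
            errors.append((recon - image).abs())
    return torch.stack(errors).mean(dim=0)    # anomaly map; high = anomalous

# Example with an identity-like stand-in (a real MAE would be loaded instead):
score_map = anomaly_score(torch.rand(1, 3, 224, 224), lambda x: x * 0.9)
print(score_map.mean())
```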

ConStruct-VL: Data-Free Continual Structured VL Concepts Learning

1 code implementation • CVPR 2023 • James Seale Smith, Paola Cascante-Bonilla, Assaf Arbelle, Donghyun Kim, Rameswar Panda, David Cox, Diyi Yang, Zsolt Kira, Rogerio Feris, Leonid Karlinsky

This leads to reasoning mistakes, which need to be corrected as they occur by teaching VL models the missing Structured VL Concepts (SVLC) skills; often this must be done using the private data where the issue was found, which naturally leads to a data-free continual (no task-id) VL learning setting.

FETA: Towards Specializing Foundation Models for Expert Task Applications

1 code implementation • 8 Sep 2022 • Amit Alfassy, Assaf Arbelle, Oshri Halimi, Sivan Harary, Roei Herzig, Eli Schwartz, Rameswar Panda, Michele Dolfi, Christoph Auer, Kate Saenko, Peter W. J. Staar, Rogerio Feris, Leonid Karlinsky

However, as we show in this paper, FMs still have poor out-of-the-box performance on expert tasks (e.g., retrieving technical illustrations from car manuals using language queries), whose data is either unseen or belongs to a long-tail part of the data distribution of the huge datasets used for FM pre-training.

Domain Generalization • Image Retrieval • +7

Unsupervised Domain Generalization by Learning a Bridge Across Domains

1 code implementation • CVPR 2022 • Sivan Harary, Eli Schwartz, Assaf Arbelle, Peter Staar, Shady Abu-Hussein, Elad Amrani, Roei Herzig, Amit Alfassy, Raja Giryes, Hilde Kuehne, Dina Katabi, Kate Saenko, Rogerio Feris, Leonid Karlinsky

The ability to generalize learned representations across significantly different visual domains, such as between real photos, clipart, paintings, and sketches, is a fundamental capacity of the human visual system.

Domain Generalization • Self-Supervised Learning

DeepHist: Differentiable Joint and Color Histogram Layers for Image-to-Image Translation

1 code implementation • 6 May 2020 • Mor Avi-Aharon, Assaf Arbelle, Tammy Riklin Raviv

Promising results are shown for the tasks of color transfer, image colorization, and edges → photo translation, where the color distribution of the output image is controlled.

Colorization • Image Colorization • +2
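
A standard way to make a histogram differentiable, in the spirit of DeepHist, is soft binning: each pixel contributes to nearby bins through a smooth kernel rather than a hard count. The sketch below is illustrative, not the paper's exact layer:

```python
# Differentiable (soft) histogram via Gaussian kernel binning.
import torch

def soft_histogram(x: torch.Tensor, bins: int = 64, sigma: float = 0.02) -> torch.Tensor:
    """Differentiable histogram of values in [0, 1]."""
    centers = torch.linspace(0.0, 1.0, bins)                 # (bins,)
    diff = x.reshape(-1, 1) - centers.reshape(1, -1)         # (N, bins)
    weights = torch.exp(-0.5 * (diff / sigma) ** 2)          # soft assignment
    hist = weights.sum(dim=0)
    return hist / hist.sum()                                 # normalize to a pdf

img = torch.rand(3, 128, 128, requires_grad=True)
hist = soft_histogram(img)
hist.max().backward()          # gradients flow back to the image,
print(img.grad is not None)    # so the histogram is usable inside a loss
```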

Hue-Net: Intensity-based Image-to-Image Translation with Differentiable Histogram Loss Functions

no code implementations • 12 Dec 2019 • Mor Avi-Aharon, Assaf Arbelle, Tammy Riklin Raviv

To enforce color-free similarity between the source and the output images, we define a semantic-based loss using a differentiable approximation of the mutual information (MI) of these images.

Image-to-Image Translation • Translation
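
A hedged sketch of one such differentiable MI approximation (not the paper's exact formulation): soft-assign the intensities of both images to bins, form a joint distribution from the assignments, and compute MI from the joint and its marginals:

```python
# Differentiable mutual-information approximation from soft joint histograms.
import torch

def soft_assign(x, bins=32, sigma=0.05):
    centers = torch.linspace(0.0, 1.0, bins)
    w = torch.exp(-0.5 * ((x.reshape(-1, 1) - centers) / sigma) ** 2)
    return w / (w.sum(dim=1, keepdim=True) + 1e-8)   # (N, bins), rows sum to 1

def mutual_information(a, b, bins=32):
    pa, pb = soft_assign(a, bins), soft_assign(b, bins)
    joint = pa.t() @ pb / pa.shape[0]                # (bins, bins) joint pdf
    px = joint.sum(dim=1, keepdim=True)              # marginal of a
    py = joint.sum(dim=0, keepdim=True)              # marginal of b
    ratio = joint / (px * py + 1e-8)
    return (joint * torch.log(ratio + 1e-8)).sum()

src = torch.rand(1, 64, 64)
out = torch.rand(1, 64, 64, requires_grad=True)
loss = -mutual_information(src, out)   # maximize MI => minimize its negative
loss.backward()
```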

Microscopy Cell Segmentation via Convolutional LSTM Networks

3 code implementations • 29 May 2018 • Assaf Arbelle, Tammy Riklin Raviv

Live cell microscopy sequences exhibit complex spatial structures and complicated temporal behaviour, making their analysis a challenging task.

Cell Segmentation • Cell Tracking • +1
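
The core building block here is an LSTM whose gates are computed with convolutions, so the recurrent state keeps spatial structure across frames. A minimal cell with illustrative sizes (not the paper's architecture):

```python
# Minimal ConvLSTM cell: LSTM gates computed by a single convolution.
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    def __init__(self, in_ch: int, hid_ch: int, k: int = 3):
        super().__init__()
        # One conv produces all four gates at once.
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        i, f, o, g = i.sigmoid(), f.sigmoid(), o.sigmoid(), g.tanh()
        c = f * c + i * g          # update cell state
        h = o * c.tanh()           # emit spatial hidden state
        return h, c

cell = ConvLSTMCell(1, 16)
h = c = torch.zeros(1, 16, 64, 64)
for frame in torch.rand(5, 1, 1, 64, 64):   # a 5-frame microscopy sequence
    h, c = cell(frame, (h, c))
print(h.shape)  # torch.Size([1, 16, 64, 64])
```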

Microscopy Cell Segmentation via Adversarial Neural Networks

1 code implementation • 18 Sep 2017 • Assaf Arbelle, Tammy Riklin Raviv

We present a novel method for cell segmentation in microscopy images which is inspired by the Generative Adversarial Network (GAN) approach.

Cell Segmentation • Segmentation
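
A toy sketch of the adversarial setup: a discriminator learns to distinguish ground-truth (image, mask) pairs from (image, predicted mask) pairs, and the segmentation network is trained to fool it. The networks below are tiny stand-ins, not the paper's architectures:

```python
# Adversarial training sketch for segmentation with toy networks.
import torch
import torch.nn as nn

segmenter = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                          nn.Conv2d(8, 1, 1), nn.Sigmoid())
discriminator = nn.Sequential(nn.Conv2d(2, 8, 3, padding=1), nn.ReLU(),
                              nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                              nn.Linear(8, 1))
bce = nn.BCEWithLogitsLoss()

image = torch.rand(4, 1, 64, 64)
gt_mask = (torch.rand(4, 1, 64, 64) > 0.5).float()
pred_mask = segmenter(image)

# Discriminator step: real (image, mask) pairs -> 1, predicted pairs -> 0.
d_real = discriminator(torch.cat([image, gt_mask], dim=1))
d_fake = discriminator(torch.cat([image, pred_mask.detach()], dim=1))
d_loss = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))

# Segmenter step: make predicted pairs look real to the discriminator.
g_loss = bce(discriminator(torch.cat([image, pred_mask], dim=1)),
             torch.ones_like(d_real))
print(d_loss.item(), g_loss.item())
```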
