Search Results for author: Yonatan Bitton

Found 17 papers, 11 with code

ParallelPARC: A Scalable Pipeline for Generating Natural-Language Analogies

1 code implementation • 2 Mar 2024 • Oren Sultan, Yonatan Bitton, Ron Yosef, Dafna Shahaf

We demonstrate our pipeline and create ProPara-Logy, a dataset of analogies between scientific processes.

Tasks: Multiple-choice

A Chain-of-Thought Is as Strong as Its Weakest Link: A Benchmark for Verifiers of Reasoning Chains

no code implementations • 1 Feb 2024 • Alon Jacovi, Yonatan Bitton, Bernd Bohnet, Jonathan Herzig, Or Honovich, Michael Tseng, Michael Collins, Roee Aharoni, Mor Geva

REVEAL includes comprehensive labels for the relevance, attribution to evidence passages, and logical correctness of each reasoning step in a language model's answer, across a variety of datasets and state-of-the-art language models.

Tasks: Open-Domain Question Answering

Mismatch Quest: Visual and Textual Feedback for Image-Text Misalignment

no code implementations • 5 Dec 2023 • Brian Gordon, Yonatan Bitton, Yonatan Shafir, Roopal Garg, Xi Chen, Dani Lischinski, Daniel Cohen-Or, Idan Szpektor

While existing image-text alignment models achieve high-quality binary assessments, they fall short of pinpointing the exact source of misalignment.

Tasks: Explanation Generation, Visual Grounding

VideoCon: Robust Video-Language Alignment via Contrast Captions

1 code implementation • 15 Nov 2023 • Hritik Bansal, Yonatan Bitton, Idan Szpektor, Kai-Wei Chang, Aditya Grover

Despite being (pre)trained on massive amounts of data, state-of-the-art video-language alignment models are not robust to semantically plausible contrastive changes in the video captions.

Tasks: Language Modelling, Large Language Model, +5

VisIT-Bench: A Benchmark for Vision-Language Instruction Following Inspired by Real-World Use

1 code implementation • 12 Aug 2023 • Yonatan Bitton, Hritik Bansal, Jack Hessel, Rulin Shao, Wanrong Zhu, Anas Awadalla, Josh Gardner, Rohan Taori, Ludwig Schmidt

These descriptions enable 1) collecting human-verified reference outputs for each instance; and 2) automatically evaluating candidate multimodal generations with a text-only LLM, in alignment with human judgment.

Tasks: Instruction Following
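The VisIT-Bench entry above describes judging multimodal outputs with a text-only LLM by substituting a dense human-written description for the image. A minimal sketch of how such a judge prompt might be assembled — the function name, prompt wording, and yes/no scale are illustrative assumptions, not the benchmark's actual implementation:

```python
def build_judge_prompt(instruction: str, image_caption: str,
                       reference: str, candidate: str) -> str:
    """Pack everything a text-only LLM judge needs into one prompt.

    Because the judge never sees the image, a dense human-written
    caption stands in for the visual content.
    """
    return (
        "You are grading a vision-language model.\n"
        f"Image description: {image_caption}\n"
        f"Instruction: {instruction}\n"
        f"Reference answer: {reference}\n"
        f"Candidate answer: {candidate}\n"
        "Is the candidate as good as the reference? Answer 'yes' or 'no'."
    )

prompt = build_judge_prompt(
    instruction="What breed is the dog?",
    image_caption="A golden retriever lying on a porch.",
    reference="It is a golden retriever.",
    candidate="Looks like a golden retriever.",
)
```

The candidate is then scored by sending this prompt to any capable text-only LLM and parsing the yes/no verdict.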

Read, Look or Listen? What's Needed for Solving a Multimodal Dataset

no code implementations • 6 Jul 2023 • Netta Madvil, Yonatan Bitton, Roy Schwartz

We propose a two-step method to analyze multimodal datasets, which leverages a small seed of human annotation to map each multimodal instance to the modalities required to process it.

Tasks: Question Answering, Speaker Identification, +1

Transferring Visual Attributes from Natural Language to Verified Image Generation

no code implementations • 24 May 2023 • Rodrigo Valerio, Joao Bordalo, Michal Yarom, Yonatan Bitton, Idan Szpektor, Joao Magalhaes

In this paper, we propose to strengthen the consistency property of T2I methods in the presence of complex natural language, which often exceeds the limits of T2I methods by including non-visual information and textual elements that require knowledge for accurate generation.

Tasks: Text-to-Image Generation, Visual Question Answering (VQA)

What You See is What You Read? Improving Text-Image Alignment Evaluation

1 code implementation • NeurIPS 2023 • Michal Yarom, Yonatan Bitton, Soravit Changpinyo, Roee Aharoni, Jonathan Herzig, Oran Lang, Eran Ofek, Idan Szpektor

Automatically determining whether a text and a corresponding image are semantically aligned is a significant challenge for vision-language models, with applications in generative text-to-image and image-to-text tasks.

Tasks: Question Answering, Question Generation, +5

q2d: Turning Questions into Dialogs to Teach Models How to Search

no code implementations • 27 Apr 2023 • Yonatan Bitton, Shlomi Cohen-Ganor, Ido Hakimi, Yoad Lewenberg, Roee Aharoni, Enav Weinreb

One of the exciting capabilities of recent language models for dialog is their ability to independently search for relevant information to ground a given dialog response.

Tasks: Language Modelling, Large Language Model, +1

IRFL: Image Recognition of Figurative Language

1 code implementation • 27 Mar 2023 • Ron Yosef, Yonatan Bitton, Dafna Shahaf

We release our dataset, benchmark, and code, in hopes of driving the development of models that can better understand figurative language.

Tasks: Classification, Visual Reasoning

VASR: Visual Analogies of Situation Recognition

1 code implementation • 8 Dec 2022 • Yonatan Bitton, Ron Yosef, Eli Strugo, Dafna Shahaf, Roy Schwartz, Gabriel Stanovsky

We leverage situation recognition annotations and the CLIP model to generate a large set of 500k candidate analogies.

Tasks: Common Sense Reasoning, Visual Analogies, +1

WinoGAViL: Gamified Association Benchmark to Challenge Vision-and-Language Models

1 code implementation • 25 Jul 2022 • Yonatan Bitton, Nitzan Bitton Guetta, Ron Yosef, Yuval Elovici, Mohit Bansal, Gabriel Stanovsky, Roy Schwartz

While vision-and-language models perform well on tasks such as visual question answering, they struggle when it comes to basic human commonsense reasoning skills.

Tasks: Common Sense Reasoning, General Knowledge, +4

Data Efficient Masked Language Modeling for Vision and Language

1 code implementation • Findings (EMNLP) 2021 • Yonatan Bitton, Gabriel Stanovsky, Michael Elhadad, Roy Schwartz

We investigate a range of alternative masking strategies specific to the cross-modal setting that address these shortcomings, aiming for better fusion of text and image in the learned representation.

Tasks: Language Modelling, Masked Language Modeling, +1
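The entry above investigates masking strategies specific to the cross-modal setting. A purely illustrative sketch of one such idea — preferentially masking tokens grounded in the image so the model must consult the visual modality — where the function name, probabilities, and toy "object vocabulary" are all assumptions, not the paper's method:

```python
import random

random.seed(0)  # deterministic for the example

def mask_grounded_tokens(tokens, object_vocab,
                         p_grounded=0.8, p_other=0.1):
    """Mask image-grounded tokens far more often than other tokens."""
    out = []
    for tok in tokens:
        p = p_grounded if tok in object_vocab else p_other
        out.append("[MASK]" if random.random() < p else tok)
    return out

masked = mask_grounded_tokens(
    ["a", "dog", "chases", "a", "ball"],
    object_vocab={"dog", "ball"},
)
```

Compared with uniform 15% masking, this skews the training signal toward tokens the image can actually help predict.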

Automatic Generation of Contrast Sets from Scene Graphs: Probing the Compositional Consistency of GQA

2 code implementations • NAACL 2021 • Yonatan Bitton, Gabriel Stanovsky, Roy Schwartz, Michael Elhadad

Recent works have shown that supervised models often exploit data artifacts to achieve good test scores while their performance severely degrades on samples outside their training distribution.

Tasks: Question Answering, Relational Reasoning, +1
