no code implementations • 10 May 2016 • Yuval Pinter, Roi Reichart, Idan Szpektor
A description and annotation guidelines for the Yahoo Webscope release of Query Treebank, Version 1.0, May 2016.
2 code implementations • NAACL 2019 • Mor Geva, Eric Malmi, Idan Szpektor, Jonathan Berant
We author a set of rules for identifying a diverse set of discourse phenomena in raw text, and decomposing the text into two independent sentences.
no code implementations • 17 Mar 2019 • Ido Cohn, Itay Laish, Genady Beryozkin, Gang Li, Izhak Shafran, Idan Szpektor, Tzvika Hartman, Avinatan Hassidim, Yossi Matias
To this end, we define the task of audio de-ID, in which audio spans with entity mentions should be detected.
Automatic Speech Recognition (ASR) +5
no code implementations • ACL 2019 • Genady Beryozkin, Yoel Drori, Oren Gilon, Tzvika Hartman, Idan Szpektor
We study a variant of domain adaptation for named-entity recognition where multiple, heterogeneously tagged training sets are available.
no code implementations • NAACL 2019 • Ido Cohn, Itay Laish, Genady Beryozkin, Gang Li, Izhak Shafran, Idan Szpektor, Tzvika Hartman, Avinatan Hassidim, Yossi Matias
To this end, we define the task of audio de-ID, in which audio spans with entity mentions should be detected.
Automatic Speech Recognition (ASR) +5
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Eyal Ben-David, Orgad Keller, Eric Malmi, Idan Szpektor, Roi Reichart
Sentence fusion is the task of joining related sentences into coherent text.
no code implementations • 5 Apr 2021 • Avishai Zagoury, Einat Minkov, Idan Szpektor, William W. Cohen
Here we study using such LMs to fill in entities in human-authored comparative questions, like "Which country is older, India or ______?"
1 code implementation • 16 Apr 2021 • Or Honovich, Leshem Choshen, Roee Aharoni, Ella Neeman, Idan Szpektor, Omri Abend
Neural knowledge-grounded generative models for dialogue often produce content that is factually inconsistent with the knowledge they rely on, making them unreliable and limiting their applicability.
1 code implementation • NAACL 2022 • Or Honovich, Roee Aharoni, Jonathan Herzig, Hagai Taitelbaum, Doron Kukliansky, Vered Cohen, Thomas Scialom, Idan Szpektor, Avinatan Hassidim, Yossi Matias
Grounded text generation systems often generate text that contains factual inconsistencies, hindering their real-world applicability.
2 code implementations • NAACL 2022 • Soravit Changpinyo, Doron Kukliansky, Idan Szpektor, Xi Chen, Nan Ding, Radu Soricut
Visual Question Answering (VQA) has benefited from increasingly sophisticated models, but has not enjoyed the same level of engagement in terms of data creation.
no code implementations • 24 May 2022 • Itay Harel, Hagai Taitelbaum, Idan Szpektor, Oren Kurland
We report the performance of several retrieval baselines, including neural retrieval models, over the dataset.
1 code implementation • 29 Jun 2022 • Zorik Gekhman, Nadav Oved, Orgad Keller, Idan Szpektor, Roi Reichart
We find that high benchmark scores do not necessarily translate to strong robustness, and that various methods can perform extremely differently under different settings.
no code implementations • 25 Jul 2022 • Deborah Cohen, MoonKyung Ryu, Yinlam Chow, Orgad Keller, Ido Greenberg, Avinatan Hassidim, Michael Fink, Yossi Matias, Idan Szpektor, Craig Boutilier, Gal Elidan
Despite recent advances in natural language understanding and generation, and decades of research on the development of conversational bots, building automated agents that can carry on rich open-ended conversations with humans "in the wild" remains a formidable challenge.
1 code implementation • 12 Sep 2022 • Soravit Changpinyo, Linting Xue, Michal Yarom, Ashish V. Thapliyal, Idan Szpektor, Julien Amelot, Xi Chen, Radu Soricut
In this paper, we propose scalable solutions to multilingual visual question answering (mVQA), on both data and modeling fronts.
1 code implementation • 10 Nov 2022 • Ella Neeman, Roee Aharoni, Or Honovich, Leshem Choshen, Idan Szpektor, Omri Abend
Question answering models commonly have access to two sources of "knowledge" during inference time: (1) parametric knowledge - the factual knowledge encoded in the model weights, and (2) contextual knowledge - external knowledge (e.g., a Wikipedia passage) given to the model to generate a grounded answer.
no code implementations • 19 Dec 2022 • Matan Eyal, Hila Noga, Roee Aharoni, Idan Szpektor, Reut Tsarfaty
We demonstrate that by casting tasks in the Hebrew NLP pipeline as text-to-text tasks, we can leverage powerful multilingual, pretrained sequence-to-sequence models such as mT5, eliminating the need for a specialized, morpheme-based, separately fine-tuned decoder.
1 code implementation • NeurIPS 2023 • Michal Yarom, Yonatan Bitton, Soravit Changpinyo, Roee Aharoni, Jonathan Herzig, Oran Lang, Eran Ofek, Idan Szpektor
Automatically determining whether a text and a corresponding image are semantically aligned is a significant challenge for vision-language models, with applications in generative text-to-image and image-to-text tasks.
Ranked #11 on Visual Reasoning on Winoground
1 code implementation • 18 May 2023 • Zorik Gekhman, Jonathan Herzig, Roee Aharoni, Chen Elkind, Idan Szpektor
Factual consistency evaluation is often conducted using Natural Language Inference (NLI) models, yet these models exhibit limited success in evaluating summaries.
no code implementations • 24 May 2023 • Rodrigo Valerio, Joao Bordalo, Michal Yarom, Yonatan Bitton, Idan Szpektor, Joao Magalhaes
In this paper, we propose to strengthen the consistency property of T2I methods in the presence of natural complex language, which often breaks the limits of T2I methods by including non-visual information, and textual elements that require knowledge for accurate generation.
no code implementations • 31 May 2023 • Paul Roit, Johan Ferret, Lior Shani, Roee Aharoni, Geoffrey Cideron, Robert Dadashi, Matthieu Geist, Sertan Girgin, Léonard Hussenot, Orgad Keller, Nikola Momchev, Sabela Ramos, Piotr Stanczyk, Nino Vieillard, Olivier Bachem, Gal Elidan, Avinatan Hassidim, Olivier Pietquin, Idan Szpektor
Despite the seeming success of contemporary grounded text generation systems, they often tend to generate factually inconsistent text with respect to their input.
Abstractive Text Summarization, Natural Language Inference +2
1 code implementation • 15 Nov 2023 • Hritik Bansal, Yonatan Bitton, Idan Szpektor, Kai-Wei Chang, Aditya Grover
Despite being (pre)trained on a massive amount of data, state-of-the-art video-language alignment models are not robust to semantically-plausible contrastive changes in the video captions.
no code implementations • 5 Dec 2023 • Brian Gordon, Yonatan Bitton, Yonatan Shafir, Roopal Garg, Xi Chen, Dani Lischinski, Daniel Cohen-Or, Idan Szpektor
While existing image-text alignment models reach high-quality binary assessments, they fall short of pinpointing the exact source of misalignment.
no code implementations • 3 Jan 2024 • Uri Shaham, Jonathan Herzig, Roee Aharoni, Idan Szpektor, Reut Tsarfaty, Matan Eyal
As instruction-tuned large language models (LLMs) gain global adoption, their ability to follow instructions in multiple languages becomes increasingly crucial.
no code implementations • 10 Mar 2024 • Omer Goldman, Avi Caciularu, Matan Eyal, Kris Cao, Idan Szpektor, Reut Tsarfaty
Despite it being the cornerstone of BPE, the most common tokenization algorithm, the importance of compression in the tokenization process is still unclear.
1 code implementation • 15 Apr 2024 • Adi Simhi, Jonathan Herzig, Idan Szpektor, Yonatan Belinkov
In this work, we first introduce an approach for constructing datasets based on the model knowledge for detection and intervention methods in closed-book and open-book question-answering settings.
no code implementations • EMNLP 2021 • Or Honovich, Leshem Choshen, Roee Aharoni, Ella Neeman, Idan Szpektor, Omri Abend
Neural knowledge-grounded generative models for dialogue often produce content that is factually inconsistent with the knowledge they rely on, making them unreliable and limiting their applicability.
Abstractive Text Summarization, Natural Language Inference +3