Search Results for author: Ido Dagan

Found 121 papers, 53 papers with code

Step-by-Step: Separating Planning from Realization in Neural Data-to-Text Generation

1 code implementation NAACL 2019 Amit Moryossef, Yoav Goldberg, Ido Dagan

We propose to split the generation process into a symbolic text-planning stage that is faithful to the input, followed by a neural generation stage that focuses only on realization.

Data-to-Text Generation Graph-to-Sequence

Improving Quality and Efficiency in Plan-based Neural Data-to-Text Generation

1 code implementation WS 2019 Amit Moryossef, Ido Dagan, Yoav Goldberg

We follow the step-by-step approach to neural data-to-text generation we proposed in Moryossef et al (2019), in which the generation process is divided into a text-planning stage followed by a plan-realization stage.

Data-to-Text Generation Referring Expression +1

Improving Hypernymy Detection with an Integrated Path-based and Distributional Method

1 code implementation ACL 2016 Vered Shwartz, Yoav Goldberg, Ido Dagan

Detecting hypernymy relations is a key task in NLP, which is addressed in the literature using two complementary approaches.

CogALex-V Shared Task: LexNET - Integrated Path-based and Distributional Method for the Identification of Semantic Relations

1 code implementation WS 2016 Vered Shwartz, Ido Dagan

The reported results in the shared task bring this submission to the third place on subtask 1 (word relatedness), and the first place on subtask 2 (semantic relation classification), demonstrating the utility of integrating the complementary path-based and distributional information sources in recognizing concrete semantic relations.

Classification General Classification +2

CDLM: Cross-Document Language Modeling

2 code implementations Findings (EMNLP) 2021 Avi Caciularu, Arman Cohan, Iz Beltagy, Matthew E. Peters, Arie Cattan, Ido Dagan

We introduce a new pretraining approach geared for multi-document language modeling, incorporating two key ideas into the masked language modeling self-supervised objective.

Citation Recommendation Coreference Resolution +6

Crowdsourcing Question-Answer Meaning Representations

1 code implementation NAACL 2018 Julian Michael, Gabriel Stanovsky, Luheng He, Ido Dagan, Luke Zettlemoyer

We introduce Question-Answer Meaning Representations (QAMRs), which represent the predicate-argument structure of a sentence as a set of question-answer pairs.

Sentence

Better Rewards Yield Better Summaries: Learning to Summarise Without References

2 code implementations IJCNLP 2019 Florian Böhm, Yang Gao, Christian M. Meyer, Ori Shapira, Ido Dagan, Iryna Gurevych

Human evaluation experiments show that, compared to the state-of-the-art supervised-learning systems and ROUGE-as-rewards RL summarisation systems, the RL systems using our learned rewards during training generate summaries with higher human ratings.

Reinforcement Learning (RL)

CoRefi: A Crowd Sourcing Suite for Coreference Annotation

2 code implementations EMNLP 2020 Aaron Bornstein, Arie Cattan, Ido Dagan

Coreference annotation is an important, yet expensive and time-consuming, task, which often involves expert annotators trained on complex decision guidelines.

Cross Document Coreference Resolution

A Consolidated Open Knowledge Representation for Multiple Texts

1 code implementation WS 2017 Rachel Wities, Vered Shwartz, Gabriel Stanovsky, Meni Adler, Ori Shapira, Shyam Upadhyay, Dan Roth, Eugenio Martinez Camara, Iryna Gurevych, Ido Dagan

We propose to move from Open Information Extraction (OIE) ahead to Open Knowledge Representation (OKR), aiming to represent information conveyed jointly in a set of texts in an open text-based manner.

Lexical Entailment Open Information Extraction

Streamlining Cross-Document Coreference Resolution: Evaluation and Modeling

2 code implementations 23 Sep 2020 Arie Cattan, Alon Eirew, Gabriel Stanovsky, Mandar Joshi, Ido Dagan

Recent evaluation protocols for Cross-document (CD) coreference resolution have often been inconsistent or lenient, leading to incomparable results across works and overestimation of performance.

coreference-resolution Cross Document Coreference Resolution +2

Cross-document Coreference Resolution over Predicted Mentions

1 code implementation Findings (ACL) 2021 Arie Cattan, Alon Eirew, Gabriel Stanovsky, Mandar Joshi, Ido Dagan

Here, we introduce the first end-to-end model for CD coreference resolution from raw text, which extends the prominent model for within-document coreference to the CD setting.

coreference-resolution Cross Document Coreference Resolution

Realistic Evaluation Principles for Cross-document Coreference Resolution

1 code implementation Joint Conference on Lexical and Computational Semantics 2021 Arie Cattan, Alon Eirew, Gabriel Stanovsky, Mandar Joshi, Ido Dagan

We point out that common evaluation practices for cross-document coreference resolution have been unrealistically permissive in their assumed settings, yielding inflated results.

coreference-resolution Cross Document Coreference Resolution

Still a Pain in the Neck: Evaluating Text Representations on Lexical Composition

1 code implementation TACL 2019 Vered Shwartz, Ido Dagan

Building meaningful phrase representations is challenging because phrase meanings are not simply the sum of their constituent meanings.

Word Embeddings

Summary-Source Proposition-level Alignment: Task, Datasets and Supervised Baseline

1 code implementation CoNLL (EMNLP) 2021 Ori Ernst, Ori Shapira, Ramakanth Pasunuru, Michael Lepioshkin, Jacob Goldberger, Mohit Bansal, Ido Dagan

Aligning sentences in a reference summary with their counterparts in source documents was shown as a useful auxiliary summarization task, notably for generating training data for salience detection.

Clustering Document Summarization +1

QANom: Question-Answer driven SRL for Nominalizations

1 code implementation COLING 2020 Ayal Klein, Jonathan Mamou, Valentina Pyatkin, Daniela Stepanov, Hangfeng He, Dan Roth, Luke Zettlemoyer, Ido Dagan

We propose a new semantic scheme for capturing predicate-argument relations for nominalizations, termed QANom.

Paraphrase to Explicate: Revealing Implicit Noun-Compound Relations

1 code implementation ACL 2018 Vered Shwartz, Ido Dagan

Revealing the implicit semantic relation between the constituents of a noun-compound is important for many NLP applications.

General Classification

QASem Parsing: Text-to-text Modeling of QA-based Semantics

1 code implementation 23 May 2022 Ayal Klein, Eran Hirsch, Ron Eliav, Valentina Pyatkin, Avi Caciularu, Ido Dagan

Several recent works have suggested to represent semantic relations with questions and answers, decomposing textual information into separate interrogative natural language statements.

Data Augmentation

WEC: Deriving a Large-scale Cross-document Event Coreference dataset from Wikipedia

2 code implementations NAACL 2021 Alon Eirew, Arie Cattan, Ido Dagan

To complement these resources and enhance future research, we present Wikipedia Event Coreference (WEC), an efficient methodology for gathering a large-scale dataset for cross-document event coreference from Wikipedia, where coreference links are not restricted within predefined topics.

coreference-resolution Event Coreference Resolution

Proposition-Level Clustering for Multi-Document Summarization

2 code implementations NAACL 2022 Ori Ernst, Avi Caciularu, Ori Shapira, Ramakanth Pasunuru, Mohit Bansal, Jacob Goldberger, Ido Dagan

Text clustering methods were traditionally incorporated into multi-document summarization (MDS) as a means for coping with considerable information repetition.

Clustering Document Summarization +3

Cross-document Event Coreference Search: Task, Dataset and Modeling

1 code implementation 23 Oct 2022 Alon Eirew, Avi Caciularu, Ido Dagan

The task of Cross-document Coreference Resolution has been traditionally formulated as requiring to identify all coreference links across a given set of documents.

Cross Document Coreference Resolution Open-Domain Question Answering +2

Asking It All: Generating Contextualized Questions for any Semantic Role

1 code implementation EMNLP 2021 Valentina Pyatkin, Paul Roit, Julian Michael, Reut Tsarfaty, Yoav Goldberg, Ido Dagan

We develop a two-stage model for this task, which first produces a context-independent question prototype for each role and then revises it to be contextually appropriate for the passage.

Question Generation Question-Generation

Multi-Document Keyphrase Extraction: Dataset, Baselines and Review

1 code implementation 3 Oct 2021 Ori Shapira, Ramakanth Pasunuru, Ido Dagan, Yael Amsterdamer

Keyphrase extraction has been extensively researched within the single-document setting, with an abundance of methods, datasets and applications.

Keyphrase Extraction

How "Multi" is Multi-Document Summarization?

1 code implementation 23 Oct 2022 Ruben Wolhandler, Arie Cattan, Ori Ernst, Ido Dagan

To that end, we propose an automated measure for evaluating the degree to which a summary is "disperse", in the sense of the number of source documents needed to cover its content.

Document Summarization Multi-Document Summarization

QADiscourse -- Discourse Relations as QA Pairs: Representation, Crowdsourcing and Baselines

1 code implementation 6 Oct 2020 Valentina Pyatkin, Ayal Klein, Reut Tsarfaty, Ido Dagan

Discourse relations describe how two propositions relate to one another, and identifying them automatically is an integral part of natural language understanding.

Natural Language Understanding Sentence

Extending Multi-Document Summarization Evaluation to the Interactive Setting

1 code implementation NAACL 2021 Ori Shapira, Ramakanth Pasunuru, Hadar Ronen, Mohit Bansal, Yael Amsterdamer, Ido Dagan

In this paper, we develop an end-to-end evaluation framework for interactive summarization, focusing on expansion-based interaction, which considers the accumulating information along a user session.

Document Summarization Multi-Document Summarization

Controlled Text Reduction

2 code implementations 24 Oct 2022 Aviv Slobodkin, Paul Roit, Eran Hirsch, Ori Ernst, Ido Dagan

Producing a reduced version of a source text, as in generic or focused summarization, inherently involves two distinct subtasks: deciding on targeted content and generating a coherent text conveying it.

Peek Across: Improving Multi-Document Modeling via Cross-Document Question-Answering

1 code implementation 24 May 2023 Avi Caciularu, Matthew E. Peters, Jacob Goldberger, Ido Dagan, Arman Cohan

The integration of multi-document pre-training objectives into language models has resulted in remarkable improvements in multi-document downstream tasks.

Query-focused Summarization Question Answering +2

CD2CR: Co-reference Resolution Across Documents and Domains

1 code implementation 29 Jan 2021 James Ravenscroft, Arie Cattan, Amanda Clare, Ido Dagan, Maria Liakata

Cross-document co-reference resolution (CDCR) is the task of identifying and linking mentions to entities and concepts across many text documents.

Design Choices for Crowdsourcing Implicit Discourse Relations: Revealing the Biases Introduced by Task Design

1 code implementation 3 Apr 2023 Valentina Pyatkin, Frances Yung, Merel C. J. Scholman, Reut Tsarfaty, Ido Dagan, Vera Demberg

Disagreement in natural language annotation has mostly been studied from a perspective of biases introduced by the annotators and the annotation frameworks.

Optimizing Retrieval-augmented Reader Models via Token Elimination

1 code implementation 20 Oct 2023 Moshe Berchansky, Peter Izsak, Avi Caciularu, Ido Dagan, Moshe Wasserblat

Fusion-in-Decoder (FiD) is an effective retrieval-augmented language model applied across a variety of open-domain tasks, such as question answering, fact checking, etc.

Answer Generation Fact Checking +3

Diversify Your Datasets: Analyzing Generalization via Controlled Variance in Adversarial Datasets

1 code implementation CONLL 2019 Ohad Rozen, Vered Shwartz, Roee Aharoni, Ido Dagan

Phenomenon-specific "adversarial" datasets have been recently designed to perform targeted stress-tests for particular inference types.

OpenAsp: A Benchmark for Multi-document Open Aspect-based Summarization

1 code implementation 7 Dec 2023 Shmuel Amar, Liat Schiff, Ori Ernst, Asi Shefer, Ori Shapira, Ido Dagan

To advance research on more realistic scenarios, we introduce OpenAsp, a benchmark for multi-document open aspect-based summarization.

Document Summarization Multi-Document Summarization

Teach the Rules, Provide the Facts: Targeted Relational-knowledge Enhancement for Textual Inference

1 code implementation Joint Conference on Lexical and Computational Semantics 2021 Ohad Rozen, Shmuel Amar, Vered Shwartz, Ido Dagan

Our approach facilitates learning generic inference patterns requiring relational knowledge (e.g. inferences related to hypernymy) during training, while injecting on-demand the relevant relational facts (e.g. pangolin is an animal) at test time.

Extending Multi-Text Sentence Fusion Resources via Pyramid Annotations

1 code implementation NAACL 2022 Daniela Brook Weiss, Paul Roit, Ori Ernst, Ido Dagan

NLP models that compare or consolidate information across multiple documents often struggle when challenged with recognizing substantial information redundancies across the texts.

Document Summarization Multi-Document Summarization +2

A Simple Language Model based on PMI Matrix Approximations

no code implementations EMNLP 2017 Oren Melamud, Ido Dagan, Jacob Goldberger

Specifically, we show that with minor modifications to word2vec's algorithm, we get principled language models that are closely related to the well-established Noise Contrastive Estimation (NCE) based language models.

Language Modelling Word Embeddings

PMI Matrix Approximations with Applications to Neural Language Modeling

no code implementations 5 Sep 2016 Oren Melamud, Ido Dagan, Jacob Goldberger

The obtained language modeling is closely related to NCE language models but is based on a simplified objective function.

Language Modelling

Getting More Out Of Syntax with PropS

no code implementations 4 Mar 2016 Gabriel Stanovsky, Jessica Ficler, Ido Dagan, Yoav Goldberg

Semantic NLP applications often rely on dependency trees to recognize major elements of the proposition structure of sentences.

Open Information Extraction

Term Set Expansion based on Multi-Context Term Embeddings: an End-to-end Workflow

no code implementations 26 Jul 2018 Jonathan Mamou, Oren Pereg, Moshe Wasserblat, Ido Dagan, Yoav Goldberg, Alon Eirew, Yael Green, Shira Guskin, Peter Izsak, Daniel Korat

We present SetExpander, a corpus-based system for expanding a seed set of terms into a more complete set of terms that belong to the same semantic class.

Semantics as a Foreign Language

no code implementations EMNLP 2018 Gabriel Stanovsky, Ido Dagan

We propose a novel approach to semantic dependency parsing (SDP) by casting the task as an instance of multi-lingual machine translation, where each semantic representation is a different foreign dialect.

Dependency Parsing Machine Translation +3

Integrating Deep Linguistic Features in Factuality Prediction over Unified Datasets

no code implementations ACL 2017 Gabriel Stanovsky, Judith Eckle-Kohler, Yevgeniy Puzikov, Ido Dagan, Iryna Gurevych

Previous models for the assessment of commitment towards a predicate in a sentence (also known as factuality prediction) were trained and tested against a specific annotated dataset, subsequently limiting the generality of their results.

Knowledge Base Population Question Answering +1

Interactive Abstractive Summarization for Event News Tweets

no code implementations EMNLP 2017 Ori Shapira, Hadar Ronen, Meni Adler, Yael Amsterdamer, Judit Bar-Ilan, Ido Dagan

We present a novel interactive summarization system that is based on abstractive summarization, derived from a recent consolidated knowledge representation for multiple texts.

Abstractive Text Summarization Document Summarization +1

Modeling Extractive Sentence Intersection via Subtree Entailment

no code implementations COLING 2016 Omer Levy, Ido Dagan, Gabriel Stanovsky, Judith Eckle-Kohler, Iryna Gurevych

Sentence intersection captures the semantic overlap of two texts, generalizing over paradigms such as textual entailment and semantic text similarity.

Abstractive Text Summarization Natural Language Inference +2

Improving Distributional Similarity with Lessons Learned from Word Embeddings

no code implementations TACL 2015 Omer Levy, Yoav Goldberg, Ido Dagan

Recent trends suggest that neural-network-inspired word embedding models outperform traditional count-based distributional models on word similarity and analogy detection tasks.

Word Embeddings Word Similarity

Multi-Context Term Embeddings: the Use Case of Corpus-based Term Set Expansion

no code implementations WS 2019 Jonathan Mamou, Oren Pereg, Moshe Wasserblat, Ido Dagan

In this paper, we present a novel algorithm that combines multi-context term embeddings using a neural classifier and we test this approach on the use case of corpus-based term set expansion.

How to Compare Summarizers without Target Length? Pitfalls, Solutions and Re-Examination of the Neural Summarization Literature

no code implementations WS 2019 Simeng Sun, Ori Shapira, Ido Dagan, Ani Nenkova

We show that plain ROUGE F1 scores are not ideal for comparing current neural systems which on average produce different lengths.

The Negochat Corpus of Human-agent Negotiation Dialogues

no code implementations LREC 2016 Vasily Konovalov, Ron Artstein, Oren Melamud, Ido Dagan

In this work, we introduce an annotated natural language human-agent dialogue corpus in the negotiation domain.

Natural Language Understanding

Revisiting the Binary Linearization Technique for Surface Realization

no code implementations WS 2019 Yevgeniy Puzikov, Claire Gardent, Ido Dagan, Iryna Gurevych

End-to-end neural approaches have achieved state-of-the-art performance in many natural language processing (NLP) tasks.

Decision Making

Evaluating Interactive Summarization: an Expansion-Based Framework

no code implementations 17 Sep 2020 Ori Shapira, Ramakanth Pasunuru, Hadar Ronen, Mohit Bansal, Yael Amsterdamer, Ido Dagan

Allowing users to interact with multi-document summarizers is a promising direction towards improving and customizing summary results.

QADiscourse - Discourse Relations as QA Pairs: Representation, Crowdsourcing and Baselines

no code implementations EMNLP 2020 Valentina Pyatkin, Ayal Klein, Reut Tsarfaty, Ido Dagan

Discourse relations describe how two propositions relate to one another, and identifying them automatically is an integral part of natural language understanding.

Natural Language Understanding Sentence

Within-Between Lexical Relation Classification

no code implementations EMNLP 2020 Oren Barkan, Avi Caciularu, Ido Dagan

We propose the novel Within-Between Relation model for recognizing lexical-semantic relations between words.

Classification General Classification +2

CD^2CR: Co-reference resolution across documents and domains

no code implementations EACL 2021 James Ravenscroft, Amanda Clare, Arie Cattan, Ido Dagan, Maria Liakata

Cross-document co-reference resolution (CDCR) is the task of identifying and linking mentions to entities and concepts across many text documents.

Opinion-based Relational Pivoting for Cross-domain Aspect Term Extraction

no code implementations WASSA (ACL) 2022 Ayal Klein, Oren Pereg, Daniel Korat, Vasudev Lal, Moshe Wasserblat, Ido Dagan

In this paper, we investigate and establish empirically a prior conjecture, which suggests that the linguistic relations connecting opinion terms to their aspects transfer well across domains and therefore can be leveraged for cross-domain aspect term extraction.

Domain Adaptation Term Extraction

Design Choices in Crowdsourcing Discourse Relation Annotations: The Effect of Worker Selection and Training

no code implementations LREC 2022 Merel Scholman, Valentina Pyatkin, Frances Yung, Ido Dagan, Reut Tsarfaty, Vera Demberg

The current contribution studies the effect of worker selection and training on the agreement on implicit relation labels between workers and gold labels, for both the DC and the QA method.

Relation

Revisiting Sentence Union Generation as a Testbed for Text Consolidation

1 code implementation 24 May 2023 Eran Hirsch, Valentina Pyatkin, Ruben Wolhandler, Avi Caciularu, Asi Shefer, Ido Dagan

In this paper, we suggest revisiting the sentence union generation task as an effective well-defined testbed for assessing text consolidation capabilities, decoupling the consolidation challenge from subjective content selection.

Document Summarization Long Form Question Answering +2

SummHelper: Collaborative Human-Computer Summarization

no code implementations 16 Aug 2023 Aviv Slobodkin, Niv Nachum, Shmuel Amar, Ori Shapira, Ido Dagan

Current approaches for text summarization are predominantly automatic, with rather limited space for human intervention and control over the process.

Text Summarization

Multi-Review Fusion-in-Context

no code implementations 22 Mar 2024 Aviv Slobodkin, Ori Shapira, Ran Levy, Ido Dagan

This study lays the groundwork for further exploration of modular text generation in the multi-document setting, offering potential improvements in the quality and reliability of generated content.

Long Form Question Answering Text Generation

Attribute First, then Generate: Locally-attributable Grounded Text Generation

no code implementations 25 Mar 2024 Aviv Slobodkin, Eran Hirsch, Arie Cattan, Tal Schuster, Ido Dagan

Recent efforts to address hallucinations in Large Language Models (LLMs) have focused on attributed text generation, which supplements generated texts with citations of supporting sources for post-generation fact-checking and corrections.

Attribute Document Summarization +5
