Search Results for author: Iryna Gurevych

Found 294 papers, 147 papers with code

IMPLI: Investigating NLI Models’ Performance on Figurative Language

1 code implementation ACL 2022 Kevin Stowe, Prasetya Utama, Iryna Gurevych

Natural language inference (NLI) has been widely used as a task to train and evaluate models for language understanding.

Natural Language Inference

Exploring Metaphoric Paraphrase Generation

1 code implementation CoNLL (EMNLP) 2021 Kevin Stowe, Nils Beck, Iryna Gurevych

Metaphor generation is a difficult task, and has seen tremendous improvement with the advent of deep pretrained models.

Paraphrase Generation

Evaluating Coreference Resolvers on Community-based Question Answering: From Rule-based to State of the Art

1 code implementation COLING (CRAC) 2022 Haixia Chai, Nafise Sadat Moosavi, Iryna Gurevych, Michael Strube

The results of our extrinsic evaluation show that while there is a significant difference between the performance of the rule-based system vs. state-of-the-art neural model on coreference resolution datasets, we do not observe a considerable difference on their impact on downstream models.

Answer Selection coreference-resolution +2

Event Coreference Data (Almost) for Free: Mining Hyperlinks from Online News

1 code implementation AKBC 2021 Michael Bugert, Iryna Gurevych

Cross-document event coreference resolution (CDCR) is the task of identifying which event mentions refer to the same events throughout a collection of documents.

coreference-resolution Coreference Resolution +1

UKP-SQuARE: An Interactive Tool for Teaching Question Answering

no code implementations 31 May 2023 Haishuo Fang, Haritz Puerto, Iryna Gurevych

To evaluate the effectiveness of UKP-SQuARE in teaching scenarios, we adopted it in a postgraduate NLP course and surveyed the students after the course.

Information Retrieval Question Answering +1

Dior-CVAE: Diffusion Priors in Variational Dialog Generation

no code implementations 24 May 2023 Tianyu Yang, Thy Thy Tran, Iryna Gurevych

Conditional variational autoencoders (CVAEs) have been used recently for diverse response generation, by introducing latent variables to represent the relationship between a dialog context and its potential responses.

Open-Domain Dialog Response Generation

DAPR: A Benchmark on Document-Aware Passage Retrieval

1 code implementation 23 May 2023 Kexin Wang, Nils Reimers, Iryna Gurevych

To fill this gap, we propose and name this task Document-Aware Passage Retrieval (DAPR) and build a benchmark including multiple datasets from various domains, covering both DAPR and whole-document retrieval.

Passage Retrieval Retrieval

A Diachronic Analysis of the NLP Research Paradigm Shift: When, How, and Why?

no code implementations 22 May 2023 Aniket Pramanick, Yufang Hou, Iryna Gurevych

Understanding the fundamental concepts and trends in a scientific field is crucial for keeping abreast of its ongoing development.

Causal Discovery

Romanization-based Large-scale Adaptation of Multilingual Language Models

no code implementations 18 Apr 2023 Sukannya Purkayastha, Sebastian Ruder, Jonas Pfeiffer, Iryna Gurevych, Ivan Vulić

In order to boost the capacity of mPLMs to deal with low-resource and unseen languages, we explore the potential of leveraging transliteration on a massive scale.

Cross-Lingual Transfer Transliteration

UKP-SQuARE v3: A Platform for Multi-Agent QA Research

no code implementations 31 Mar 2023 Haritz Puerto, Tim Baumgärtner, Rachneet Sachdeva, Haishuo Fang, Hao Zhang, Sewin Tariverdian, Kexin Wang, Iryna Gurevych

To ease research in multi-agent models, we extend UKP-SQuARE, an online platform for QA research, to support three families of multi-agent systems: i) agent selection, ii) early-fusion of agents, and iii) late-fusion of agents.

Question Answering

Elastic Weight Removal for Faithful and Abstractive Dialogue Generation

1 code implementation 30 Mar 2023 Nico Daheim, Nouha Dziri, Mrinmaya Sachan, Iryna Gurevych, Edoardo M. Ponti

We evaluate our method -- using different variants of Flan-T5 as a backbone language model -- on multiple datasets for information-seeking dialogue generation and compare our method with state-of-the-art techniques for faithfulness, such as CTRL, Quark, DExperts, and Noisy Channel reranking.

Dialogue Generation Language Modelling

CARE: Collaborative AI-Assisted Reading Environment

1 code implementation 24 Feb 2023 Dennis Zyska, Nils Dycke, Jan Buchmann, Ilia Kuznetsov, Iryna Gurevych

Recent years have seen impressive progress in AI-assisted writing, yet the developments in AI-assisted reading are lacking.

Question Answering text-classification +1

Like a Good Nearest Neighbor: Practical Content Moderation with Sentence Transformers

1 code implementation 17 Feb 2023 Luke Bates, Iryna Gurevych

Modern text classification systems have impressive capabilities but are infeasible to deploy and use reliably due to their dependence on prompting and billion-parameter language models.

Contrastive Learning text-classification +1

Opportunities and Challenges in Neural Dialog Tutoring

1 code implementation 24 Jan 2023 Jakub Macina, Nico Daheim, Lingzhi Wang, Tanmay Sinha, Manu Kapur, Iryna Gurevych, Mrinmaya Sachan

Designing dialog tutors has been challenging as it involves modeling the diverse and complex pedagogical strategies employed by human tutors.

Improving Generalization of Adapter-Based Cross-lingual Transfer with Scheduled Unfreezing

no code implementations 13 Jan 2023 Chen Cecilia Liu, Jonas Pfeiffer, Ivan Vulić, Iryna Gurevych

Our in-depth experiments reveal that scheduled unfreezing induces different learning dynamics compared to standard fine-tuning, and provide evidence that the dynamics of Fisher Information during training correlate with cross-lingual generalization performance.

Cross-Lingual Transfer Transfer Learning

Python Code Generation by Asking Clarification Questions

1 code implementation 19 Dec 2022 Haau-Sing Li, Mohsen Mesgar, André F. T. Martins, Iryna Gurevych

We hypothesize that the under-specification of a natural language description can be resolved by asking clarification questions.

Code Generation Language Modelling

CiteBench: A benchmark for Scientific Citation Text Generation

1 code implementation 19 Dec 2022 Martin Funkquist, Ilia Kuznetsov, Yufang Hou, Iryna Gurevych

To address this challenge, we propose CiteBench: a benchmark for citation text generation that unifies multiple diverse datasets and enables standardized evaluation of citation text generation models across task designs and domains.

Text Generation

NLP meets psychotherapy: Using predicted client emotions and self-reported client emotions to measure emotional coherence

no code implementations 22 Nov 2022 Neha Warikoo, Tobias Mayer, Dana Atzil-Slonim, Amir Eliassaf, Shira Haimovitz, Iryna Gurevych

No study has examined EC between the subjective experience of emotions and emotion expression in therapy or whether this coherence is associated with clients' well being.

Emotion Recognition

GDPR Compliant Collection of Therapist-Patient-Dialogues

no code implementations 22 Nov 2022 Tobias Mayer, Neha Warikoo, Oliver Grimm, Andreas Reif, Iryna Gurevych

While these conversations are part of the daily routine of clinicians, gathering them is usually hindered by various ethical (purpose of data usage), legal (data privacy) and technical (data formatting) limitations.

NLPeer: A Unified Resource for the Computational Study of Peer Review

1 code implementation 12 Nov 2022 Nils Dycke, Ilia Kuznetsov, Iryna Gurevych

Peer review constitutes a core component of scholarly publishing; yet it demands substantial expertise and training, and is susceptible to errors and biases.

An Inclusive Notion of Text

no code implementations 10 Nov 2022 Ilia Kuznetsov, Iryna Gurevych

Natural language processing (NLP) researchers develop models of grammar, meaning and communication based on written text.

Effective Cross-Task Transfer Learning for Explainable Natural Language Inference with T5

1 code implementation 31 Oct 2022 Irina Bigoulaeva, Rachneet Sachdeva, Harish Tayyar Madabushi, Aline Villavicencio, Iryna Gurevych

We compare sequential fine-tuning with a model for multi-task learning in the context where we are interested in boosting performance on two tasks, one of which depends on the other.

Multi-Task Learning Natural Language Inference

Missing Counter-Evidence Renders NLP Fact-Checking Unrealistic for Misinformation

1 code implementation 25 Oct 2022 Max Glockner, Yufang Hou, Iryna Gurevych

In our analysis, we show that, by design, existing NLP task definitions for fact-checking cannot refute misinformation as professional fact-checkers do for the majority of claims.

Fact Checking Misinformation

Incorporating Relevance Feedback for Information-Seeking Retrieval using Few-Shot Document Re-Ranking

1 code implementation 19 Oct 2022 Tim Baumgärtner, Leonardo F. R. Ribeiro, Nils Reimers, Iryna Gurevych

Pairing a lexical retriever with a neural re-ranking model has set state-of-the-art performance on large-scale information retrieval datasets.

Argument Retrieval Information Retrieval +4

One does not fit all! On the Complementarity of Vision Encoders for Vision and Language Tasks

no code implementations 12 Oct 2022 Gregor Geigle, Chen Liu, Jonas Pfeiffer, Iryna Gurevych

Nonetheless, most current work assumes that a single pre-trained VE can serve as a general-purpose encoder.

The Devil is in the Details: On Models and Training Regimes for Few-Shot Intent Classification

no code implementations 12 Oct 2022 Mohsen Mesgar, Thy Thy Tran, Goran Glavas, Iryna Gurevych

First, the unexplored combination of the cross-encoder architecture (with parameterized similarity scoring function) and episodic meta-learning consistently yields the best FSIC performance.

intent-classification Intent Classification +1

Transformers with Learnable Activation Functions

2 code implementations 30 Aug 2022 Haishuo Fang, Ji-Ung Lee, Nafise Sadat Moosavi, Iryna Gurevych

In contrast to conventional, predefined activation functions, RAFs can adaptively learn optimal activation functions during training according to input data.

UKP-SQuARE v2: Explainability and Adversarial Attacks for Trustworthy QA

1 code implementation 19 Aug 2022 Rachneet Sachdeva, Haritz Puerto, Tim Baumgärtner, Sewin Tariverdian, Hao Zhang, Kexin Wang, Hossain Shaikh Saadi, Leonardo F. R. Ribeiro, Iryna Gurevych

In this paper, we introduce SQuARE v2, the new version of SQuARE, to provide an explainability infrastructure for comparing models based on methods such as saliency maps and graph-based explanations.

Adversarial Attack Explainable Models +2

TexPrax: A Messaging Application for Ethical, Real-time Data Collection and Annotation

1 code implementation 16 Aug 2022 Lorenz Stangier, Ji-Ung Lee, Yuxi Wang, Marvin Müller, Nicholas Frick, Joachim Metternich, Iryna Gurevych

We evaluate TexPrax in a user-study with German factory employees who ask their colleagues for solutions on problems that arise during their daily work.


Mining Legal Arguments in Court Decisions

1 code implementation 12 Aug 2022 Ivan Habernal, Daniel Faber, Nicola Recchia, Sebastian Bretthauer, Iryna Gurevych, Indra Spiecker genannt Döhmann, Christoph Burchard

Identifying, classifying, and analyzing arguments in legal discourse has been a prominent area of research since the inception of the argument mining field.

Argument Mining

Annotation Error Detection: Analyzing the Past and Present for a More Coherent Future

1 code implementation 5 Jun 2022 Jan-Christoph Klie, Bonnie Webber, Iryna Gurevych

While researchers show that their approaches work well on their newly introduced datasets, they rarely compare their methods to previous work or on the same datasets.

text-classification Text Classification

On the Effect of Sample and Topic Sizes for Argument Mining Datasets

no code implementations 23 May 2022 Benjamin Schiller, Johannes Daxenberger, Iryna Gurevych

The task of Argument Mining, that is extracting argumentative sentences for a specific topic from large document sources, is an inherently difficult task for machine learning models and humans alike, as large datasets are rare and recognition of argumentative sentences requires expert knowledge.

Argument Mining Benchmarking +1

Improving the Numerical Reasoning Skills of Pretrained Language Models

no code implementations 13 May 2022 Dominic Petrak, Nafise Sadat Moosavi, Iryna Gurevych

We evaluate our approach on three different tasks that require numerical reasoning, including (a) reading comprehension in the DROP dataset, (b) inference-on-tables in the InfoTabs dataset, and (c) table-to-text generation in WikiBio and SciGen datasets.

Contrastive Learning Reading Comprehension +1

Adaptable Adapters

1 code implementation NAACL 2022 Nafise Sadat Moosavi, Quentin Delfosse, Kristian Kersting, Iryna Gurevych

The resulting adapters (a) contain about 50% of the learning parameters of the standard adapter and are therefore more efficient at training and inference, and require less storage space, and (b) achieve considerably higher performances in low-data settings.

Revise and Resubmit: An Intertextual Model of Text-based Collaboration in Peer Review

1 code implementation 22 Apr 2022 Ilia Kuznetsov, Jan Buchmann, Max Eichler, Iryna Gurevych

While existing NLP studies focus on the analysis of individual texts, editorial assistance often requires modeling interactions between pairs of texts -- yet general frameworks and datasets to support this scenario are missing.

FactGraph: Evaluating Factuality in Summarization with Semantic Graph Representations

1 code implementation NAACL 2022 Leonardo F. R. Ribeiro, Mengwen Liu, Iryna Gurevych, Markus Dreyer, Mohit Bansal

Despite recent improvements in abstractive summarization, most current approaches generate summaries that are not factually consistent with the source document, severely restricting their trust and usage in real-world applications.

Abstractive Text Summarization

UKP-SQUARE: An Online Platform for Question Answering Research

1 code implementation ACL 2022 Tim Baumgärtner, Kexin Wang, Rachneet Sachdeva, Max Eichler, Gregor Geigle, Clifton Poth, Hannah Sterz, Haritz Puerto, Leonardo F. R. Ribeiro, Jonas Pfeiffer, Nils Reimers, Gözde Gül Şahin, Iryna Gurevych

Recent advances in NLP and information retrieval have given rise to a diverse set of question answering tasks that are of different formats (e.g., extractive, abstractive), require different model architectures (e.g., generative, discriminative), and setups (e.g., with or without retrieval).

Explainable Models Information Retrieval +2

Delving Deeper into Cross-lingual Visual Question Answering

no code implementations 15 Feb 2022 Chen Liu, Jonas Pfeiffer, Anna Korhonen, Ivan Vulic, Iryna Gurevych

Previous work on cross-lingual VQA has reported poor zero-shot transfer performance of current multilingual multimodal Transformers and large gaps to monolingual performance, attributed mostly to misalignment of text embeddings between the source and target languages, without providing any additional deeper analyses.

Question Answering Visual Question Answering

ArgSciChat: A Dataset for Argumentative Dialogues on Scientific Papers

2 code implementations 14 Feb 2022 Federico Ruggeri, Mohsen Mesgar, Iryna Gurevych

The applications of conversational agents for scientific disciplines (as expert domains) are understudied due to the lack of dialogue data to train such agents.

Fact Selection Response Generation

Yes-Yes-Yes: Proactive Data Collection for ACL Rolling Review and Beyond

1 code implementation 27 Jan 2022 Nils Dycke, Ilia Kuznetsov, Iryna Gurevych

The shift towards publicly available text sources has enabled language processing at unprecedented scale, yet leaves under-serviced the domains where public and openly licensed data is scarce.

MetaQA: Combining Expert Agents for Multi-Skill Question Answering

1 code implementation 3 Dec 2021 Haritz Puerto, Gözde Gül Şahin, Iryna Gurevych

The recent explosion of question answering (QA) datasets and models has increased the interest in the generalization of models across multiple domains and formats by either training on multiple datasets or by combining multiple models.

Question Answering

Avoiding Inference Heuristics in Few-shot Prompt-based Finetuning

1 code implementation EMNLP 2021 Prasetya Ajie Utama, Nafise Sadat Moosavi, Victor Sanh, Iryna Gurevych

Recent prompt-based approaches allow pretrained language models to achieve strong performances on few-shot finetuning by reformulating downstream tasks as a language modeling problem.

Language Modelling

TxT: Crossmodal End-to-End Learning with Transformers

no code implementations 9 Sep 2021 Jan-Martin O. Steitz, Jonas Pfeiffer, Iryna Gurevych, Stefan Roth

Reasoning over multiple modalities, e.g. in Visual Question Answering (VQA), requires an alignment of semantic concepts across domains.

Question Answering Visual Question Answering

Assisting Decision Making in Scholarly Peer Review: A Preference Learning Perspective

no code implementations 2 Sep 2021 Nils Dycke, Edwin Simpson, Ilia Kuznetsov, Iryna Gurevych

Peer review is the primary means of quality control in academia; as an outcome of a peer review process, program and area chairs make acceptance decisions for each paper based on the review reports and scores they received.

Decision Making Fairness

AdapterHub Playground: Simple and Flexible Few-Shot Learning with Adapters

1 code implementation ACL 2022 Tilman Beck, Bela Bohlender, Christina Viehmann, Vincent Hane, Yanik Adamson, Jaber Khuri, Jonas Brossmann, Jonas Pfeiffer, Iryna Gurevych

The open-access dissemination of pretrained language models through online repositories has led to a democratization of state-of-the-art natural language processing (NLP) research.

Few-Shot Learning Transfer Learning

Scientia Potentia Est -- On the Role of Knowledge in Computational Argumentation

no code implementations 1 Jul 2021 Anne Lauscher, Henning Wachsmuth, Iryna Gurevych, Goran Glavaš

Despite extensive research efforts in recent years, computational argumentation (CA) remains one of the most challenging areas of natural language processing.

Common Sense Reasoning Natural Language Understanding

Annotation Curricula to Implicitly Train Non-Expert Annotators

1 code implementation CL (ACL) 2022 Ji-Ung Lee, Jan-Christoph Klie, Iryna Gurevych

Annotation studies often require annotators to familiarize themselves with the task, its annotation scheme, and the data domain.

Metaphor Generation with Conceptual Mappings

1 code implementation ACL 2021 Kevin Stowe, Tuhin Chakrabarty, Nanyun Peng, Smaranda Muresan, Iryna Gurevych

Guided by conceptual metaphor theory, we propose to control the generation process by encoding conceptual mappings between cognitive domains to generate meaningful metaphoric expressions.

Investigating label suggestions for opinion mining in German Covid-19 social media

1 code implementation ACL 2021 Tilman Beck, Ji-Ung Lee, Christina Viehmann, Marcus Maurer, Oliver Quiring, Iryna Gurevych

This work investigates the use of interactively updated label suggestions to improve upon the efficiency of gathering annotations on the task of opinion mining in German Covid-19 social media data.

Opinion Mining Transfer Learning

BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models

2 code implementations 17 Apr 2021 Nandan Thakur, Nils Reimers, Andreas Rücklé, Abhishek Srivastava, Iryna Gurevych

To address this, and to facilitate researchers to broadly evaluate the effectiveness of their models, we introduce Benchmarking-IR (BEIR), a robust and heterogeneous evaluation benchmark for information retrieval.

Argument Retrieval Benchmarking +12

Learning to Reason for Text Generation from Scientific Tables

1 code implementation 16 Apr 2021 Nafise Sadat Moosavi, Andreas Rücklé, Dan Roth, Iryna Gurevych

In this paper, we introduce SciGen, a new challenge dataset for the task of reasoning-aware data-to-text generation consisting of tables from scientific articles and their corresponding descriptions.

Arithmetic Reasoning Data-to-Text Generation

What to Pre-Train on? Efficient Intermediate Task Selection

1 code implementation EMNLP 2021 Clifton Poth, Jonas Pfeiffer, Andreas Rücklé, Iryna Gurevych

Our best methods achieve an average Regret@3 of less than 1% across all target tasks, demonstrating that we are able to efficiently identify the best datasets for intermediate training.

Multiple-choice Question Answering +1

AmbiFC: Fact-Checking Ambiguous Claims with Evidence

2 code implementations 1 Apr 2021 Max Glockner, Ieva Staliūnaitė, James Thorne, Gisela Vallejo, Andreas Vlachos, Iryna Gurevych

We present AmbiFC, a large-scale fact-checking dataset with realistic claims derived from real-world information needs.

Claim Verification Evidence Selection +3

Retrieve Fast, Rerank Smart: Cooperative and Joint Approaches for Improved Cross-Modal Retrieval

1 code implementation 22 Mar 2021 Gregor Geigle, Jonas Pfeiffer, Nils Reimers, Ivan Vulić, Iryna Gurevych

Current state-of-the-art approaches to cross-modal retrieval process text and visual input jointly, relying on Transformer-based architectures with cross-attention mechanisms that attend over all words and objects in an image.

Cross-Modal Retrieval Retrieval

Structural Adapters in Pretrained Language Models for AMR-to-text Generation

1 code implementation EMNLP 2021 Leonardo F. R. Ribeiro, Yue Zhang, Iryna Gurevych

Pretrained language models (PLM) have recently advanced graph-to-text generation, where the input graph is linearized into a sequence and fed into the PLM to obtain its representation.

AMR-to-Text Generation Data-to-Text Generation

Focusing Knowledge-based Graph Argument Mining via Topic Modeling

no code implementations 3 Feb 2021 Patrick Abels, Zahra Ahmadi, Sophie Burkhardt, Benjamin Schiller, Iryna Gurevych, Stefan Kramer

We use a topic model to extract topic- and sentence-specific evidence from the structured knowledge base Wikidata, building a graph based on the cosine similarity between the entity word vectors of Wikidata and the vector of the given sentence.

Argument Mining Decision Making +1

Empirical Evaluation of Supervision Signals for Style Transfer Models

no code implementations 15 Jan 2021 Yevgeniy Puzikov, Simoes Stanley, Iryna Gurevych, Immanuel Schweizer

In this work we empirically compare the dominant optimization paradigms which provide supervision signals during training: backtranslation, adversarial training and reinforcement learning.

Machine Translation reinforcement-learning +4

Coreference Reasoning in Machine Reading Comprehension

1 code implementation ACL 2021 Mingzhu Wu, Nafise Sadat Moosavi, Dan Roth, Iryna Gurevych

We propose a methodology for creating MRC datasets that better reflect the challenges of coreference reasoning and use it to create a sample evaluation set.

coreference-resolution Coreference Resolution +3

How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models

1 code implementation ACL 2021 Phillip Rust, Jonas Pfeiffer, Ivan Vulić, Sebastian Ruder, Iryna Gurevych

In this work, we provide a systematic and comprehensive empirical comparison of pretrained multilingual language models versus their monolingual counterparts with regard to their monolingual task performance.

Pretrained Multilingual Language Models

UNKs Everywhere: Adapting Multilingual Language Models to New Scripts

2 code implementations EMNLP 2021 Jonas Pfeiffer, Ivan Vulić, Iryna Gurevych, Sebastian Ruder

The ultimate challenge is dealing with under-resourced languages not covered at all by the models and written in scripts unseen during pretraining.

Cross-Lingual Transfer

The Curse of Dense Low-Dimensional Information Retrieval for Large Index Sizes

no code implementations ACL 2021 Nils Reimers, Iryna Gurevych

Information Retrieval using dense low-dimensional representations recently became popular and showed out-performance to traditional sparse-representations like BM25.

Information Retrieval Retrieval

Generalizing Cross-Document Event Coreference Resolution Across Multiple Corpora

1 code implementation CL (ACL) 2021 Michael Bugert, Nils Reimers, Iryna Gurevych

This raises strong concerns on their generalizability -- a must-have for downstream applications where the magnitude of domains or event mentions is likely to exceed those found in a curated corpus.

coreference-resolution Coreference Resolution +1

Ranking Creative Language Characteristics in Small Data Scenarios

no code implementations 23 Oct 2020 Julia Siekiera, Marius Köppel, Edwin Simpson, Kevin Stowe, Iryna Gurevych, Stefan Kramer

We therefore adapt the DirectRanker to provide a new deep model for ranking creative language with small data.

Improving Robustness by Augmenting Training Sentences with Predicate-Argument Structures

no code implementations 23 Oct 2020 Nafise Sadat Moosavi, Marcel de Boer, Prasetya Ajie Utama, Iryna Gurevych

Existing approaches to improve robustness against dataset biases mostly focus on changing the training objective so that models learn less from biased examples.

Data Augmentation

AdapterDrop: On the Efficiency of Adapters in Transformers

1 code implementation EMNLP 2021 Andreas Rücklé, Gregor Geigle, Max Glockner, Tilman Beck, Jonas Pfeiffer, Nils Reimers, Iryna Gurevych

Massively pre-trained transformer models are computationally expensive to fine-tune, slow for inference, and have large storage requirements.

Why do you think that? Exploring Faithful Sentence-Level Rationales Without Supervision

1 code implementation Findings of the Association for Computational Linguistics 2020 Max Glockner, Ivan Habernal, Iryna Gurevych

We propose a differentiable training-framework to create models which output faithful rationales on a sentence level, by solely applying supervision on the target task.

Decision Making

MultiCQA: Zero-Shot Transfer of Self-Supervised Text Matching Models on a Massive Scale

1 code implementation EMNLP 2020 Andreas Rücklé, Jonas Pfeiffer, Iryna Gurevych

We investigate the model performances on nine benchmarks of answer selection and question similarity tasks, and show that all 140 models transfer surprisingly well, where the large majority of models substantially outperforms common IR baselines.

Answer Selection Community Question Answering +3

Towards Debiasing NLU Models from Unknown Biases

1 code implementation EMNLP 2020 Prasetya Ajie Utama, Nafise Sadat Moosavi, Iryna Gurevych

Recently proposed debiasing methods are shown to be effective in mitigating this tendency.

Predicting the Humorousness of Tweets Using Gaussian Process Preference Learning

1 code implementation 3 Aug 2020 Tristan Miller, Erik-Lân Do Dinh, Edwin Simpson, Iryna Gurevych

Most humour processing systems to date make at best discrete, coarse-grained distinctions between the comical and the conventional, yet such notions are better conceptualized as a broad spectrum.

AdapterHub: A Framework for Adapting Transformers

5 code implementations EMNLP 2020 Jonas Pfeiffer, Andreas Rücklé, Clifton Poth, Aishwarya Kamath, Ivan Vulić, Sebastian Ruder, Kyunghyun Cho, Iryna Gurevych

We propose AdapterHub, a framework that allows dynamic "stitching-in" of pre-trained adapters for different tasks and languages.


How to Probe Sentence Embeddings in Low-Resource Languages: On Structural Design Choices for Probing Task Evaluation

1 code implementation CONLL 2020 Steffen Eger, Johannes Daxenberger, Iryna Gurevych

We then probe embeddings in a multilingual setup with design choices that lie in a 'stable region', as we identify for English, and find that results on English do not transfer to other languages.

Sentence Embeddings

Common Sense or World Knowledge? Investigating Adapter-Based Knowledge Injection into Pretrained Transformers

1 code implementation EMNLP (DeeLIO) 2020 Anne Lauscher, Olga Majewska, Leonardo F. R. Ribeiro, Iryna Gurevych, Nikolai Rozanov, Goran Glavaš

Following the major success of neural language models (LMs) such as BERT or GPT-2 on a variety of language understanding tasks, recent work focused on injecting (structured) knowledge from external resources into these models.

Common Sense Reasoning

Empowering Active Learning to Jointly Optimize System and User Demands

1 code implementation ACL 2020 Ji-Ung Lee, Christian M. Meyer, Iryna Gurevych

Existing approaches to active learning maximize the system performance by sampling unlabeled instances for annotation that yield the most efficient training.

Active Learning

AdapterFusion: Non-Destructive Task Composition for Transfer Learning

3 code implementations EACL 2021 Jonas Pfeiffer, Aishwarya Kamath, Andreas Rücklé, Kyunghyun Cho, Iryna Gurevych

We show that by separating the two stages, i.e., knowledge extraction and knowledge composition, the classifier can effectively exploit the representations learned from multiple tasks in a non-destructive manner.

Language Modelling Multi-Task Learning

Low Resource Multi-Task Sequence Tagging -- Revisiting Dynamic Conditional Random Fields

no code implementations 1 May 2020 Jonas Pfeiffer, Edwin Simpson, Iryna Gurevych

We compare different models for low resource multi-task sequence tagging that leverage dependencies between label sequences for different tasks.

Multi-Task Learning

Mind the Trade-off: Debiasing NLU Models without Degrading the In-distribution Performance

1 code implementation ACL 2020 Prasetya Ajie Utama, Nafise Sadat Moosavi, Iryna Gurevych

Models for natural language understanding (NLU) tasks often rely on the idiosyncratic biases of the dataset, which make them brittle against test cases outside the training distribution.

Natural Language Understanding

A Matter of Framing: The Impact of Linguistic Formalism on Probing Results

no code implementations EMNLP 2020 Ilia Kuznetsov, Iryna Gurevych

Deep pre-trained contextualized encoders like BERT (Devlin et al., 2019) demonstrate remarkable performance on a range of downstream tasks.

Improving Factual Consistency Between a Response and Persona Facts

no code implementations EACL 2021 Mohsen Mesgar, Edwin Simpson, Iryna Gurevych

Neural models for response generation produce responses that are semantically plausible but not necessarily factually consistent with facts describing the speaker's persona.

reinforcement-learning Reinforcement Learning (RL) +1

MAD-X: An Adapter-Based Framework for Multi-Task Cross-Lingual Transfer

3 code implementations EMNLP 2020 Jonas Pfeiffer, Ivan Vulić, Iryna Gurevych, Sebastian Ruder

The main goal behind state-of-the-art pre-trained multilingual models such as multilingual BERT and XLM-R is enabling and bootstrapping NLP applications in low-resource languages through zero-shot or few-shot cross-lingual transfer.

Ranked #5 on Cross-Lingual Transfer on XCOPA (using extra training data)

Cross-Lingual Transfer named-entity-recognition +4

Aspect-Controlled Neural Argument Generation

1 code implementation NAACL 2021 Benjamin Schiller, Johannes Daxenberger, Iryna Gurevych

In this work, we train a language model for argument generation that can be controlled on a fine-grained level to generate sentence-level arguments for a given topic, stance, and aspect.

Data Augmentation Language Modelling +1

PuzzLing Machines: A Challenge on Learning From Small Data

no code implementations ACL 2020 Gözde Gül Şahin, Yova Kementchedjhieva, Phillip Rust, Iryna Gurevych

To expose this problem in a new light, we introduce a challenge on learning from small data, PuzzLing Machines, which consists of Rosetta Stone puzzles from Linguistic Olympiads for high school students.

Small Data Image Classification

Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation

10 code implementations EMNLP 2020 Nils Reimers, Iryna Gurevych

The training is based on the idea that a translated sentence should be mapped to the same location in the vector space as the original sentence.

Knowledge Distillation Sentence Embedding +1

Metaphoric Paraphrase Generation

no code implementations 28 Feb 2020 Kevin Stowe, Leonardo Ribeiro, Iryna Gurevych

This work describes the task of metaphoric paraphrase generation, in which we are given a literal sentence and are charged with generating a metaphoric paraphrase.

Paraphrase Generation

Latent Normalizing Flows for Many-to-Many Cross-Domain Mappings

1 code implementation ICLR 2020 Shweta Mahajan, Iryna Gurevych, Stefan Roth

Therefore, we propose a novel semi-supervised framework, which models shared information between domains and domain-specific information separately.

Image Captioning Image Generation

Modeling Global and Local Node Contexts for Text Generation from Knowledge Graphs

1 code implementation 29 Jan 2020 Leonardo F. R. Ribeiro, Yue Zhang, Claire Gardent, Iryna Gurevych

Recent graph-to-text models generate text from graph-based data using either global or local aggregation to learn node representations.

Graph-to-Sequence KG-to-Text Generation +1

Stance Detection Benchmark: How Robust Is Your Stance Detection?

1 code implementation 6 Jan 2020 Benjamin Schiller, Johannes Daxenberger, Iryna Gurevych

Stance Detection (StD) aims to detect an author's stance towards a certain topic or claim and has become a key component in applications like fake news detection, claim validation, and argument search.

Fake News Detection Multi-Task Learning +1

Analyzing Structures in the Semantic Vector Space: A Framework for Decomposing Word Embeddings

1 code implementation 17 Dec 2019 Andreas Hanselowski, Iryna Gurevych

Word embeddings are rich word representations, which in combination with deep neural networks, lead to large performance gains for many NLP tasks.

Word Embeddings

Two Birds with One Stone: Investigating Invertible Neural Networks for Inverse Problems in Morphology

no code implementations 11 Dec 2019 Gözde Gül Şahin, Iryna Gurevych

We show that they are able to recover the morphological input parameters, i.e., predicting the lemma (e.g., cat) or the morphological tags (e.g., Plural) when run in the reverse direction, without any significant performance drop in the forward direction, i.e., predicting the surface form (e.g., cats).


Scalable Bayesian Preference Learning for Crowds

1 code implementation 4 Dec 2019 Edwin Simpson, Iryna Gurevych

As previous solutions based on Gaussian processes do not scale to large numbers of users, items or pairwise labels, we propose a stochastic variational inference approach that limits computational and memory costs.

Gaussian Processes Variational Inference

When is ACL's Deadline? A Scientific Conversational Agent

no code implementations 23 Nov 2019 Mohsen Mesgar, Paul Youssef, Lin Li, Dominik Bierwirth, Yihao Li, Christian M. Meyer, Iryna Gurevych

Our conversational agent UKP-ATHENA assists NLP researchers in finding and exploring scientific literature, identifying relevant authors, planning or post-processing conference visits, and preparing paper submissions using a unified interface based on natural language inputs and responses.

Interactive Text Ranking with Bayesian Optimisation: A Case Study on Community QA and Summarisation

1 code implementation 22 Nov 2019 Edwin Simpson, Yang Gao, Iryna Gurevych

For many NLP applications, such as question answering and summarisation, the goal is to select the best solution from a large space of candidates to meet a particular user's needs.

Bayesian Optimisation Community Question Answering +1

Neural Duplicate Question Detection without Labeled Training Data

1 code implementation IJCNLP 2019 Andreas Rücklé, Nafise Sadat Moosavi, Iryna Gurevych

We show that our proposed approaches are more effective in many cases because they can utilize larger amounts of unlabeled data from cQA forums.

Answer Selection Community Question Answering +1

Revisiting the Binary Linearization Technique for Surface Realization

no code implementations WS 2019 Yevgeniy Puzikov, Claire Gardent, Ido Dagan, Iryna Gurevych

End-to-end neural approaches have achieved state-of-the-art performance in many natural language processing (NLP) tasks.

Decision Making

Improving Generalization by Incorporating Coverage in Natural Language Inference

no code implementations 19 Sep 2019 Nafise Sadat Moosavi, Prasetya Ajie Utama, Andreas Rücklé, Iryna Gurevych

Finally, we show that using the coverage information is not only beneficial for improving the performance across different datasets of the same task.

Natural Language Inference

Joint Wasserstein Autoencoders for Aligning Multimodal Embeddings

no code implementations 14 Sep 2019 Shweta Mahajan, Teresa Botschen, Iryna Gurevych, Stefan Roth

One of the key challenges in learning joint embeddings of multiple modalities, e.g., of images and text, is to ensure coherent cross-modal semantics that generalize across datasets.

Cross-Modal Retrieval Retrieval

What do Deep Networks Like to Read?

no code implementations 10 Sep 2019 Jonas Pfeiffer, Aishwarya Kamath, Iryna Gurevych, Sebastian Ruder

Recent research towards understanding neural networks probes models in a top-down manner, but is only able to identify model tendencies that are known a priori.

Better Rewards Yield Better Summaries: Learning to Summarise Without References

2 code implementations IJCNLP 2019 Florian Böhm, Yang Gao, Christian M. Meyer, Ori Shapira, Ido Dagan, Iryna Gurevych

Human evaluation experiments show that, compared to the state-of-the-art supervised-learning systems and ROUGE-as-rewards RL summarisation systems, the RL systems using our learned rewards during training generate summaries with higher human ratings.

Reinforcement Learning (RL)

Enhancing AMR-to-Text Generation with Dual Graph Representations

1 code implementation IJCNLP 2019 Leonardo F. R. Ribeiro, Claire Gardent, Iryna Gurevych

Generating text from graph-based data, such as Abstract Meaning Representation (AMR), is a challenging task due to the inherent difficulty in how to properly encode the structure of a graph with labeled edges.

AMR-to-Text Generation Data-to-Text Generation +1

FAMULUS: Interactive Annotation and Feedback Generation for Teaching Diagnostic Reasoning

no code implementations IJCNLP 2019 Jonas Pfeiffer, Christian M. Meyer, Claudia Schulz, Jan Kiesewetter, Jan Zottmann, Michael Sailer, Elisabeth Bauer, Frank Fischer, Martin R. Fischer, Iryna Gurevych

Our proposed system FAMULUS helps students learn to diagnose based on automatic feedback in virtual patient simulations, and it supports instructors in labeling training data.


Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks

53 code implementations IJCNLP 2019 Nils Reimers, Iryna Gurevych

However, it requires that both sentences are fed into the network, which causes a massive computational overhead: Finding the most similar pair in a collection of 10,000 sentences requires about 50 million inference computations (~65 hours) with BERT.

Linear-Probe Classification Semantic Similarity +4
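
The "about 50 million" figure quoted above follows from counting unordered sentence pairs, n(n-1)/2. The sketch below only reproduces that arithmetic; the function name `pair_count` is hypothetical, and the per-pair time is merely what the quoted 65 hours would imply if spread evenly over all pairs, not a measured value.

```python
import math


def pair_count(n: int) -> int:
    """Number of unordered pairs among n sentences, i.e. n * (n - 1) / 2."""
    return math.comb(n, 2)


n_sentences = 10_000
pairs = pair_count(n_sentences)

# Implied average cost per pair, assuming the quoted ~65 hours is spread
# evenly over all pairwise inference passes (an assumption, not a benchmark).
seconds_per_pair = 65 * 3600 / pairs

print(f"{pairs:,} pairs")
print(f"{seconds_per_pair * 1000:.2f} ms per pair (implied)")
```

Because the pair count grows quadratically, doubling the collection to 20,000 sentences roughly quadruples the number of cross-encoder passes.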

Dialogue Coherence Assessment Without Explicit Dialogue Act Labels

1 code implementation ACL 2020 Mohsen Mesgar, Sebastian Bücker, Iryna Gurevych

Recent dialogue coherence models use the coherence features designed for monologue texts, e.g., nominal entities, to represent utterances and then explicitly augment them with dialogue-relevant features, e.g., dialogue act labels.

Multi-Task Learning

Reward Learning for Efficient Reinforcement Learning in Extractive Document Summarisation

1 code implementation 30 Jul 2019 Yang Gao, Christian M. Meyer, Mohsen Mesgar, Iryna Gurevych

The predominant RL paradigm for summarisation learns a cross-input policy, which requires considerable time, data and parameter tuning due to the huge search spaces and the delayed rewards.

Decision Making Learning-To-Rank +2

Preference-based Interactive Multi-Document Summarisation

1 code implementation 7 Jun 2019 Yang Gao, Christian M. Meyer, Iryna Gurevych

Interactive NLP is a promising paradigm to close the gap between automatic NLP systems and the human upper bound.

Active Learning reinforcement-learning +1

Pitfalls in the Evaluation of Sentence Embeddings

no code implementations WS 2019 Steffen Eger, Andreas Rücklé, Iryna Gurevych

Our motivation is to challenge the current evaluation of sentence embeddings and to provide an easy-to-access reference for future research.

Sentence Embeddings

A Streamlined Method for Sourcing Discourse-level Argumentation Annotations from the Crowd

1 code implementation NAACL 2019 Tristan Miller, Maria Sukhareva, Iryna Gurevych

The study of argumentation and the development of argument mining tools depends on the availability of annotated data, which is challenging to obtain in sufficient quantity and quality.

Argument Mining

Fast Concept Mention Grouping for Concept Map-based Multi-Document Summarization

1 code implementation NAACL 2019 Tobias Falke, Iryna Gurevych

Concept map-based multi-document summarization has recently been proposed as a variant of the traditional summarization task with graph-structured summaries.

Document Summarization Multi-Document Summarization

Alternative Weighting Schemes for ELMo Embeddings

1 code implementation 5 Apr 2019 Nils Reimers, Iryna Gurevych

We evaluate different methods that combine the three vectors from the language model in order to achieve the best possible performance in downstream NLP tasks.

Language Modelling Word Embeddings

Does My Rebuttal Matter? Insights from a Major NLP Conference

1 code implementation NAACL 2019 Yang Gao, Steffen Eger, Ilia Kuznetsov, Iryna Gurevych, Yusuke Miyao

We then focus on the role of the rebuttal phase, and propose a novel task to predict after-rebuttal (i.e., final) scores from initial reviews and author responses.

Text Processing Like Humans Do: Visually Attacking and Shielding NLP Systems

1 code implementation NAACL 2019 Steffen Eger, Gözde Gül Şahin, Andreas Rücklé, Ji-Ung Lee, Claudia Schulz, Mohsen Mesgar, Krishnkant Swarnkar, Edwin Simpson, Iryna Gurevych

Visual modifications to text are often used to obfuscate offensive comments in social media (e.g., "!d10t") or as a writing style ("1337" in "leet speak"), among other scenarios.

Adversarial Attack

Predicting Research Trends From Arxiv

1 code implementation 7 Mar 2019 Steffen Eger, Chao Li, Florian Netzer, Iryna Gurevych

By extrapolation, we predict that these topics will remain lead problems/approaches in their fields in the short- and mid-term.

reinforcement-learning Reinforcement Learning (RL) +1

Is it Time to Swish? Comparing Deep Learning Activation Functions Across NLP tasks

1 code implementation EMNLP 2018 Steffen Eger, Paul Youssef, Iryna Gurevych

Activation functions play a crucial role in neural networks because they are the nonlinearities which have been attributed to the success story of deep learning.

Image Classification

Challenges in the Automatic Analysis of Students' Diagnostic Reasoning

1 code implementation 26 Nov 2018 Claudia Schulz, Christian M. Meyer, Michael Sailer, Jan Kiesewetter, Elisabeth Bauer, Frank Fischer, Martin R. Fischer, Iryna Gurevych

We aim to enable the large-scale adoption of diagnostic reasoning analysis and feedback by automating the epistemic activity identification.

A Bayesian Approach for Sequence Tagging with Crowds

1 code implementation IJCNLP 2019 Edwin Simpson, Iryna Gurevych

Current methods for sequence tagging, a core task in NLP, are data hungry, which motivates the use of crowdsourcing as a cheap way to obtain labelled data.

Active Learning Argument Mining +3

Frame- and Entity-Based Knowledge for Common-Sense Argumentative Reasoning

1 code implementation WS 2018 Teresa Botschen, Daniil Sorokin, Iryna Gurevych

Common-sense argumentative reasoning is a challenging task that requires holistic understanding of the argumentation where external knowledge about the world is hypothesized to play a key role.

Argument Mining Common Sense Reasoning +7

Corpus-Driven Thematic Hierarchy Induction

no code implementations CONLL 2018 Ilia Kuznetsov, Iryna Gurevych

Thematic role hierarchy is a widely used linguistic tool to describe interactions between semantic roles and their syntactic realizations.

Machine Translation Question Answering +1

UKP-Athene: Multi-Sentence Textual Entailment for Claim Verification

1 code implementation WS 2018 Andreas Hanselowski, Hao Zhang, Zile Li, Daniil Sorokin, Benjamin Schiller, Claudia Schulz, Iryna Gurevych

The Fact Extraction and VERification (FEVER) shared task was launched to support the development of systems able to verify claims by extracting supporting or refuting facts from raw text.

Claim Verification Entity Linking +3

From Text to Lexicon: Bridging the Gap between Word Embeddings and Lexical Resources

1 code implementation COLING 2018 Ilia Kuznetsov, Iryna Gurevych

We examine the effect of lemmatization and POS typing on word embedding performance in a novel resource-based evaluation scenario, as well as on standard similarity benchmarks.

Coreference Resolution Lemmatization +2

One Size Fits All? A simple LSTM for non-literal token and construction-level classification

no code implementations COLING 2018 Erik-Lân Do Dinh, Steffen Eger, Iryna Gurevych

In this paper, we tackle four different tasks of non-literal language classification: token and construction level metaphor detection, classification of idiomatic use of infinitive-verb compounds, and classification of non-literal particle verbs.

Classification General Classification +1

The INCEpTION Platform: Machine-Assisted and Knowledge-Oriented Interactive Annotation

1 code implementation COLING 2018 Jan-Christoph Klie, Michael Bugert, Beto Boullosa, Richard Eckart de Castilho, Iryna Gurevych

We introduce INCEpTION, a new annotation platform for tasks including interactive and semantic annotation (e.g., concept linking, fact linking, knowledge base population, semantic frame annotation).

Active Learning Entity Linking +2

A Retrospective Analysis of the Fake News Challenge Stance-Detection Task

1 code implementation COLING 2018 Andreas Hanselowski, Avinesh PVS, Benjamin Schiller, Felix Caspelherr, Debanjan Chaudhuri, Christian M. Meyer, Iryna Gurevych

To date, there is no in-depth analysis paper to critically discuss FNC-1's experimental setup, reproduce the results, and draw conclusions for next-generation stance classification methods.

General Classification Stance Classification +1