1 code implementation • ACL 2022 • Kevin Stowe, Prasetya Utama, Iryna Gurevych
Natural language inference (NLI) has been widely used as a task to train and evaluate models for language understanding.
1 code implementation • Findings (ACL) 2022 • Andreas Waldis, Tilman Beck, Iryna Gurevych
Identifying the relation between two sentences requires datasets with pairwise annotations.
no code implementations • NAACL (DaSH) 2021 • Jan-Christoph Klie, Richard Eckart de Castilho, Iryna Gurevych
Entity linking (EL) is concerned with disambiguating entity mentions in a text against knowledge bases (KB).
1 code implementation • COLING (CRAC) 2022 • Haixia Chai, Nafise Sadat Moosavi, Iryna Gurevych, Michael Strube
The results of our extrinsic evaluation show that while there is a significant difference between the performance of the rule-based system and the state-of-the-art neural model on coreference resolution datasets, we do not observe a considerable difference in their impact on downstream models.
1 code implementation • Findings (EMNLP) 2021 • Kexin Wang, Nils Reimers, Iryna Gurevych
Learning sentence embeddings often requires a large amount of labeled data.
1 code implementation • Findings (EMNLP) 2021 • Mohsen Mesgar, Leonardo F. R. Ribeiro, Iryna Gurevych
Entity grids and entity graphs are two frameworks for modeling local coherence.
1 code implementation • CoNLL (EMNLP) 2021 • Kevin Stowe, Nils Beck, Iryna Gurevych
Metaphor generation is a difficult task, and has seen tremendous improvement with the advent of deep pretrained models.
1 code implementation • AKBC 2021 • Michael Bugert, Iryna Gurevych
Cross-document event coreference resolution (CDCR) is the task of identifying which event mentions refer to the same events throughout a collection of documents.
1 code implementation • 17 Oct 2024 • Zhuohan Xie, Rui Xing, Yuxia Wang, Jiahui Geng, Hasan Iqbal, Dhruv Sahnan, Iryna Gurevych, Preslav Nakov
The typical approach to fact-checking these atomic claims involves retrieving a fixed number of pieces of evidence, followed by a verification step.
1 code implementation • 2 Oct 2024 • Qian Ruan, Ilia Kuznetsov, Iryna Gurevych
We instantiate this framework in edit intent classification (EIC), a challenging and underexplored classification task.
1 code implementation • 29 Sep 2024 • Aniket Pramanick, Yufang Hou, Saif M. Mohammad, Iryna Gurevych
Large Language Models (LLMs) have ushered in a transformative era in Natural Language Processing (NLP), reshaping research and extending NLP's influence to other fields of study.
1 code implementation • 29 Sep 2024 • Aniket Pramanick, Yufang Hou, Saif M. Mohammad, Iryna Gurevych
In this work, we quantitatively investigate what constitutes NLP by examining research papers.
no code implementations • 26 Sep 2024 • Andreas Waldis, Joel Birrer, Anne Lauscher, Iryna Gurevych
Gender-fair language, an evolving German linguistic variation, fosters inclusion by addressing all genders or using neutral forms.
1 code implementation • 19 Sep 2024 • Furkan Şahinuç, Thy Thy Tran, Yulia Grishina, Yufang Hou, Bei Chen, Iryna Gurevych
Building on this dataset, we propose three experimental settings that simulate real-world scenarios where TDM triples are fully defined, partially defined, or undefined during leaderboard construction.
3 code implementations • 9 Sep 2024 • Tuba Gokhan, Kexin Wang, Iryna Gurevych, Ted Briscoe
Regulatory documents, issued by governmental regulatory bodies, establish rules, guidelines, and standards that organizations must adhere to for legal compliance.
no code implementations • 9 Sep 2024 • Nils Dycke, Matej Zečević, Ilia Kuznetsov, Beatrix Suess, Kristian Kersting, Iryna Gurevych
To close this gap, we investigate diagnostic abductive reasoning (DAR) in the context of language-grounded tasks (NL-DAR).
1 code implementation • 25 Aug 2024 • Haau-Sing Li, Patrick Fernandes, Iryna Gurevych, André F. T. Martins
Recently, a diverse set of decoding and reranking procedures has been shown to be effective for LLM-based code generation.
1 code implementation • 23 Aug 2024 • Max Glockner, Yufang Hou, Preslav Nakov, Iryna Gurevych
Health-related misinformation claims often falsely cite a credible biomedical publication as evidence, which superficially appears to support the false claim.
1 code implementation • 19 Aug 2024 • Jonathan Tonglet, Marie-Francine Moens, Iryna Gurevych
By explaining what is actually true about the image, fact-checkers can better detect misinformation, focus their efforts on check-worthy visual content, engage in counter-messaging before misinformation spreads widely, and make their explanation more convincing.
no code implementations • 14 Aug 2024 • Subhabrata Dutta, Timo Kaufmann, Goran Glavaš, Ivan Habernal, Kristian Kersting, Frauke Kreuter, Mira Mezini, Iryna Gurevych, Eyke Hüllermeier, Hinrich Schuetze
While there is a widespread belief that artificial general intelligence (AGI) -- or even superhuman AI -- is imminent, complex problems in expert domains are far from being solved.
1 code implementation • 8 Aug 2024 • Mervat Abassy, Kareem Elozeiri, Alexander Aziz, Minh Ngoc Ta, Raj Vardhan Tomar, Bimarsha Adhikari, Saad El Dine Ahmed, Yuxia Wang, Osama Mohammed Afzal, Zhuohan Xie, Jonibek Mansurov, Ekaterina Artemova, Vladislav Mikhailov, Rui Xing, Jiahui Geng, Hasan Iqbal, Zain Muhammad Mujahid, Tarek Mahmoud, Akim Tsvigun, Alham Fikri Aji, Artem Shelmanov, Nizar Habash, Iryna Gurevych, Preslav Nakov
Category (iii) aims to detect attempts to obfuscate the fact that a text was machine-generated, while category (iv) looks for cases where the LLM was used to polish a human-written text, which is typically acceptable in academic writing, but not in education.
2 code implementations • 6 Aug 2024 • Hasan Iqbal, Yuxia Wang, Minghan Wang, Georgi Georgiev, Jiahui Geng, Iryna Gurevych, Preslav Nakov
The increased use of large language models (LLMs) across a variety of real-world applications calls for automatic tools to check the factual accuracy of their outputs, as LLMs often hallucinate.
1 code implementation • 31 Jul 2024 • Yufang Hou, Thy Thy Tran, Doan Nam Long Vu, Yiwen Cao, Kai Li, Lukas Rohde, Iryna Gurevych
This paper presents a shared task that we organized at the Foundations of Language Technology (FoLT) course in 2023/2024 at the Technical University of Darmstadt, which focuses on evaluating the output of Large Language Models (LLMs) in generating harmful answers to health-related clinical questions.
1 code implementation • 29 Jul 2024 • Neele Falk, Andreas Waldis, Iryna Gurevych
Argument retrieval is the task of finding relevant arguments for a given query.
1 code implementation • 20 Jul 2024 • Yongxin Huang, Kexin Wang, Goran Glavaš, Iryna Gurevych
Another limitation of multilingual sentence encoders is the trade-off between monolingual and cross-lingual performance.
no code implementations • 17 Jul 2024 • Fengyu Cai, Xinran Zhao, Hongming Zhang, Iryna Gurevych, Heinz Koeppl
Recent advances in measuring hardness-wise properties of data guide language models in sample selection within low-resource scenarios.
1 code implementation • 16 Jul 2024 • Rachneet Sachdeva, Yixiao Song, Mohit Iyyer, Iryna Gurevych
This work introduces HaluQuestQA, the first hallucination dataset with localized error annotations for human-written and model-generated LFQA answers.
1 code implementation • 16 Jul 2024 • Tianyu Yang, Xiaodan Zhu, Iryna Gurevych
Text anonymization is crucial for sharing sensitive data while maintaining privacy.
no code implementations • 16 Jul 2024 • Haishuo Fang, Xiaodan Zhu, Iryna Gurevych
A crucial requirement for deploying LLM-based agents in real-life applications is the robustness against risky or even irreversible mistakes.
1 code implementation • 15 Jul 2024 • Fengyu Cai, Xinran Zhao, Tong Chen, Sihao Chen, Hongming Zhang, Iryna Gurevych, Heinz Koeppl
Recent studies show the growing significance of document retrieval in the generation of LLMs, i.e., RAG, within the scientific domain by bridging their knowledge gap.
1 code implementation • 12 Jul 2024 • Nico Daheim, Jakub Macina, Manu Kapur, Iryna Gurevych, Mrinmaya Sachan
We show empirically that finding the mistake in a student solution is challenging for current models.
no code implementations • 11 Jul 2024 • Haoyu He, Markus Flicke, Jan Buchmann, Iryna Gurevych, Andreas Geiger
We address the technical challenge of implementing HDT's sample-dependent hierarchical attention pattern by developing a novel sparse attention kernel that considers the hierarchical structure of documents.
1 code implementation • 10 Jul 2024 • Jan Buchmann, Xiao Liu, Iryna Gurevych
This is crucially different from the long document setting, where retrieval is not needed, but could help.
1 code implementation • 4 Jul 2024 • Furkan Şahinuç, Ilia Kuznetsov, Yufang Hou, Iryna Gurevych
Large language models (LLMs) bring unprecedented flexibility in defining and executing complex, creative natural language generation (NLG) tasks.
1 code implementation • 4 Jul 2024 • Hovhannes Tamoyan, Hendrik Schuff, Iryna Gurevych
The development of chatbots requires collecting a large number of human-chatbot dialogues to reflect the breadth of users' sociodemographic backgrounds and conversational goals.
1 code implementation • 3 Jul 2024 • Haritz Puerto, Tilek Chubakov, Xiaodan Zhu, Harish Tayyar Madabushi, Iryna Gurevych
In fact, it has been found that instruction tuning on these intermediary reasoning steps improves model performance.
1 code implementation • 1 Jul 2024 • Leon Engländer, Hannah Sterz, Clifton Poth, Jonas Pfeiffer, Ilia Kuznetsov, Iryna Gurevych
While adapting NLP models to new languages within a single domain, or to new domains within a single language, is widely studied, research in joint adaptation is hampered by the lack of evaluation datasets.
1 code implementation • 11 Jun 2024 • Haishuo Fang, Xiaodan Zhu, Iryna Gurevych
Answering Questions over Knowledge Graphs (KGQA) is key to well-functioning autonomous language agents in various real-life applications.
1 code implementation • 7 Jun 2024 • Md Imbesat Hassan Rizvi, Xiaodan Zhu, Iryna Gurevych
In this work, we present a comprehensive study of the capability of current state-of-the-art large language models (LLMs) on spatial reasoning.
no code implementations • 6 Jun 2024 • Chen Cecilia Liu, Iryna Gurevych, Anna Korhonen
The surge of interest in culturally aware and adapted Natural Language Processing (NLP) has inspired much recent research.
1 code implementation • 5 Jun 2024 • Max Glockner, Yufang Hou, Preslav Nakov, Iryna Gurevych
Unlike previous fallacy detection datasets, Missci (i) focuses on implicit fallacies between the relevant content of the cited publication and the inaccurate claim, and (ii) requires models to verbalize the fallacious reasoning in addition to classifying it.
no code implementations • 31 May 2024 • Qian Ruan, Ilia Kuznetsov, Iryna Gurevych
Collaborative review and revision of textual documents is the core of knowledge work and a promising target for empirical analysis and NLP assistance.
1 code implementation • 10 May 2024 • Ilia Kuznetsov, Osama Mohammed Afzal, Koen Dercksen, Nils Dycke, Alexander Goldberg, Tom Hope, Dirk Hovy, Jonathan K. Kummerfeld, Anne Lauscher, Kevin Leyton-Brown, Sheng Lu, Mausam, Margot Mieskes, Aurélie Névéol, Danish Pruthi, Lizhen Qu, Roy Schwartz, Noah A. Smith, Thamar Solorio, Jingyan Wang, Xiaodan Zhu, Anna Rogers, Nihar B. Shah, Iryna Gurevych
We hope that our work will help set the agenda for research in machine-assisted scientific quality control in the age of AI, within the NLP community and beyond.
no code implementations • 29 Apr 2024 • Andreas Waldis, Yotam Perlitz, Leshem Choshen, Yufang Hou, Iryna Gurevych
We introduce Holmes, a new benchmark designed to assess language models' (LMs') linguistic competence - their unconscious understanding of linguistic phenomena.
1 code implementation • 22 Apr 2024 • Yuxia Wang, Jonibek Mansurov, Petar Ivanov, Jinyan Su, Artem Shelmanov, Akim Tsvigun, Osama Mohammed Afzal, Tarek Mahmoud, Giovanni Puccetti, Thomas Arnold, Chenxi Whitehouse, Alham Fikri Aji, Nizar Habash, Iryna Gurevych, Preslav Nakov
The task attracted a large number of participants: subtask A monolingual (126), subtask A multilingual (59), subtask B (70), and subtask C (30).
no code implementations • 19 Apr 2024 • Ahmed Elshabrawy, Yongxin Huang, Iryna Gurevych, Alham Fikri Aji
While Large Language Models (LLMs) exhibit remarkable capabilities in zero-shot and few-shot scenarios, they often require computationally prohibitive sizes.
1 code implementation • 12 Apr 2024 • Ji-Ung Lee, Marc E. Pfetsch, Iryna Gurevych
This work proposes a novel method to generate C-Tests, a deviated form of cloze tests (a gap-filling exercise) in which only the last part of a word is turned into a gap.
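The gap format described in this entry can be illustrated with a short sketch. It uses a fixed rule that is an assumption made here for illustration (gap every second alphabetic word and blank out its second half); it is not the generation method the paper proposes, and the names `make_gap` and `c_test` are hypothetical.

```python
def make_gap(word: str) -> str:
    """Keep the first half of the word (rounded up) and blank out the rest."""
    keep = (len(word) + 1) // 2
    return word[:keep] + "_" * (len(word) - keep)


def c_test(sentence: str, every: int = 2, min_len: int = 2) -> str:
    """Turn the last part of every `every`-th eligible word into a gap.

    Assumption: a word is eligible if it is purely alphabetic and has at
    least `min_len` characters. Tokens are split on whitespace only.
    """
    out = []
    eligible_seen = 0
    for token in sentence.split():
        if token.isalpha() and len(token) >= min_len:
            eligible_seen += 1
            if eligible_seen % every == 0:
                token = make_gap(token)
        out.append(token)
    return " ".join(out)


if __name__ == "__main__":
    print(c_test("the quick brown fox jumps"))
```

A fixed rule like this makes the exercise easy to produce but gives no control over its difficulty, which is the kind of limitation a dedicated generation method would need to address.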
no code implementations • 22 Mar 2024 • Chen Cecilia Liu, Iryna Gurevych
Prior research has found that differences in the early period of neural network training significantly impact the performance of in-distribution (ID) tasks.
1 code implementation • 6 Mar 2024 • Indraneil Paul, Goran Glavaš, Iryna Gurevych
In particular, most mainstream Code-LMs have been pre-trained on source code files alone.
no code implementations • 6 Mar 2024 • Jiahui Geng, Yova Kementchedjhieva, Preslav Nakov, Iryna Gurevych
To the best of our knowledge, we are the first to evaluate MLLMs for real-world fact-checking.
no code implementations • 5 Mar 2024 • Anmol Goel, Nico Daheim, Iryna Gurevych
In this work, we address this gap by augmenting open-source datasets for positive text rewriting with synthetically-generated Socratic rationales using a novel framework called SocraticReframe.
1 code implementation • 27 Feb 2024 • Yuesong Shen, Nico Daheim, Bai Cong, Peter Nickl, Gian Maria Marconi, Clement Bazan, Rio Yokota, Iryna Gurevych, Daniel Cremers, Mohammad Emtiyaz Khan, Thomas Möllenhoff
We give extensive empirical evidence against the common belief that variational learning is ineffective for large neural networks.
1 code implementation • 19 Feb 2024 • Justus-Jonas Erker, Florian Mai, Nils Reimers, Gerasimos Spanakis, Iryna Gurevych
Search-based dialog models typically re-encode the dialog history at every turn, incurring high cost.
1 code implementation • 17 Feb 2024 • Yuxia Wang, Jonibek Mansurov, Petar Ivanov, Jinyan Su, Artem Shelmanov, Akim Tsvigun, Osama Mohammed Afzal, Tarek Mahmoud, Giovanni Puccetti, Thomas Arnold, Alham Fikri Aji, Nizar Habash, Iryna Gurevych, Preslav Nakov
The advent of Large Language Models (LLMs) has brought an unprecedented surge in machine-generated text (MGT) across diverse channels.
no code implementations • 3 Feb 2024 • Fajri Koto, Tilman Beck, Zeerak Talat, Iryna Gurevych, Timothy Baldwin
Improving multilingual language models' capabilities in low-resource languages is generally difficult due to the scarcity of large-scale data in those languages.
1 code implementation • 2 Feb 2024 • Andreas Waldis, Yufang Hou, Iryna Gurevych
Pre-trained language models (LMs) perform well in In-Topic setups, where training and testing data come from the same topics.
no code implementations • 31 Jan 2024 • Jan Buchmann, Max Eichler, Jan-Micha Bodensohn, Ilia Kuznetsov, Iryna Gurevych
Long documents often exhibit structure with hierarchically organized elements of different functions, such as section headers and paragraphs.
1 code implementation • 18 Jan 2024 • Haritz Puerto, Martin Tutek, Somak Aditya, Xiaodan Zhu, Iryna Gurevych
Reasoning is a fundamental component of language understanding.
1 code implementation • 17 Jan 2024 • Dominic Petrak, Thy Thy Tran, Iryna Gurevych
Implicit user feedback, user emotions and demographic information have been shown to be promising sources for improving the accuracy and user engagement of responses generated by dialogue systems.
1 code implementation • 18 Nov 2023 • Clifton Poth, Hannah Sterz, Indraneil Paul, Sukannya Purkayastha, Leon Engländer, Timo Imhof, Ivan Vulić, Sebastian Ruder, Iryna Gurevych, Jonas Pfeiffer
We introduce Adapters, an open-source library that unifies parameter-efficient and modular transfer learning in large language models.
2 code implementations • 15 Nov 2023 • Yuxia Wang, Revanth Gangi Reddy, Zain Muhammad Mujahid, Arnav Arora, Aleksandr Rubashevskii, Jiahui Geng, Osama Mohammed Afzal, Liangming Pan, Nadav Borenstein, Aditya Pillai, Isabelle Augenstein, Iryna Gurevych, Preslav Nakov
The increased use of large language models (LLMs) across a variety of real-world applications calls for mechanisms to verify the factual accuracy of their outputs.
no code implementations • 14 Nov 2023 • Jiahui Geng, Fengyu Cai, Yuxia Wang, Heinz Koeppl, Preslav Nakov, Iryna Gurevych
Assessing their confidence and calibrating them across different tasks can help mitigate risks and enable LLMs to produce better generations.
1 code implementation • 13 Nov 2023 • Sheng Lu, Hendrik Schuff, Iryna Gurevych
In-context learning (ICL) has become one of the most popular learning paradigms.
1 code implementation • 11 Nov 2023 • Luke Bates, Peter Ebert Christensen, Preslav Nakov, Iryna Gurevych
Here, to aid understanding of memes, we release a knowledge base of memes and information found on www.knowyourmeme.com, which we call the Know Your Meme Knowledge Base (KYMKB), composed of more than 54,000 images.
1 code implementation • 7 Nov 2023 • Sukannya Purkayastha, Anne Lauscher, Iryna Gurevych
In this work, we are the first to explore Jiu-Jitsu argumentation for peer review by proposing the novel task of attitude and theme-guided rebuttal generation.
1 code implementation • 1 Nov 2023 • Yongxin Huang, Kexin Wang, Sourav Dutta, Raj Nath Patel, Goran Glavaš, Iryna Gurevych
As a solution, we propose AdaSent, which decouples SEPT from DAPT by training a SEPT adapter on the base PLM.
1 code implementation • 24 Oct 2023 • Dominic Petrak, Nafise Sadat Moosavi, Ye Tian, Nikolai Rozanov, Iryna Gurevych
Learning from free-text human feedback is essential for dialog systems, but annotated data is scarce and usually covers only a small fraction of error types known in conversational AI.
1 code implementation • 19 Oct 2023 • Nico Daheim, Thomas Möllenhoff, Edoardo Maria Ponti, Iryna Gurevych, Mohammad Emtiyaz Khan
Models trained on different datasets can be merged by weighted averaging of their parameters, but why does it work and when can it fail?
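As a minimal sketch of the merging operation this entry refers to, the snippet below averages parameters across models with per-model weights. It assumes each model is represented as a plain dictionary mapping a parameter name to a flat list of floats, and that all models share the same names and shapes; the function name `merge_parameters` is hypothetical and the code does not reproduce the paper's own analysis or method.

```python
from typing import Dict, List, Sequence

Params = Dict[str, List[float]]


def merge_parameters(models: Sequence[Params], weights: Sequence[float]) -> Params:
    """Return the weighted average of the models' parameters.

    Weights are normalized to sum to one, so [1, 1] and [0.5, 0.5]
    give the same result.
    """
    if not models or len(models) != len(weights):
        raise ValueError("need one weight per model and at least one model")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    norm = [w / total for w in weights]

    merged: Params = {}
    for name in models[0]:
        # Group the i-th entry of this parameter across all models.
        columns = zip(*(m[name] for m in models))
        merged[name] = [sum(w * v for w, v in zip(norm, col)) for col in columns]
    return merged


if __name__ == "__main__":
    model_a = {"w": [1.0, 2.0]}
    model_b = {"w": [3.0, 4.0]}
    print(merge_parameters([model_a, model_b], [3.0, 1.0]))
```

With real networks the same loop would run over tensors rather than lists; how to choose the weights is exactly the open question the entry raises.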
1 code implementation • 18 Oct 2023 • Sheng Lu, Shan Chen, Yingya Li, Danielle Bitterman, Guergana Savova, Iryna Gurevych
In-context learning (ICL) is a new learning paradigm that has gained popularity along with the development of large language models.
1 code implementation • 15 Sep 2023 • Chen Cecilia Liu, Fajri Koto, Timothy Baldwin, Iryna Gurevych
Large language models (LLMs) are highly adept at question answering and reasoning tasks, but when reasoning in a situational context, human expectations vary depending on the relevant cultural common ground.
1 code implementation • 15 Sep 2023 • Andreas Waldis, Yufang Hou, Iryna Gurevych
Our findings challenge the previously asserted general superiority of in-context learning (ICL) for OOD.
1 code implementation • 14 Sep 2023 • Rachneet Sachdeva, Martin Tutek, Iryna Gurevych
In recent years, large language models (LLMs) have shown remarkable capabilities at scale, particularly at generating text conditioned on a prompt.
1 code implementation • 13 Sep 2023 • Tilman Beck, Hendrik Schuff, Anne Lauscher, Iryna Gurevych
However, the available NLP literature disagrees on the efficacy of this technique - it remains unclear for which tasks and scenarios it can help, and the role of the individual factors in sociodemographic prompting is still unexplored.
1 code implementation • 4 Sep 2023 • Sheng Lu, Irina Bigoulaeva, Rachneet Sachdeva, Harish Tayyar Madabushi, Iryna Gurevych
Large language models, comprising billions of parameters and pre-trained on extensive web-scale corpora, have been claimed to acquire certain capabilities without having been specifically trained on them.
1 code implementation • 19 Jul 2023 • Nandan Thakur, Kexin Wang, Iryna Gurevych, Jimmy Lin
In this work, we provide SPRINT, a unified Python toolkit based on Pyserini and Lucene, supporting a common interface for evaluating neural sparse retrieval.
2 code implementations • 16 Jul 2023 • Jan-Christoph Klie, Richard Eckart de Castilho, Iryna Gurevych
A majority of the annotated publications apply good or excellent quality management.
no code implementations • 29 Jun 2023 • Ji-Ung Lee, Haritz Puerto, Betty van Aken, Yuki Arase, Jessica Zosa Forde, Leon Derczynski, Andreas Rücklé, Iryna Gurevych, Roy Schwartz, Emma Strubell, Jesse Dodge
Many recent improvements in NLP stem from the development and use of large pre-trained language models (PLMs) with billions of parameters.
1 code implementation • 31 May 2023 • Haishuo Fang, Haritz Puerto, Iryna Gurevych
To evaluate the effectiveness of UKP-SQuARE in teaching scenarios, we adopted it in a postgraduate NLP course and surveyed the students after the course.
1 code implementation • 24 May 2023 • Tianyu Yang, Thy Thy Tran, Iryna Gurevych
These models also suffer from posterior collapse, i.e., the decoder tends to ignore latent variables and directly access information captured in the encoder through the cross-attention mechanism.
2 code implementations • 24 May 2023 • Yuxia Wang, Jonibek Mansurov, Petar Ivanov, Jinyan Su, Artem Shelmanov, Akim Tsvigun, Chenxi Whitehouse, Osama Mohammed Afzal, Tarek Mahmoud, Toru Sasaki, Thomas Arnold, Alham Fikri Aji, Nizar Habash, Iryna Gurevych, Preslav Nakov
These results show that the problem is far from solved and that there is a lot of room for improvement.
3 code implementations • 23 May 2023 • Kexin Wang, Nils Reimers, Iryna Gurevych
This drives us to build a benchmark for this task including multiple datasets from heterogeneous domains.
1 code implementation • 23 May 2023 • Jakub Macina, Nico Daheim, Sankalan Pal Chowdhury, Tanmay Sinha, Manu Kapur, Iryna Gurevych, Mrinmaya Sachan
While automatic dialogue tutors hold great potential in making education personalized and more accessible, research on such systems has been hampered by a lack of sufficiently large and high-quality datasets.
no code implementations • 22 May 2023 • Aniket Pramanick, Yufang Hou, Saif M. Mohammad, Iryna Gurevych
In this study, we propose a systematic framework for analyzing the evolution of research topics in a scientific field using causal discovery and inference techniques.
no code implementations • 12 May 2023 • Georgia Chalvatzaki, Ali Younes, Daljeet Nandha, An Le, Leonardo F. R. Ribeiro, Iryna Gurevych
Long-horizon task planning is essential for the development of intelligent assistive and service robots.
1 code implementation • 25 Apr 2023 • Jan-Christoph Klie, Ji-Ung Lee, Kevin Stowe, Gözde Gül Şahin, Nafise Sadat Moosavi, Luke Bates, Dominic Petrak, Richard Eckart de Castilho, Iryna Gurevych
Citizen Science is an alternative to crowdsourcing that is relatively unexplored in the context of NLP.
no code implementations • 18 Apr 2023 • Sukannya Purkayastha, Sebastian Ruder, Jonas Pfeiffer, Iryna Gurevych, Ivan Vulić
In order to boost the capacity of mPLMs to deal with low-resource and unseen languages, we explore the potential of leveraging transliteration on a massive scale.
1 code implementation • 31 Mar 2023 • Haritz Puerto, Tim Baumgärtner, Rachneet Sachdeva, Haishuo Fang, Hao Zhang, Sewin Tariverdian, Kexin Wang, Iryna Gurevych
To ease research in multi-agent models, we extend UKP-SQuARE, an online platform for QA research, to support three families of multi-agent systems: i) agent selection, ii) early-fusion of agents, and iii) late-fusion of agents.
1 code implementation • 30 Mar 2023 • Nico Daheim, Nouha Dziri, Mrinmaya Sachan, Iryna Gurevych, Edoardo M. Ponti
We evaluate our method -- using different variants of Flan-T5 as a backbone language model -- on multiple datasets for information-seeking dialogue generation and compare our method with state-of-the-art techniques for faithfulness, such as CTRL, Quark, DExperts, and Noisy Channel reranking.
3 code implementations • 13 Mar 2023 • Ulf A. Hamster, Ji-Ung Lee, Alexander Geyken, Iryna Gurevych
Training and inference on edge devices often require an efficient setup due to computational limitations.
1 code implementation • 24 Feb 2023 • Dennis Zyska, Nils Dycke, Jan Buchmann, Ilia Kuznetsov, Iryna Gurevych
Recent years have seen impressive progress in AI-assisted writing, yet the developments in AI-assisted reading are lacking.
1 code implementation • 17 Feb 2023 • Luke Bates, Iryna Gurevych
Few-shot text classification systems have impressive capabilities but are infeasible to deploy and use reliably due to their dependence on prompting and billion-parameter language models.
1 code implementation • 24 Jan 2023 • Jakub Macina, Nico Daheim, Lingzhi Wang, Tanmay Sinha, Manu Kapur, Iryna Gurevych, Mrinmaya Sachan
Designing dialog tutors has been challenging as it involves modeling the diverse and complex pedagogical strategies employed by human tutors.
1 code implementation • 13 Jan 2023 • Chen Cecilia Liu, Jonas Pfeiffer, Ivan Vulić, Iryna Gurevych
Our experiments reveal that scheduled unfreezing induces different learning dynamics compared to standard fine-tuning, and provide evidence that the dynamics of Fisher Information during training correlate with cross-lingual generalization performance.
1 code implementation • 19 Dec 2022 • Haau-Sing Li, Mohsen Mesgar, André F. T. Martins, Iryna Gurevych
We hypothesize that the under-specification of a natural language description can be resolved by asking clarification questions.
1 code implementation • 19 Dec 2022 • Martin Funkquist, Ilia Kuznetsov, Yufang Hou, Iryna Gurevych
To address this challenge, we propose CiteBench: a benchmark for citation text generation that unifies multiple diverse datasets and enables standardized evaluation of citation text generation models across task designs and domains.
no code implementations • 22 Nov 2022 • Neha Warikoo, Tobias Mayer, Dana Atzil-Slonim, Amir Eliassaf, Shira Haimovitz, Iryna Gurevych
No study has examined EC between the subjective experience of emotions and emotion expression in therapy or whether this coherence is associated with clients' well-being.
no code implementations • 22 Nov 2022 • Tobias Mayer, Neha Warikoo, Oliver Grimm, Andreas Reif, Iryna Gurevych
While these conversations are part of the daily routine of clinicians, gathering them is usually hindered by various ethical (purpose of data usage), legal (data privacy) and technical (data formatting) limitations.
no code implementations • 14 Nov 2022 • Anxo Pérez, Neha Warikoo, Kexin Wang, Javier Parapar, Iryna Gurevych
Depressive disorders constitute a severe public health issue worldwide.
1 code implementation • 12 Nov 2022 • Nils Dycke, Ilia Kuznetsov, Iryna Gurevych
Peer review constitutes a core component of scholarly publishing; yet it demands substantial expertise and training, and is susceptible to errors and biases.
no code implementations • 10 Nov 2022 • Ilia Kuznetsov, Iryna Gurevych
Natural language processing (NLP) researchers develop models of grammar, meaning and communication based on written text.
1 code implementation • 3 Nov 2022 • Tilman Beck, Andreas Waldis, Iryna Gurevych
Stance detection deals with identifying an author's stance towards a target.
1 code implementation • 31 Oct 2022 • Irina Bigoulaeva, Rachneet Sachdeva, Harish Tayyar Madabushi, Aline Villavicencio, Iryna Gurevych
We compare sequential fine-tuning with a model for multi-task learning in the context where we are interested in boosting performance on two tasks, one of which depends on the other.
1 code implementation • 25 Oct 2022 • Max Glockner, Yufang Hou, Iryna Gurevych
In our analysis, we show that, by design, existing NLP task definitions for fact-checking cannot refute misinformation as professional fact-checkers do for the majority of claims.
1 code implementation • 19 Oct 2022 • Tim Baumgärtner, Leonardo F. R. Ribeiro, Nils Reimers, Iryna Gurevych
Pairing a lexical retriever with a neural re-ranking model has set state-of-the-art performance on large-scale information retrieval datasets.
no code implementations • 12 Oct 2022 • Gregor Geigle, Chen Cecilia Liu, Jonas Pfeiffer, Iryna Gurevych
While many VEs -- of different architectures, trained on different data and objectives -- are publicly available, they are not designed for the downstream V+L tasks.
no code implementations • 12 Oct 2022 • Mohsen Mesgar, Thy Thy Tran, Goran Glavaš, Iryna Gurevych
First, the unexplored combination of the cross-encoder architecture (with parameterized similarity scoring function) and episodic meta-learning consistently yields the best FSIC performance.
no code implementations • 31 Aug 2022 • Marcos Treviso, Ji-Ung Lee, Tianchu Ji, Betty van Aken, Qingqing Cao, Manuel R. Ciosici, Michael Hassid, Kenneth Heafield, Sara Hooker, Colin Raffel, Pedro H. Martins, André F. T. Martins, Jessica Zosa Forde, Peter Milder, Edwin Simpson, Noam Slonim, Jesse Dodge, Emma Strubell, Niranjan Balasubramanian, Leon Derczynski, Iryna Gurevych, Roy Schwartz
Recent work in natural language processing (NLP) has yielded appealing results from scaling model parameters and training data; however, using only scale to improve performance means that resource consumption also grows.
2 code implementations • 30 Aug 2022 • Haishuo Fang, Ji-Ung Lee, Nafise Sadat Moosavi, Iryna Gurevych
In contrast to conventional, predefined activation functions, RAFs can adaptively learn optimal activation functions during training according to input data.
1 code implementation • 19 Aug 2022 • Rachneet Sachdeva, Haritz Puerto, Tim Baumgärtner, Sewin Tariverdian, Hao Zhang, Kexin Wang, Hossain Shaikh Saadi, Leonardo F. R. Ribeiro, Iryna Gurevych
In this paper, we introduce SQuARE v2, the new version of SQuARE, to provide an explainability infrastructure for comparing models based on methods such as saliency maps and graph-based explanations.
1 code implementation • 16 Aug 2022 • Lorenz Stangier, Ji-Ung Lee, Yuxi Wang, Marvin Müller, Nicholas Frick, Joachim Metternich, Iryna Gurevych
We evaluate TexPrax in a user study with German factory employees who ask their colleagues for solutions to problems that arise during their daily work.
1 code implementation • 12 Aug 2022 • Ivan Habernal, Daniel Faber, Nicola Recchia, Sebastian Bretthauer, Iryna Gurevych, Indra Spiecker genannt Döhmann, Christoph Burchard
Identifying, classifying, and analyzing arguments in legal discourse has been a prominent area of research since the inception of the argument mining field.
1 code implementation • 5 Jun 2022 • Jan-Christoph Klie, Bonnie Webber, Iryna Gurevych
While researchers show that their approaches work well on their newly introduced datasets, they rarely compare their methods to previous work or on the same datasets.
1 code implementation • 23 May 2022 • Benjamin Schiller, Johannes Daxenberger, Andreas Waldis, Iryna Gurevych
The task of Argument Mining, that is, extracting and classifying argument components for a specific topic from large document sources, is inherently difficult for machine learning models and humans alike, as large Argument Mining datasets are rare and recognition of argument components requires expert knowledge.
2 code implementations • 13 May 2022 • Dominic Petrak, Nafise Sadat Moosavi, Iryna Gurevych
In this paper, we propose a new extended pretraining approach called Arithmetic-Based Pretraining that jointly addresses both in one extended pretraining step without requiring architectural changes or pretraining from scratch.
1 code implementation • NAACL 2022 • Prasetya Ajie Utama, Joshua Bambrick, Nafise Sadat Moosavi, Iryna Gurevych
In this work, we show that NLI models can be effective for this task when the training data is augmented with high-quality task-oriented examples.
1 code implementation • NAACL 2022 • Nafise Sadat Moosavi, Quentin Delfosse, Kristian Kersting, Iryna Gurevych
The resulting adapters (a) contain about 50% of the learning parameters of the standard adapter and are therefore more efficient at training and inference and require less storage space, and (b) achieve considerably higher performance in low-data settings.
1 code implementation • 22 Apr 2022 • Ilia Kuznetsov, Jan Buchmann, Max Eichler, Iryna Gurevych
While existing NLP studies focus on the analysis of individual texts, editorial assistance often requires modeling interactions between pairs of texts -- yet general frameworks and datasets to support this scenario are missing.
3 code implementations • NAACL 2022 • Leonardo F. R. Ribeiro, Mengwen Liu, Iryna Gurevych, Markus Dreyer, Mohit Bansal
Despite recent improvements in abstractive summarization, most current approaches generate summaries that are not factually consistent with the source document, severely restricting their trust and usage in real-world applications.
1 code implementation • ACL 2022 • Tim Baumgärtner, Kexin Wang, Rachneet Sachdeva, Max Eichler, Gregor Geigle, Clifton Poth, Hannah Sterz, Haritz Puerto, Leonardo F. R. Ribeiro, Jonas Pfeiffer, Nils Reimers, Gözde Gül Şahin, Iryna Gurevych
Recent advances in NLP and information retrieval have given rise to a diverse set of question answering tasks that are of different formats (e.g., extractive, abstractive), require different model architectures (e.g., generative, discriminative), and setups (e.g., with or without retrieval).
1 code implementation • 15 Feb 2022 • Chen Liu, Jonas Pfeiffer, Anna Korhonen, Ivan Vulić, Iryna Gurevych
2) We analyze cross-lingual VQA across different question types of varying complexity for different multilingual multimodal Transformers, and identify question types that are the most difficult to improve on.
2 code implementations • 14 Feb 2022 • Federico Ruggeri, Mohsen Mesgar, Iryna Gurevych
The applications of conversational agents for scientific disciplines (as expert domains) are understudied due to the lack of dialogue data to train such agents.
Ranked #1 on Fact Selection on ArgSciChat
1 code implementation • 27 Jan 2022 • Nils Dycke, Ilia Kuznetsov, Iryna Gurevych
The shift towards publicly available text sources has enabled language processing at unprecedented scale, yet leaves under-serviced the domains where public and openly licensed data is scarce.
no code implementations • 15 Jan 2022 • Irina Bigoulaeva, Viktor Hangya, Iryna Gurevych, Alexander Fraser
The goal of hate speech detection is to filter negative online content aiming at certain groups of people.
5 code implementations • NAACL 2022 • Kexin Wang, Nandan Thakur, Nils Reimers, Iryna Gurevych
This limits the usage of dense retrieval approaches to only a few domains with large training datasets.
Ranked #9 on Zero-shot Text Search on BEIR
1 code implementation • 3 Dec 2021 • Haritz Puerto, Gözde Gül Şahin, Iryna Gurevych
The recent explosion of question answering (QA) datasets and models has increased the interest in the generalization of models across multiple domains and formats by either training on multiple datasets or by combining multiple models.
no code implementations • 29 Nov 2021 • Iryna Gurevych, Michael Kohler, Gözde Gül Şahin
Pattern recognition based on a high-dimensional predictor is considered.
1 code implementation • Findings (ACL) 2022 • Jonas Pfeiffer, Gregor Geigle, Aishwarya Kamath, Jan-Martin O. Steitz, Stefan Roth, Ivan Vulić, Iryna Gurevych
In this work, we address this gap and provide xGQA, a new multilingual evaluation benchmark for the visual question answering task.
no code implementations • 9 Sep 2021 • Jan-Martin O. Steitz, Jonas Pfeiffer, Iryna Gurevych, Stefan Roth
Reasoning over multiple modalities, e.g., in Visual Question Answering (VQA), requires an alignment of semantic concepts across domains.
1 code implementation • EMNLP 2021 • Prasetya Ajie Utama, Nafise Sadat Moosavi, Victor Sanh, Iryna Gurevych
Recent prompt-based approaches allow pretrained language models to achieve strong performance on few-shot finetuning by reformulating downstream tasks as a language modeling problem.
1 code implementation • EMNLP 2021 • Leonardo F. R. Ribeiro, Jonas Pfeiffer, Yue Zhang, Iryna Gurevych
Recent work on multilingual AMR-to-text generation has exclusively focused on data augmentation strategies that utilize silver AMR.
no code implementations • 2 Sep 2021 • Nils Dycke, Edwin Simpson, Ilia Kuznetsov, Iryna Gurevych
Peer review is the primary means of quality control in academia; as an outcome of a peer review process, program and area chairs make acceptance decisions for each paper based on the review reports and scores they received.
1 code implementation • ACL 2022 • Tilman Beck, Bela Bohlender, Christina Viehmann, Vincent Hane, Yanik Adamson, Jaber Khuri, Jonas Brossmann, Jonas Pfeiffer, Iryna Gurevych
The open-access dissemination of pretrained language models through online repositories has led to a democratization of state-of-the-art natural language processing (NLP) research.
no code implementations • 1 Jul 2021 • Anne Lauscher, Henning Wachsmuth, Iryna Gurevych, Goran Glavaš
Despite extensive research efforts in recent years, computational argumentation (CA) remains one of the most challenging areas of natural language processing.
1 code implementation • CL (ACL) 2022 • Ji-Ung Lee, Jan-Christoph Klie, Iryna Gurevych
Annotation studies often require annotators to familiarize themselves with the task, its annotation scheme, and the data domain.
1 code implementation • ACL 2021 • Kevin Stowe, Tuhin Chakrabarty, Nanyun Peng, Smaranda Muresan, Iryna Gurevych
Guided by conceptual metaphor theory, we propose to control the generation process by encoding conceptual mappings between cognitive domains to generate meaningful metaphoric expressions.
1 code implementation • ACL 2021 • Tilman Beck, Ji-Ung Lee, Christina Viehmann, Marcus Maurer, Oliver Quiring, Iryna Gurevych
This work investigates the use of interactively updated label suggestions to improve upon the efficiency of gathering annotations on the task of opinion mining in German Covid-19 social media data.
1 code implementation • 17 Apr 2021 • Aniket Pramanick, Tilman Beck, Kevin Stowe, Iryna Gurevych
Language use changes over time, and this impacts the effectiveness of NLP systems.
3 code implementations • 17 Apr 2021 • Nandan Thakur, Nils Reimers, Andreas Rücklé, Abhishek Srivastava, Iryna Gurevych
To address this, and to help researchers broadly evaluate the effectiveness of their models, we introduce Benchmarking-IR (BEIR), a robust and heterogeneous evaluation benchmark for information retrieval.
Ranked #1 on Argument Retrieval on ArguAna (BEIR)
2 code implementations • 16 Apr 2021 • Nafise Sadat Moosavi, Andreas Rücklé, Dan Roth, Iryna Gurevych
In this paper, we introduce SciGen, a new challenge dataset for the task of reasoning-aware data-to-text generation consisting of tables from scientific articles and their corresponding descriptions.
1 code implementation • EMNLP 2021 • Clifton Poth, Jonas Pfeiffer, Andreas Rücklé, Iryna Gurevych
Our best methods achieve an average Regret@3 of less than 1% across all target tasks, demonstrating that we are able to efficiently identify the best datasets for intermediate training.
6 code implementations • 14 Apr 2021 • Kexin Wang, Nils Reimers, Iryna Gurevych
Learning sentence embeddings often requires a large amount of labeled data.
Ranked #1 on Re-Ranking on AskUbuntu
1 code implementation • 14 Apr 2021 • Gregor Geigle, Nils Reimers, Andreas Rücklé, Iryna Gurevych
We argue that there exists a wide range of specialized QA agents in the literature.
2 code implementations • 1 Apr 2021 • Max Glockner, Ieva Staliūnaitė, James Thorne, Gisela Vallejo, Andreas Vlachos, Iryna Gurevych
Automated fact-checking systems verify claims against evidence to predict their veracity.
1 code implementation • 22 Mar 2021 • Gregor Geigle, Jonas Pfeiffer, Nils Reimers, Ivan Vulić, Iryna Gurevych
Current state-of-the-art approaches to cross-modal retrieval process text and visual input jointly, relying on Transformer-based architectures with cross-attention mechanisms that attend over all words and objects in an image.
1 code implementation • EMNLP 2021 • Leonardo F. R. Ribeiro, Yue Zhang, Iryna Gurevych
Pretrained language models (PLM) have recently advanced graph-to-text generation, where the input graph is linearized into a sequence and fed into the PLM to obtain its representation.
Ranked #1 on Data-to-Text Generation on AMR3.0