no code implementations • Findings (EMNLP) 2021 • Alan Ansell, Edoardo Maria Ponti, Jonas Pfeiffer, Sebastian Ruder, Goran Glavaš, Ivan Vulić, Anna Korhonen
While offering (1) improved fine-tuning efficiency (by a factor of around 50 in our experiments), (2) a smaller parameter budget, and (3) increased language coverage, MAD-G remains competitive with more expensive methods for language-specific adapter training across the board.
no code implementations • WMT (EMNLP) 2021 • Gregor Geigle, Jonas Stadtmüller, Wei Zhao, Jonas Pfeiffer, Steffen Eger
This paper presents our submissions to the WMT2021 Shared Task on Quality Estimation, Task 1: Sentence-Level Direct Assessment.
1 code implementation • 30 May 2023 • Benjamin Minixhofer, Jonas Pfeiffer, Ivan Vulić
Many NLP pipelines split text into sentences as one of the crucial preprocessing steps.
1 code implementation • 23 May 2023 • Benjamin Minixhofer, Jonas Pfeiffer, Ivan Vulić
We first address the data gap by introducing a dataset of 255k compound and non-compound words across 56 diverse languages obtained from Wiktionary.
no code implementations • 23 May 2023 • Jonas Pfeiffer, Francesco Piccinno, Massimo Nicosia, Xinyi Wang, Machel Reid, Sebastian Ruder
Multilingual sequence-to-sequence models perform poorly with increased language coverage and fail to consistently generate text in the correct target language in few-shot settings.
no code implementations • 18 Apr 2023 • Sukannya Purkayastha, Sebastian Ruder, Jonas Pfeiffer, Iryna Gurevych, Ivan Vulić
In order to boost the capacity of mPLMs to deal with low-resource and unseen languages, we explore the potential of leveraging transliteration on a massive scale.
no code implementations • 22 Feb 2023 • Jonas Pfeiffer, Sebastian Ruder, Ivan Vulić, Edoardo Maria Ponti
Modular deep learning has emerged as a promising solution to these challenges.
no code implementations • 13 Jan 2023 • Chen Cecilia Liu, Jonas Pfeiffer, Ivan Vulić, Iryna Gurevych
Our in-depth experiments reveal that scheduled unfreezing induces different learning dynamics compared to standard fine-tuning, and provide evidence that the dynamics of Fisher Information during training correlate with cross-lingual generalization performance.
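A minimal sketch of what scheduled unfreezing can look like in PyTorch, assuming a simple fixed schedule that unfreezes one encoder layer per epoch from the top down; the model name, schedule, and helper are illustrative and not the paper's exact procedure:

```python
import torch
from transformers import AutoModelForSequenceClassification

# Illustrative setup: start with the encoder frozen, train only the classification head.
model = AutoModelForSequenceClassification.from_pretrained("bert-base-multilingual-cased")
for param in model.bert.parameters():
    param.requires_grad = False

layers = list(model.bert.encoder.layer)  # 12 Transformer layers for mBERT-base

def unfreeze_step(epoch: int) -> None:
    """Unfreeze the top `epoch` layers (a simple fixed top-down schedule)."""
    for layer in layers[max(0, len(layers) - epoch):]:
        for param in layer.parameters():
            param.requires_grad = True

# Inside the training loop:
# for epoch in range(1, num_epochs + 1):
#     unfreeze_step(epoch)
#     train_one_epoch(model, ...)
```

Standard fine-tuning updates all layers from the first step; a schedule like the one above changes which parameters receive gradients at which point in training, which is the kind of difference in learning dynamics the paper studies.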
no code implementations • 12 Oct 2022 • Gregor Geigle, Chen Cecilia Liu, Jonas Pfeiffer, Iryna Gurevych
While many vision encoders (VEs) with different architectures, trained on different data and objectives, are publicly available, they are not designed for downstream vision-and-language (V+L) tasks.
no code implementations • NAACL 2022 • Jonas Pfeiffer, Naman Goyal, Xi Victoria Lin, Xian Li, James Cross, Sebastian Riedel, Mikel Artetxe
Multilingual pre-trained models are known to suffer from the curse of multilinguality, which causes per-language performance to drop as they cover more languages.
1 code implementation • ACL 2022 • Tim Baumgärtner, Kexin Wang, Rachneet Sachdeva, Max Eichler, Gregor Geigle, Clifton Poth, Hannah Sterz, Haritz Puerto, Leonardo F. R. Ribeiro, Jonas Pfeiffer, Nils Reimers, Gözde Gül Şahin, Iryna Gurevych
Recent advances in NLP and information retrieval have given rise to a diverse set of question answering tasks that come in different formats (e.g., extractive, abstractive) and require different model architectures (e.g., generative, discriminative) and setups (e.g., with or without retrieval).
1 code implementation • 15 Feb 2022 • Chen Liu, Jonas Pfeiffer, Anna Korhonen, Ivan Vulić, Iryna Gurevych
We analyze cross-lingual VQA across different question types of varying complexity for different multilingual multimodal Transformers, and identify the question types that are most difficult to improve on.
2 code implementations • 27 Jan 2022 • Emanuele Bugliarello, Fangyu Liu, Jonas Pfeiffer, Siva Reddy, Desmond Elliott, Edoardo Maria Ponti, Ivan Vulić
Our benchmark enables the evaluation of multilingual multimodal models for transfer learning, not only in a zero-shot setting, but also in newly defined few-shot learning setups.
1 code implementation • Findings (ACL) 2022 • Jonas Pfeiffer, Gregor Geigle, Aishwarya Kamath, Jan-Martin O. Steitz, Stefan Roth, Ivan Vulić, Iryna Gurevych
In this work, we address this gap and provide xGQA, a new multilingual evaluation benchmark for the visual question answering task.
no code implementations • 9 Sep 2021 • Jan-Martin O. Steitz, Jonas Pfeiffer, Iryna Gurevych, Stefan Roth
Reasoning over multiple modalities, e.g., in Visual Question Answering (VQA), requires an alignment of semantic concepts across domains.
1 code implementation • EMNLP 2021 • Leonardo F. R. Ribeiro, Jonas Pfeiffer, Yue Zhang, Iryna Gurevych
Recent work on multilingual AMR-to-text generation has exclusively focused on data augmentation strategies that utilize silver AMR.
1 code implementation • ACL 2022 • Tilman Beck, Bela Bohlender, Christina Viehmann, Vincent Hane, Yanik Adamson, Jaber Khuri, Jonas Brossmann, Jonas Pfeiffer, Iryna Gurevych
The open-access dissemination of pretrained language models through online repositories has led to a democratization of state-of-the-art natural language processing (NLP) research.
1 code implementation • EMNLP 2021 • Clifton Poth, Jonas Pfeiffer, Andreas Rücklé, Iryna Gurevych
Our best methods achieve an average Regret@3 of less than 1% across all target tasks, demonstrating that we are able to efficiently identify the best datasets for intermediate training.
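For intuition, here is a small sketch of how a Regret@k-style metric can be computed, under the assumption that it measures the relative gap between the best achievable target-task score and the best score among the top-k candidates proposed by a selection method; the function, task names, and scores below are hypothetical:

```python
def regret_at_k(scores_by_candidate: dict, ranked_candidates: list, k: int = 3) -> float:
    """Relative gap (in %) between the best achievable target-task score and the
    best score among the top-k candidates returned by the selection method."""
    best_possible = max(scores_by_candidate.values())
    best_in_top_k = max(scores_by_candidate[c] for c in ranked_candidates[:k])
    return 100.0 * (best_possible - best_in_top_k) / best_possible

# Hypothetical example: three intermediate tasks ranked by a selection heuristic.
scores = {"mnli": 84.1, "squad": 83.7, "sst2": 81.9}
ranking = ["squad", "sst2", "mnli"]
print(regret_at_k(scores, ranking, k=2))  # cost of only trying the top-2 candidates
```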
1 code implementation • 22 Mar 2021 • Gregor Geigle, Jonas Pfeiffer, Nils Reimers, Ivan Vulić, Iryna Gurevych
Current state-of-the-art approaches to cross-modal retrieval process text and visual input jointly, relying on Transformer-based architectures with cross-attention mechanisms that attend over all words and objects in an image.
1 code implementation • ACL 2021 • Phillip Rust, Jonas Pfeiffer, Ivan Vulić, Sebastian Ruder, Iryna Gurevych
In this work, we provide a systematic and comprehensive empirical comparison of pretrained multilingual language models versus their monolingual counterparts with regard to their monolingual task performance.
2 code implementations • EMNLP 2021 • Jonas Pfeiffer, Ivan Vulić, Iryna Gurevych, Sebastian Ruder
The ultimate challenge is dealing with under-resourced languages not covered at all by the models and written in scripts unseen during pretraining.
1 code implementation • EMNLP 2021 • Andreas Rücklé, Gregor Geigle, Max Glockner, Tilman Beck, Jonas Pfeiffer, Nils Reimers, Iryna Gurevych
Massively pre-trained transformer models are computationally expensive to fine-tune, slow for inference, and have large storage requirements.
1 code implementation • EMNLP 2020 • Andreas Rücklé, Jonas Pfeiffer, Iryna Gurevych
We investigate model performance across nine benchmarks of answer selection and question similarity tasks, and show that all 140 models transfer surprisingly well, with the large majority substantially outperforming common IR baselines.
5 code implementations • EMNLP 2020 • Jonas Pfeiffer, Andreas Rücklé, Clifton Poth, Aishwarya Kamath, Ivan Vulić, Sebastian Ruder, Kyunghyun Cho, Iryna Gurevych
We propose AdapterHub, a framework that allows dynamic "stitching-in" of pre-trained adapters for different tasks and languages.
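As a rough illustration of the "stitching-in" workflow, the sketch below assumes the `adapters` package that grew out of AdapterHub; the adapter name and the exact method names are illustrative and may differ between library versions:

```python
# pip install adapters  (successor of the original adapter-transformers package)
import adapters
from transformers import AutoModel

model = AutoModel.from_pretrained("bert-base-uncased")
adapters.init(model)                  # make the backbone adapter-aware
model.add_adapter("my_task")          # insert a fresh bottleneck adapter
model.train_adapter("my_task")        # freeze the backbone, train only the adapter
model.set_active_adapters("my_task")  # activate the adapter for the forward pass
```

The point of the framework is that the backbone stays fixed, so adapters trained for different tasks and languages can be shared and swapped in without re-downloading or re-training the full model.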
3 code implementations • EACL 2021 • Jonas Pfeiffer, Aishwarya Kamath, Andreas Rücklé, Kyunghyun Cho, Iryna Gurevych
We show that by separating the two stages, i.e., knowledge extraction and knowledge composition, the classifier can effectively exploit the representations learned from multiple tasks in a non-destructive manner.
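To make the two-stage idea concrete, here is a toy composition layer in PyTorch that attends over the outputs of several independently trained (and then frozen) adapters; the dimensions and the attention form are illustrative simplifications, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class AttentionComposition(nn.Module):
    """Toy composition stage: mixes the outputs of several frozen adapters
    via a learned attention over them."""
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.query = nn.Linear(hidden_dim, hidden_dim)
        self.key = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, layer_input: torch.Tensor, adapter_outputs: torch.Tensor) -> torch.Tensor:
        # layer_input: (batch, hidden); adapter_outputs: (batch, num_adapters, hidden)
        q = self.query(layer_input).unsqueeze(1)           # (batch, 1, hidden)
        k = self.key(adapter_outputs)                      # (batch, num_adapters, hidden)
        attn = torch.softmax((q * k).sum(-1), dim=-1)      # (batch, num_adapters)
        return (attn.unsqueeze(-1) * adapter_outputs).sum(1)  # (batch, hidden)

# Knowledge extraction happens beforehand: each adapter is trained on its own task
# with the backbone frozen; only the composition layer is trained in the second stage.
fusion = AttentionComposition(hidden_dim=768)
mixed = fusion(torch.randn(4, 768), torch.randn(4, 3, 768))
```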
no code implementations • 1 May 2020 • Jonas Pfeiffer, Edwin Simpson, Iryna Gurevych
We compare different models for low resource multi-task sequence tagging that leverage dependencies between label sequences for different tasks.
3 code implementations • EMNLP 2020 • Jonas Pfeiffer, Ivan Vulić, Iryna Gurevych, Sebastian Ruder
The main goal behind state-of-the-art pre-trained multilingual models such as multilingual BERT and XLM-R is enabling and bootstrapping NLP applications in low-resource languages through zero-shot or few-shot cross-lingual transfer.
Ranked #5 on Cross-Lingual Transfer on XCOPA (using extra training data)
no code implementations • WS 2019 • Tariq Alhindi, Jonas Pfeiffer, Smaranda Muresan
This paper presents the CUNLP submission for the NLP4IF 2019 shared task on Fine-Grained Propaganda Detection.
no code implementations • 10 Sep 2019 • Jonas Pfeiffer, Aishwarya Kamath, Iryna Gurevych, Sebastian Ruder
Recent research towards understanding neural networks probes models in a top-down manner, but is only able to identify model tendencies that are known a priori.
no code implementations • IJCNLP 2019 • Jonas Pfeiffer, Christian M. Meyer, Claudia Schulz, Jan Kiesewetter, Jan Zottmann, Michael Sailer, Elisabeth Bauer, Frank Fischer, Martin R. Fischer, Iryna Gurevych
Our proposed system FAMULUS helps students learn to diagnose based on automatic feedback in virtual patient simulations, and it supports instructors in labeling training data.
no code implementations • WS 2019 • Aishwarya Kamath, Jonas Pfeiffer, Edoardo Maria Ponti, Goran Glavaš, Ivan Vulić
Semantic specialization methods fine-tune distributional word vectors using lexical knowledge from external resources (e.g., WordNet) to accentuate a particular relation between words.
no code implementations • WS 2018 • Jonas Pfeiffer, Samuel Broscheit, Rainer Gemulla, Mathias Göschl
In this study, we investigate learning-to-rank and query refinement approaches for information retrieval in the pharmacogenomic domain.