no code implementations • Findings (ACL) 2022 • Evgeniia Razumovskaia, Ivan Vulić, Anna Korhonen
Scaling dialogue systems to a multitude of domains, tasks and languages relies on costly and time-consuming data annotation for different domain-task-language configurations.
no code implementations • Findings (EMNLP) 2021 • Alan Ansell, Edoardo Maria Ponti, Jonas Pfeiffer, Sebastian Ruder, Goran Glavaš, Ivan Vulić, Anna Korhonen
While offering (1) improved fine-tuning efficiency (by a factor of around 50 in our experiments), (2) a smaller parameter budget, and (3) increased language coverage, MAD-G remains competitive with more expensive methods for language-specific adapter training across the board.
1 code implementation • NAACL 2022 • Marinela Parović, Goran Glavaš, Ivan Vulić, Anna Korhonen
Adapter modules enable modular and efficient zero-shot cross-lingual transfer, where current state-of-the-art adapter-based approaches learn specialized language adapters (LAs) for individual languages.
no code implementations • CL (ACL) 2020 • Ivan Vulić, Simon Baker, Edoardo Maria Ponti, Ulla Petti, Ira Leviant, Kelly Wing, Olga Majewska, Eden Bar, Matt Malone, Thierry Poibeau, Roi Reichart, Anna Korhonen
We introduce Multi-SimLex, a large-scale lexical resource and evaluation benchmark covering data sets for 12 typologically diverse languages, including major languages (e.g., Mandarin Chinese, Spanish, Russian) as well as less-resourced ones (e.g., Welsh, Kiswahili).
no code implementations • CL (ACL) 2021 • Olga Majewska, Diana McCarthy, Jasper J. F. van den Bosch, Nikolaus Kriegeskorte, Ivan Vulić, Anna Korhonen
We demonstrate how the resultant data set can be used for fine-grained analyses and evaluation of representation learning models on the intrinsic tasks of semantic clustering and semantic similarity.
no code implementations • ACL 2022 • Evgeniia Razumovskaia, Goran Glavaš, Olga Majewska, Edoardo Ponti, Ivan Vulić
In this tutorial, we will thus discuss and demonstrate the importance of (building) multilingual ToD systems, and then provide a systematic overview of current research gaps, challenges and initiatives related to multilingual ToD systems, with a particular focus on their connections to current research and challenges in multilingual and low-resource NLP.
no code implementations • 27 Nov 2024 • Darius Feher, Benjamin Minixhofer, Ivan Vulić
To address these issues, we propose retrofitting LMs with dynamic tokenization: a way to dynamically decide on token boundaries based on the input text.
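Illustratively, one can picture dynamic token boundaries as a BPE-style merge decided per input at inference time; a minimal sketch under that assumption (this is not the paper's exact retrofitting procedure, and the function name is hypothetical):

```python
# Illustrative only: greedily merge the most frequent adjacent subword pair
# in the current input, BPE-style but decided at inference time.
from collections import Counter

def dynamic_merge(tokens: list[str], max_merges: int = 10) -> list[str]:
    for _ in range(max_merges):
        pairs = Counter(zip(tokens, tokens[1:]))
        if not pairs:
            break
        (left, right), count = pairs.most_common(1)[0]
        if count < 2:  # no pair repeats within this input; stop merging
            break
        merged, i = [], 0
        while i < len(tokens):
            if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == (left, right):
                merged.append(left + right)
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return tokens

print(dynamic_merge(["un", "believ", "able", "un", "believ", "able"]))
# -> ["unbelievable", "unbelievable"]
```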
no code implementations • 3 Oct 2024 • Yinhong Liu, Zhijiang Guo, Tianya Liang, Ehsan Shareghi, Ivan Vulić, Nigel Collier
In this work, we focus on studying logical consistency of LLMs as a prerequisite for more reliable and trustworthy systems.
no code implementations • 26 Sep 2024 • Hannah Sterz, Jonas Pfeiffer, Ivan Vulić
Vision Language Models (VLMs) extend the remarkable capabilities of text-only large language models and vision-only models, and are able to learn from and process multi-modal vision-text input.
2 code implementations • 24 Jun 2024 • Markus Frohmann, Igor Sterner, Ivan Vulić, Benjamin Minixhofer, Markus Schedl
We introduce a new model, Segment any Text (SaT), to solve this problem.
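A minimal usage sketch, assuming the authors' released wtpsplit package (pip install wtpsplit) and the sat-3l checkpoint name:

```python
# Segment unpunctuated text into sentences with a SaT model from wtpsplit.
from wtpsplit import SaT

sat = SaT("sat-3l")  # small 3-layer checkpoint (illustrative choice)
sentences = sat.split("this is a test this is another test")
print(sentences)
```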
no code implementations • 18 Jun 2024 • Fabian David Schmidt, Philipp Borchert, Ivan Vulić, Goran Glavaš
MT encoders, however, lack the knowledge necessary for comprehensive NLU that LLMs obtain through language modeling training on immense corpora.
2 code implementations • 17 Jun 2024 • Han Zhou, Xingchen Wan, Yinhong Liu, Nigel Collier, Ivan Vulić, Anna Korhonen
Motivated by this phenomenon, we propose an automatic Zero-shot Evaluation-oriented Prompt Optimization framework, ZEPO, which aims to produce fairer preference decisions and improve the alignment of LLM evaluators with human judgments.
1 code implementation • 4 Jun 2024 • Chengzu Li, Caiqi Zhang, Han Zhou, Nigel Collier, Anna Korhonen, Ivan Vulić
In this work, we thus study their capability to understand and reason over spatial relations from the top view.
no code implementations • 17 May 2024 • Jiayun Pang, Ivan Vulić
This suggests that GPU-intensive and expensive pretraining on a large dataset of unlabelled molecules may be useful yet not essential to leverage the power of language models for chemistry.
1 code implementation • 13 May 2024 • Benjamin Minixhofer, Edoardo Maria Ponti, Ivan Vulić
Finally, we show that a ZeTT hypernetwork trained for a base (L)LM can also be applied to fine-tuned variants without extra training.
no code implementations • 3 May 2024 • Yaoyiran Li, Xiang Zhai, Moustafa Alzantot, Keyi Yu, Ivan Vulić, Anna Korhonen, Mohamed Hammad
Building upon the success of Large Language Models (LLMs) in a variety of tasks, researchers have recently explored using LLMs that are pretrained on vast corpora of text for sequential recommendation.
1 code implementation • 25 Mar 2024 • Yinhong Liu, Han Zhou, Zhijiang Guo, Ehsan Shareghi, Ivan Vulić, Anna Korhonen, Nigel Collier
Large Language Models (LLMs) have demonstrated promising capabilities as automatic evaluators in assessing the quality of generated natural language.
no code implementations • 4 Mar 2024 • Evgeniia Razumovskaia, Ivan Vulić, Anna Korhonen
Supervised fine-tuning (SFT), supervised instruction tuning (SIT) and in-context learning (ICL) are three alternative, de facto standard approaches to few-shot learning.
1 code implementation • 15 Feb 2024 • Yaoyiran Li, Anna Korhonen, Ivan Vulić
Recent work has shown that, while large language models (LLMs) demonstrate strong word translation or bilingual lexicon induction (BLI) capabilities in few-shot setups, they still cannot match the performance of 'traditional' mapping-based approaches in the unsupervised scenario where no seed translation pairs are available, especially for lower-resource languages.
no code implementations • 15 Feb 2024 • Yijiang River Dong, Hongzhou Lin, Mikhail Belkin, Ramon Huerta, Ivan Vulić
Mitigating the retention of sensitive or private information in large language models is essential for enhancing privacy and safety.
2 code implementations • 29 Jan 2024 • Alan Ansell, Ivan Vulić, Hannah Sterz, Anna Korhonen, Edoardo M. Ponti
We experiment with instruction-tuning of LLMs on standard dataset mixtures, finding that SpIEL is often superior to popular parameter-efficient fine-tuning methods like LoRA (low-rank adaptation) in terms of performance and comparable in terms of run time.
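For reference, a sketch of the LoRA baseline mentioned here, using the Hugging Face peft library; this is not SpIEL itself, and the model name and hyperparameters are illustrative:

```python
# Wrap a causal LM with LoRA so that only low-rank adapter matrices train.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                    target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, config)
model.print_trainable_parameters()  # only the A/B matrices are trainable
```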
1 code implementation • 5 Jan 2024 • Paweł Budzianowski, Taras Sereda, Tomasz Cichy, Ivan Vulić
However, certain applications, such as assistive conversational systems, require natural and conversational speech generation tools that also operate efficiently in real time.
2 code implementations • 4 Jan 2024 • Songbo Hu, Xiaobin Wang, Zhangdie Yuan, Anna Korhonen, Ivan Vulić
We present DIALIGHT, a toolkit for developing and evaluating multilingual Task-Oriented Dialogue (ToD) systems, which facilitates systematic evaluation of and comparison between ToD systems built by fine-tuning Pretrained Language Models (PLMs) and those utilising the zero-shot and in-context learning capabilities of Large Language Models (LLMs).
1 code implementation • 21 Dec 2023 • Chengzu Li, Han Zhou, Goran Glavaš, Anna Korhonen, Ivan Vulić
Following the standard supervised fine-tuning (SFT) paradigm, in-context learning (ICL) has become an efficient approach propelled by the recent advancements in large language models (LLMs), yielding promising performance across various tasks in few-shot data setups.
1 code implementation • 18 Nov 2023 • Clifton Poth, Hannah Sterz, Indraneil Paul, Sukannya Purkayastha, Leon Engländer, Timo Imhof, Ivan Vulić, Sebastian Ruder, Iryna Gurevych, Jonas Pfeiffer
We introduce Adapters, an open-source library that unifies parameter-efficient and modular transfer learning in large language models.
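A minimal sketch of the library's intended usage, assuming the released adapters package (pip install adapters); the model and adapter names are illustrative:

```python
# Add a bottleneck adapter and a task head, then train only the adapter.
from adapters import AutoAdapterModel

model = AutoAdapterModel.from_pretrained("roberta-base")
model.add_adapter("my_task", config="seq_bn")        # sequential bottleneck
model.add_classification_head("my_task", num_labels=2)
model.train_adapter("my_task")  # freezes the backbone weights
```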
no code implementations • 16 Nov 2023 • Evgeniia Razumovskaia, Ivan Vulić, Pavle Marković, Tomasz Cichy, Qian Zheng, Tsung-Hsien Wen, Paweł Budzianowski
Factuality is a crucial requirement in information seeking dialogue: the system should respond to the user's queries so that the responses are meaningful and aligned with the knowledge provided to the system.
1 code implementation • 16 Nov 2023 • Evgeniia Razumovskaia, Goran Glavaš, Anna Korhonen, Ivan Vulić
Task-oriented dialogue (ToD) systems help users execute well-defined tasks across a variety of domains (e.g., flight booking or food ordering), with their Natural Language Understanding (NLU) components being dedicated to the analysis of user utterances, predicting users' intents (Intent Detection, ID) and extracting values for informational slots (Value Extraction, VE).
no code implementations • 23 Oct 2023 • Anjali Kantharuban, Ivan Vulić, Anna Korhonen
Historically, researchers and consumers have noticed a decrease in quality when applying NLP tools to minority variants of languages (e.g., Puerto Rican Spanish or Swiss German), but studies exploring this have been limited to a select few languages.
1 code implementation • 21 Oct 2023 • Yaoyiran Li, Anna Korhonen, Ivan Vulić
Bilingual Lexicon Induction (BLI) is a core task in multilingual NLP that still, to a large extent, relies on calculating cross-lingual word representations.
1 code implementation • 19 Oct 2023 • Han Zhou, Xingchen Wan, Ivan Vulić, Anna Korhonen
Prompt-based learning has been an effective paradigm for large pretrained language models (LLMs), enabling few-shot or even zero-shot learning.
no code implementations • 19 Oct 2023 • Songbo Hu, Han Zhou, Moy Yuan, Milan Gritta, Guchun Zhang, Ignacio Iacobacci, Anna Korhonen, Ivan Vulić
Achieving robust language technologies that can perform well across the world's many languages is a central goal of multilingual NLP.
1 code implementation • 16 Oct 2023 • Fabian David Schmidt, Ivan Vulić, Goran Glavaš
Because of this, model selection based on source-language validation is unreliable: it picks model snapshots with suboptimal target-language performance.
1 code implementation • 26 Jul 2023 • Songbo Hu, Han Zhou, Mete Hergul, Milan Gritta, Guchun Zhang, Ignacio Iacobacci, Ivan Vulić, Anna Korhonen
Creating high-quality annotated data for task-oriented dialog (ToD) is notoriously difficult, and the challenges are amplified when the goal is to create equitable, culturally adapted, and large-scale ToD datasets for multiple languages.
no code implementations • 4 Jul 2023 • Guangzhi Sun, Chao Zhang, Ivan Vulić, Paweł Budzianowski, Philip C. Woodland
In this work, we propose a Knowledge-Aware Audio-Grounded generative slot-filling framework, termed KA2G, that focuses on few-shot and zero-shot slot filling for ToD with speech input.
no code implementations • 5 Jun 2023 • Marinela Parović, Alan Ansell, Ivan Vulić, Anna Korhonen
We address this mismatch by exposing the task adapter to the target language adapter during training, and empirically validate several variants of the idea: in the simplest form, we alternate between using the source and target language adapters during task adapter training, which can be generalized to cycling over any set of language adapters.
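A hedged pseudocode sketch of the alternation idea in its simplest form; set_active_language_adapter and the surrounding training objects are hypothetical stand-ins, not the released implementation:

```python
# While training the task adapter, alternate which language adapter (LA)
# is active on each batch of source-language task data.
import itertools

language_adapters = ["en", "sw"]            # source and target LAs (illustrative)
la_cycle = itertools.cycle(language_adapters)

for batch in train_loader:                  # train_loader: task data (stand-in)
    model.set_active_language_adapter(next(la_cycle))  # hypothetical API
    loss = model(**batch).loss              # gradients update only the
    loss.backward()                         # task adapter; LAs stay frozen
    optimizer.step()
    optimizer.zero_grad()
```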
1 code implementation • 2 Jun 2023 • Alan Ansell, Edoardo Maria Ponti, Anna Korhonen, Ivan Vulić
Specifically, we use a two-phase distillation approach, termed BiStil: (i) the first phase distils a general bilingual model from the MMT, while (ii) the second, task-specific phase sparsely fine-tunes the bilingual "student" model using a task-tuned variant of the original MMT as its "teacher".
2 code implementations • 30 May 2023 • Benjamin Minixhofer, Jonas Pfeiffer, Ivan Vulić
Many NLP pipelines split text into sentences as one of the crucial preprocessing steps.
no code implementations • 30 May 2023 • Yaoyiran Li, Ching-Yun Chang, Stephen Rawls, Ivan Vulić, Anna Korhonen
Research on text-to-image generation (TTI) still predominantly focuses on the English language due to the lack of annotated image-caption data in other languages; in the long run, this might widen inequitable access to TTI technology.
1 code implementation • 26 May 2023 • Fabian David Schmidt, Ivan Vulić, Goran Glavaš
The results indicate that averaging model checkpoints yields systematic and consistent performance gains across diverse target languages in all tasks.
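A minimal sketch of checkpoint averaging in PyTorch, assuming snapshots saved as state_dicts along a single fine-tuning run; the paths and the model variable are illustrative, and floating-point parameters are assumed:

```python
# Average parameter values across saved checkpoints, then load the result.
import torch

paths = ["ckpt_1.pt", "ckpt_2.pt", "ckpt_3.pt"]
states = [torch.load(p, map_location="cpu") for p in paths]
avg = {k: sum(s[k].float() for s in states) / len(states) for k in states[0]}
model.load_state_dict(avg)  # evaluate this averaged snapshot
```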
1 code implementation • 23 May 2023 • Benjamin Minixhofer, Jonas Pfeiffer, Ivan Vulić
We first address the data gap by introducing a dataset of 255k compound and non-compound words across 56 diverse languages obtained from Wiktionary.
no code implementations • 22 May 2023 • Evgeniia Razumovskaia, Ivan Vulić, Anna Korhonen
It is especially effective for the most challenging transfer-free few-shot setups, paving the way for quick and data-efficient bootstrapping of multilingual slot labelers for ToD.
no code implementations • 18 Apr 2023 • Vésteinn Snæbjarnarson, Annika Simonsen, Goran Glavaš, Ivan Vulić
Multilingual language models have pushed state-of-the-art in cross-lingual NLP transfer.
no code implementations • 18 Apr 2023 • Sukannya Purkayastha, Sebastian Ruder, Jonas Pfeiffer, Iryna Gurevych, Ivan Vulić
In order to boost the capacity of mPLMs to deal with low-resource and unseen languages, we explore the potential of leveraging transliteration on a massive scale.
no code implementations • 22 Feb 2023 • Jonas Pfeiffer, Sebastian Ruder, Ivan Vulić, Edoardo Maria Ponti
Modular deep learning has emerged as a promising solution to these challenges.
1 code implementation • 28 Jan 2023 • Han Zhou, Xingchen Wan, Ivan Vulić, Anna Korhonen
Large pretrained language models are widely used in downstream NLP tasks via task-specific fine-tuning, but such procedures can be costly.
1 code implementation • 13 Jan 2023 • Chen Cecilia Liu, Jonas Pfeiffer, Ivan Vulić, Iryna Gurevych
Our experiments reveal that scheduled unfreezing induces different learning dynamics compared to standard fine-tuning, and provide evidence that the dynamics of Fisher Information during training correlate with cross-lingual generalization performance.
no code implementations • 20 Dec 2022 • Nikita Moghe, Evgeniia Razumovskaia, Liane Guillou, Ivan Vulić, Anna Korhonen, Alexandra Birch
We use MULTI3NLU++ to benchmark state-of-the-art multilingual models for the NLU tasks of intent detection and slot labelling for TOD systems in the multilingual setting.
1 code implementation • 7 Nov 2022 • Songbo Hu, Ivan Vulić, Fangyu Liu, Anna Korhonen
At training, the high-scoring partition comprises all generated responses whose similarity to the gold response is higher than the similarity of the greedy response to the gold response.
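A hedged sketch of the stated partition rule; similarity is a stand-in for whichever response-similarity metric the paper uses:

```python
# Partition sampled responses: "high-scoring" iff closer to the gold
# response than the greedy response is.
def partition(sampled, greedy, gold, similarity):
    threshold = similarity(greedy, gold)
    high = [r for r in sampled if similarity(r, gold) > threshold]
    low = [r for r in sampled if similarity(r, gold) <= threshold]
    return high, low
```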
1 code implementation • 30 Oct 2022 • Yaoyiran Li, Fangyu Liu, Ivan Vulić, Anna Korhonen
This crucial step is done via 1) creating a word similarity dataset, comprising positive word pairs (i.e., true translations) and hard negative pairs induced from the original CLWE space, and then 2) fine-tuning an mPLM (e.g., mBERT or XLM-R) in a cross-encoder manner to predict the similarity scores.
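A hedged sketch of step 1), mining hard negatives from a CLWE space; the arrays, labels, and neighbourhood size are illustrative assumptions:

```python
# For each source word: one positive pair (gold translation) plus the
# k nearest non-gold target neighbours as hard negatives.
import numpy as np

def mine_pairs(src_vecs, tgt_vecs, gold, k=5):
    # src_vecs, tgt_vecs: L2-normalised CLWE matrices; gold: src idx -> tgt idx
    sims = src_vecs @ tgt_vecs.T
    pos, neg = [], []
    for s, t in gold.items():
        pos.append((s, t, 1.0))
        for n in np.argsort(-sims[s])[: k + 1]:
            if n != t:
                neg.append((s, int(n), 0.0))  # hard negative
    return pos, neg
```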
1 code implementation • 12 Oct 2022 • Zhangdie Yuan, Songbo Hu, Ivan Vulić, Anna Korhonen, Zaiqiao Meng
Acquiring factual knowledge with Pretrained Language Models (PLMs) has attracted increasing attention, showing promising performance in many knowledge-intensive tasks.
1 code implementation • Proceedings of the Conference on Empirical Methods in Natural Language Processing 2022 • Fabian David Schmidt, Ivan Vulić, Goran Glavaš
Large multilingual language models generally demonstrate impressive results in zero-shot cross-lingual transfer, yet often fail to successfully transfer to low-resource languages, even for token-level prediction tasks like named entity recognition (NER).
1 code implementation • Findings (ACL) 2022 • Sebastian Ruder, Ivan Vulić, Anders Søgaard
Most work targeting multilinguality, for example, considers only accuracy; most work on fairness or interpretability considers only English; and so on.
1 code implementation • NAACL 2022 • Chia-Chien Hung, Anne Lauscher, Ivan Vulić, Simone Paolo Ponzetto, Goran Glavaš
We then introduce a new framework for multilingual conversational specialization of pretrained language models (PrLMs) that aims to facilitate cross-lingual transfer for arbitrary downstream TOD tasks.
no code implementations • 30 Apr 2022 • Ivan Vulić, Goran Glavaš, Fangyu Liu, Nigel Collier, Edoardo Maria Ponti, Anna Korhonen
In this work, we probe SEs for the amount of cross-lingual lexical knowledge stored in their parameters, and compare them against the original multilingual LMs.
1 code implementation • Findings (NAACL) 2022 • Georgios P. Spithourakis, Ivan Vulić, Michał Lis, Iñigo Casanueva, Paweł Budzianowski
Knowledge-based authentication is crucial for task-oriented spoken dialogue systems that offer personalised and privacy-focused services.
1 code implementation • Findings (NAACL) 2022 • Iñigo Casanueva, Ivan Vulić, Georgios P. Spithourakis, Paweł Budzianowski
2) The ontology is divided into domain-specific and generic (i.e., domain-universal) intent modules that overlap across domains, promoting cross-domain reusability of annotated examples.
no code implementations • 5 Apr 2022 • Gabor Fuisz, Ivan Vulić, Samuel Gibbons, Inigo Casanueva, Paweł Budzianowski
In particular, we focus on modeling and studying slot labeling (SL), a crucial component of NLU for dialog, through the QA optics, aiming to improve both its performance and efficiency, and make it more effective and resilient to working with limited task data.
1 code implementation • COLING 2022 • Robert Litschko, Ivan Vulić, Goran Glavaš
Current approaches therefore commonly transfer rankers trained on English data to other languages and cross-lingual setups by means of multilingual encoders: they fine-tune all parameters of pretrained massively multilingual Transformers (MMTs, e.g., multilingual BERT) on English relevance judgments, and then deploy them in the target language(s).
1 code implementation • ACL 2022 • Yaoyiran Li, Fangyu Liu, Nigel Collier, Anna Korhonen, Ivan Vulić
At Stage C1, we propose to refine standard cross-lingual linear maps between static word embeddings (WEs) via a contrastive learning objective; we also show how to integrate it into the self-learning procedure for even more refined cross-lingual maps.
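A hedged sketch of an InfoNCE-style contrastive step for refining a linear map over seed translation pairs with in-batch negatives; this follows the description above, not necessarily the exact C1 objective:

```python
# One contrastive step: mapped source words should match their own
# translations more closely than any other target word in the batch.
import torch
import torch.nn.functional as F

def contrastive_step(W, X, Y, tau=0.05):
    # X, Y: (batch, dim) source/target embeddings of aligned translation pairs
    Z = F.normalize(X @ W, dim=-1)
    Y = F.normalize(Y, dim=-1)
    logits = Z @ Y.T / tau                  # (batch, batch) similarity matrix
    labels = torch.arange(len(X))           # i-th source matches i-th target
    return F.cross_entropy(logits, labels)
```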
1 code implementation • 15 Feb 2022 • Chen Liu, Jonas Pfeiffer, Anna Korhonen, Ivan Vulić, Iryna Gurevych
2) We analyze cross-lingual VQA across different question types of varying complexity for different multilingual multimodal Transformers, and identify question types that are the most difficult to improve on.
no code implementations • 31 Jan 2022 • Olga Majewska, Evgeniia Razumovskaia, Edoardo Maria Ponti, Ivan Vulić, Anna Korhonen
Through this process we annotate a new large-scale dataset for training and evaluation of multilingual and cross-lingual ToD systems.
3 code implementations • 27 Jan 2022 • Emanuele Bugliarello, Fangyu Liu, Jonas Pfeiffer, Siva Reddy, Desmond Elliott, Edoardo Maria Ponti, Ivan Vulić
Our benchmark enables the evaluation of multilingual multimodal models for transfer learning, not only in a zero-shot setting, but also in newly defined few-shot learning setups.
1 code implementation • 21 Dec 2021 • Robert Litschko, Ivan Vulić, Simone Paolo Ponzetto, Goran Glavaš
In this work we present a systematic empirical study focused on the suitability of the state-of-the-art multilingual encoders for cross-lingual document and sentence retrieval tasks across a number of diverse language pairs.
1 code implementation • ACL 2022 • Wenxuan Zhou, Fangyu Liu, Ivan Vulić, Nigel Collier, Muhao Chen
To achieve this, it is crucial to represent multilingual knowledge in a shared/unified space.
2 code implementations • ACL 2022 • Alan Ansell, Edoardo Maria Ponti, Anna Korhonen, Ivan Vulić
Both these masks can then be composed with the pretrained model.
no code implementations • EMNLP 2021 • Ivan Vulić, Pei-Hao Su, Sam Coope, Daniela Gerz, Paweł Budzianowski, Iñigo Casanueva, Nikola Mrkšić, Tsung-Hsien Wen
Transformer-based language models (LMs) pretrained on large text collections have been shown to store a wealth of semantic knowledge.
1 code implementation • CoNLL (EMNLP) 2021 • Qianchu Liu, Fangyu Liu, Nigel Collier, Anna Korhonen, Ivan Vulić
Recent work indicated that pretrained language models (PLMs) such as BERT and RoBERTa can be transformed into effective sentence and word encoders even via simple self-supervised techniques.
1 code implementation • Findings (ACL) 2022 • Jonas Pfeiffer, Gregor Geigle, Aishwarya Kamath, Jan-Martin O. Steitz, Stefan Roth, Ivan Vulić, Iryna Gurevych
In this work, we address this gap and provide xGQA, a new multilingual evaluation benchmark for the visual question answering task.
no code implementations • IJCNLP 2019 • Edoardo Maria Ponti, Ivan Vulić, Ryan Cotterell, Roi Reichart, Anna Korhonen
Motivated by this question, we aim at constructing an informative prior over neural weights, in order to adapt quickly to held-out languages in the task of character-level language modeling.
1 code implementation • 23 Jul 2021 • Edoardo Maria Ponti, Julia Kreutzer, Ivan Vulić, Siva Reddy
To remedy this, we propose a new technique that integrates both steps of the traditional pipeline (translation and classification) into a single model, by treating the intermediate translations as a latent random variable.
1 code implementation • ACL 2021 • Soumya Barikeri, Anne Lauscher, Ivan Vulić, Goran Glavaš
We use the evaluation framework to benchmark the widely used conversational DialoGPT model along with the adaptations of four debiasing methods.
1 code implementation • ACL 2021 • Fangyu Liu, Ivan Vulić, Anna Korhonen, Nigel Collier
To this end, we propose and evaluate a series of cross-lingual transfer methods for the XL-BEL task, and demonstrate that general-domain bitext helps propagate the available English knowledge to languages with little to no in-domain data.
no code implementations • 17 Apr 2021 • Evgeniia Razumovskaia, Goran Glavaš, Olga Majewska, Edoardo M. Ponti, Anna Korhonen, Ivan Vulić
We find that the most critical factor preventing the creation of truly multilingual ToD systems is the lack of datasets in most languages for both training and evaluation.
1 code implementation • EMNLP 2021 • Qianchu Liu, Edoardo M. Ponti, Diana McCarthy, Ivan Vulić, Anna Korhonen
In order to address these gaps, we present AM2iCo (Adversarial and Multilingual Meaning in Context), a wide-coverage cross-lingual and multilingual evaluation set; it aims to faithfully assess the ability of state-of-the-art (SotA) representation models to understand the identity of word meaning in cross-lingual contexts for 14 language pairs.
no code implementations • EMNLP 2021 • Daniela Gerz, Pei-Hao Su, Razvan Kusztos, Avishek Mondal, Michał Lis, Eshan Singhal, Nikola Mrkšić, Tsung-Hsien Wen, Ivan Vulić
We present a systematic study on multilingual and cross-lingual intent detection from spoken data.
1 code implementation • EMNLP 2021 • Fangyu Liu, Ivan Vulić, Anna Korhonen, Nigel Collier
In this work, we demonstrate that it is possible to turn MLMs into effective universal lexical and sentence encoders even without any additional data and without any supervision.
1 code implementation • 22 Mar 2021 • Gregor Geigle, Jonas Pfeiffer, Nils Reimers, Ivan Vulić, Iryna Gurevych
Current state-of-the-art approaches to cross-modal retrieval process text and visual input jointly, relying on Transformer-based architectures with cross-attention mechanisms that attend over all words and objects in an image.
1 code implementation • 21 Jan 2021 • Robert Litschko, Ivan Vulić, Simone Paolo Ponzetto, Goran Glavaš
Therefore, in this work we present a systematic empirical study focused on the suitability of the state-of-the-art multilingual encoders for cross-lingual document and sentence retrieval tasks across a large number of language pairs.
no code implementations • ACL 2021 • Olga Majewska, Ivan Vulić, Goran Glavaš, Edoardo M. Ponti, Anna Korhonen
We investigate whether injecting explicit information on verbs' semantic-syntactic behaviour improves the performance of LM-pretrained Transformers in event extraction tasks, downstream tasks for which accurate verb processing is paramount.
no code implementations • ACL 2021 • Mengjie Zhao, Yi Zhu, Ehsan Shareghi, Ivan Vulić, Roi Reichart, Anna Korhonen, Hinrich Schütze
Few-shot crosslingual transfer has been shown to outperform its zero-shot counterpart with pretrained encoders like multilingual BERT.
2 code implementations • EMNLP 2021 • Jonas Pfeiffer, Ivan Vulić, Iryna Gurevych, Sebastian Ruder
The ultimate challenge is dealing with under-resourced languages not covered at all by the models and written in scripts unseen during pretraining.
1 code implementation • ACL 2021 • Phillip Rust, Jonas Pfeiffer, Ivan Vulić, Sebastian Ruder, Iryna Gurevych
In this work, we provide a systematic and comprehensive empirical comparison of pretrained multilingual language models versus their monolingual counterparts with regard to their monolingual task performance.
no code implementations • 11 Dec 2020 • Marko Vidoni, Ivan Vulić, Goran Glavaš
Adapter modules, additional trainable parameters that enable efficient fine-tuning of pretrained transformers, have recently been used for language specialization of multilingual transformers, improving downstream zero-shot cross-lingual transfer.
1 code implementation • COLING 2020 • Yaoyiran Li, Edoardo M. Ponti, Ivan Vulić, Anna Korhonen
On the other hand, this also provides an extrinsic evaluation protocol to probe the properties of emergent languages ex vitro.
no code implementations • NAACL 2021 • Matthew Henderson, Ivan Vulić
We propose ConVEx (Conversational Value Extractor), an efficient pretraining and fine-tuning neural approach for slot-labeling dialog tasks.
no code implementations • EMNLP 2020 • Ivan Vulić, Edoardo Maria Ponti, Robert Litschko, Goran Glavaš, Anna Korhonen
The success of large pretrained language models (LMs) such as BERT and RoBERTa has sparked interest in probing their representations, in order to unveil what types of knowledge they implicitly capture.
3 code implementations • 15 Aug 2020 • Goran Glavaš, Ivan Vulić
Traditional NLP has long held (supervised) syntactic parsing necessary for successful higher-level semantic language understanding (LU).
8 code implementations • EMNLP 2020 • Jonas Pfeiffer, Andreas Rücklé, Clifton Poth, Aishwarya Kamath, Ivan Vulić, Sebastian Ruder, Kyunghyun Cho, Iryna Gurevych
We propose AdapterHub, a framework that allows dynamic "stitching-in" of pre-trained adapters for different tasks and languages.
1 code implementation • ACL 2020 • Sam Coope, Tyler Farghly, Daniela Gerz, Ivan Vulić, Matthew Henderson
We introduce Span-ConveRT, a light-weight model for dialog slot-filling which frames the task as a turn-based span extraction task.
1 code implementation • ACL 2020 • Daniela Gerz, Ivan Vulić, Marek Rei, Roi Reichart, Anna Korhonen
We present a neural framework for learning associations between interrelated groups of words such as the ones found in Subject-Verb-Object (SVO) structures.
1 code implementation • EMNLP 2020 • Edoardo Maria Ponti, Goran Glavaš, Olga Majewska, Qianchu Liu, Ivan Vulić, Anna Korhonen
In order to simulate human language capacity, natural language processing systems must be able to reason about the dynamics of everyday situations, including their possible causes and effects.
no code implementations • 1 May 2020 • Anne Lauscher, Vinit Ravishankar, Ivan Vulić, Goran Glavaš
Massively multilingual transformers pretrained with language modeling objectives (e.g., mBERT, XLM-R) have become a de facto default transfer paradigm for zero-shot cross-lingual transfer in NLP, offering unmatched transfer performance.
3 code implementations • EMNLP 2020 • Jonas Pfeiffer, Ivan Vulić, Iryna Gurevych, Sebastian Ruder
The main goal behind state-of-the-art pre-trained multilingual models such as multilingual BERT and XLM-R is enabling and bootstrapping NLP applications in low-resource languages through zero-shot or few-shot cross-lingual transfer.
no code implementations • COLING 2020 • Robert Litschko, Ivan Vulić, Željko Agić, Goran Glavaš
Current methods of cross-lingual parser transfer focus on predicting the best parser for a low-resource target language globally, that is, "at treebank level".
1 code implementation • EMNLP 2020 • Ivan Vulić, Sebastian Ruder, Anders Søgaard
Existing algorithms for aligning cross-lingual word vector spaces assume that vector spaces are approximately isomorphic.
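For context, the standard orthogonal Procrustes alignment that embodies this isomorphism assumption; the seed matrices below are random placeholders for dictionary-aligned embeddings:

```python
# Orthogonal Procrustes: find W minimising ||XW - Y||_F with W orthogonal.
import numpy as np

def procrustes(X, Y):
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt  # X @ W approximately maps source vectors onto Y

rng = np.random.default_rng(0)
X_seed = rng.normal(size=(1000, 300))  # source seed embeddings (placeholder)
Y_seed = rng.normal(size=(1000, 300))  # target seed embeddings (placeholder)
W = procrustes(X_seed, Y_seed)
```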
5 code implementations • WS 2020 • Iñigo Casanueva, Tadas Temčinas, Daniela Gerz, Matthew Henderson, Ivan Vulić
Building conversational systems in new domains and with added functionality requires resource-efficient models that work under low-data regimes (i.e., in few-shot setups).
no code implementations • 10 Mar 2020 • Ivan Vulić, Simon Baker, Edoardo Maria Ponti, Ulla Petti, Ira Leviant, Kelly Wing, Olga Majewska, Eden Bar, Matt Malone, Thierry Poibeau, Roi Reichart, Anna Korhonen
We introduce Multi-SimLex, a large-scale lexical resource and evaluation benchmark covering datasets for 12 typologically diverse languages, including major languages (e.g., Mandarin Chinese, Spanish, Russian) as well as less-resourced ones (e.g., Welsh, Kiswahili).
1 code implementation • 30 Jan 2020 • Edoardo M. Ponti, Ivan Vulić, Ryan Cotterell, Marinela Parovic, Roi Reichart, Anna Korhonen
In this work, we propose a Bayesian generative model for the space of neural parameters.
no code implementations • EMNLP 2020 • Haim Dubossarsky, Ivan Vulić, Roi Reichart, Anna Korhonen
Performance in cross-lingual NLP tasks is impacted by the (dis)similarity of languages at hand: e.g., previous work has suggested there is a connection between the expected success of bilingual lexicon induction (BLI) and the assumption of (approximate) isomorphism between monolingual embedding spaces.
5 code implementations • Findings of the Association for Computational Linguistics 2020 • Matthew Henderson, Iñigo Casanueva, Nikola Mrkšić, Pei-Hao Su, Tsung-Hsien Wen, Ivan Vulić
General-purpose pretrained sentence encoders such as BERT are not ideal for real-world conversational AI applications; they are computationally heavy, slow, and expensive to train.
no code implementations • CONLL 2019 • Yi Zhu, Benjamin Heinzerling, Ivan Vulić, Michael Strube, Roi Reichart, Anna Korhonen
Recent work has validated the importance of subword information for word representation learning.
4 code implementations • 13 Sep 2019 • Anne Lauscher, Goran Glavaš, Simone Paolo Ponzetto, Ivan Vulić
Moreover, we successfully transfer debiasing models, by means of cross-lingual embedding spaces, and remove or attenuate biases in distributional word vector spaces of languages that lack readily available bias specifications.
1 code implementation • COLING 2020 • Anne Lauscher, Ivan Vulić, Edoardo Maria Ponti, Anna Korhonen, Goran Glavaš
In this work, we complement such distributional knowledge with external lexical knowledge, that is, we integrate the discrete knowledge on word-level semantic similarity into pretraining.
1 code implementation • IJCNLP 2019 • Ivan Vulić, Goran Glavaš, Roi Reichart, Anna Korhonen
A series of bilingual lexicon induction (BLI) experiments with 15 diverse languages (210 language pairs) show that fully unsupervised CLWE methods still fail for a large number of language pairs (e.g., they yield zero BLI performance for 87/210 pairs).
no code implementations • IJCNLP 2019 • Matthew Henderson, Ivan Vulić, Iñigo Casanueva, Paweł Budzianowski, Daniela Gerz, Sam Coope, Georgios Spithourakis, Tsung-Hsien Wen, Nikola Mrkšić, Pei-Hao Su
We present PolyResponse, a conversational search engine that supports task-oriented dialogue.
1 code implementation • 12 Jul 2019 • Paweł Budzianowski, Ivan Vulić
Data scarcity is a long-standing and crucial challenge that hinders quick development of task-oriented dialogue systems across multiple domains: task-oriented dialogue models are expected to learn grammar, syntax, dialogue reasoning, decision making, and language generation from absurdly small amounts of task-specific data.
1 code implementation • ACL 2019 • Matthew Henderson, Ivan Vulić, Daniela Gerz, Iñigo Casanueva, Paweł Budzianowski, Sam Coope, Georgios Spithourakis, Tsung-Hsien Wen, Nikola Mrkšić, Pei-Hao Su
Despite their popularity in the chatbot literature, retrieval-based models have had modest impact on task-oriented dialogue systems, with the main obstacle to their application being the low-data regime of most task-oriented dialogue tasks.
1 code implementation • NAACL 2019 • Yi Zhu, Ivan Vulić, Anna Korhonen
The use of subword-level information (e.g., characters, character n-grams, morphemes) has become ubiquitous in modern word representation learning.
3 code implementations • WS 2019 • Matthew Henderson, Paweł Budzianowski, Iñigo Casanueva, Sam Coope, Daniela Gerz, Girish Kumar, Nikola Mrkšić, Georgios Spithourakis, Pei-Hao Su, Ivan Vulić, Tsung-Hsien Wen
Progress in Machine Learning is often driven by the availability of large datasets, and consistent evaluation metrics for comparing modeling approaches.
1 code implementation • EMNLP 2018 • Edoardo Maria Ponti, Ivan Vulić, Goran Glavaš, Nikola Mrkšić, Anna Korhonen
Our adversarial post-specialization method propagates the external lexical knowledge to the full distributional space.
no code implementations • CL 2019 • Edoardo Maria Ponti, Helen O'Horan, Yevgeni Berzak, Ivan Vulić, Roi Reichart, Thierry Poibeau, Ekaterina Shutova, Anna Korhonen
Linguistic typology aims to capture structural and semantic variation across the world's languages.
1 code implementation • 29 May 2018 • Nikola Mrkšić, Ivan Vulić
This paper proposes an improvement to the existing data-driven Neural Belief Tracking (NBT) framework for Dialogue State Tracking (DST).
1 code implementation • ACL 2018 • Marek Rei, Daniela Gerz, Ivan Vulić
Experiments show excellent performance on scoring graded lexical entailment, raising the state-of-the-art on the HyperLex dataset by approximately 25%.
no code implementations • ACL 2018 • Anders Søgaard, Sebastian Ruder, Ivan Vulić
Unsupervised machine translation, i.e., not assuming any cross-lingual supervision signal (whether a dictionary, translations, or comparable corpora), seems impossible, but nevertheless, Lample et al. (2018) recently proposed a fully unsupervised machine translation (MT) model.
1 code implementation • NAACL 2018 • Ivan Vulić, Goran Glavaš, Nikola Mrkšić, Anna Korhonen
Word vector specialisation (also known as retrofitting) is a portable, light-weight approach to fine-tuning arbitrary distributional word vector spaces by injecting external knowledge from rich lexical resources such as WordNet.
1 code implementation • 2 May 2018 • Robert Litschko, Goran Glavaš, Simone Paolo Ponzetto, Ivan Vulić
We propose a fully unsupervised framework for ad-hoc cross-lingual information retrieval (CLIR) which requires no bilingual data at all.
1 code implementation • 17 Oct 2017 • Ivan Vulić, Nikola Mrkšić
We present LEAR (Lexical Entailment Attract-Repel), a novel post-processing method that transforms any input word vector space to emphasise the asymmetric relation of lexical entailment (LE), also known as the IS-A or hyponymy-hypernymy relation.
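A hedged sketch of the kind of norm-based asymmetric scoring a LEAR-style space enables, where more general concepts end up with larger norms; consult the paper for the exact formulation:

```python
# Asymmetric LE score: cosine distance (relatedness) plus a norm term
# (direction of the entailment); lower means x is more likely a hyponym of y.
import numpy as np

def le_score(x, y):
    cos_dist = 1.0 - (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))
    norm_term = (np.linalg.norm(x) - np.linalg.norm(y)) / (
        np.linalg.norm(x) + np.linalg.norm(y))
    return cos_dist + norm_term
```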
no code implementations • EMNLP 2017 • Ivan Vulić, Nikola Mrkšić, Anna Korhonen
Existing approaches to automatic VerbNet-style verb classification are heavily dependent on feature engineering and therefore limited to languages with mature NLP pipelines.
no code implementations • 15 Jun 2017 • Sebastian Ruder, Ivan Vulić, Anders Søgaard
Cross-lingual representations of words enable us to reason about word meaning in multilingual contexts and are a key facilitator of cross-lingual transfer when developing natural language processing models for low-resource languages.
no code implementations • ACL 2017 • Ivan Vulić, Nikola Mrkšić, Roi Reichart, Diarmuid Ó Séaghdha, Steve Young, Anna Korhonen
Morphologically rich languages accentuate two properties of distributional vector space models: 1) the difficulty of inducing accurate representations for low-frequency word forms; and 2) insensitivity to distinct lexical relations that have similar distributional signatures.
2 code implementations • 1 Jun 2017 • Nikola Mrkšić, Ivan Vulić, Diarmuid Ó Séaghdha, Ira Leviant, Roi Reichart, Milica Gašić, Anna Korhonen, Steve Young
We present Attract-Repel, an algorithm for improving the semantic quality of word vectors by injecting constraints extracted from lexical resources.
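A hedged sketch of margin-based attract and repel terms over constraint pairs; the full Attract-Repel objective also includes negative sampling from the batch and a regulariser toward the initial vectors, so this is illustrative only:

```python
# Attract pulls synonym pairs closer than their sampled negatives;
# repel pushes antonym pairs further apart than their sampled negatives.
import torch.nn.functional as F

def attract(x_l, x_r, neg_l, neg_r, margin=0.6):
    sim = F.cosine_similarity(x_l, x_r, dim=-1)
    return (F.relu(margin + F.cosine_similarity(x_l, neg_l, dim=-1) - sim)
            + F.relu(margin + F.cosine_similarity(x_r, neg_r, dim=-1) - sim)).mean()

def repel(x_l, x_r, neg_l, neg_r, margin=0.0):
    sim = F.cosine_similarity(x_l, x_r, dim=-1)
    return (F.relu(margin + sim - F.cosine_similarity(x_l, neg_l, dim=-1))
            + F.relu(margin + sim - F.cosine_similarity(x_r, neg_r, dim=-1))).mean()
```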
no code implementations • SEMEVAL 2017 • Edoardo Maria Ponti, Ivan Vulić, Anna Korhonen
Distributed representations of sentences have been developed recently to represent their meaning as real-valued vectors.
no code implementations • COLING 2016 • Helen O'Horan, Yevgeni Berzak, Ivan Vulić, Roi Reichart, Anna Korhonen
In recent years linguistic typology, which classifies the world's languages according to their functional and structural properties, has been widely used to support multilingual NLP.
no code implementations • CONLL 2017 • Ivan Vulić, Roy Schwartz, Ari Rappoport, Roi Reichart, Anna Korhonen
With our selected context configurations, we train on only 14% (A), 26.2% (V), and 33.6% (N) of all dependency-based contexts, resulting in a reduced training time.
no code implementations • CL 2017 • Ivan Vulić, Daniela Gerz, Douwe Kiela, Felix Hill, Anna Korhonen
We introduce HyperLex, a dataset and evaluation resource that quantifies the extent of semantic category membership, that is, the type-of relation (also known as the hyponymy-hypernymy or lexical entailment (LE) relation) between 2,616 concept pairs.
1 code implementation • EMNLP 2016 • Daniela Gerz, Ivan Vulić, Felix Hill, Roi Reichart, Anna Korhonen
Verbs play a critical role in the meaning of sentences, but these ubiquitous words have received little attention in recent distributional semantics research.
no code implementations • 24 Sep 2015 • Ivan Vulić, Marie-Francine Moens
We propose a new model for learning bilingual word representations from non-parallel document-aligned data.