no code implementations • CL (ACL) 2021 • Olga Majewska, Diana McCarthy, Jasper J. F. van den Bosch, Nikolaus Kriegeskorte, Ivan Vulić, Anna Korhonen
We demonstrate how the resultant data set can be used for fine-grained analyses and evaluation of representation learning models on the intrinsic tasks of semantic clustering and semantic similarity.
no code implementations • ACL 2022 • Evgeniia Razumovskaia, Goran Glavaš, Olga Majewska, Edoardo Ponti, Ivan Vulić
In this tutorial, we will thus discuss and demonstrate the importance of (building) multilingual ToD systems, and then provide a systematic overview of current research gaps, challenges and initiatives related to multilingual ToD systems, with a particular focus on their connections to current research and challenges in multilingual and low-resource NLP.
no code implementations • CL (ACL) 2020 • Ivan Vulić, Simon Baker, Edoardo Maria Ponti, Ulla Petti, Ira Leviant, Kelly Wing, Olga Majewska, Eden Bar, Matt Malone, Thierry Poibeau, Roi Reichart, Anna Korhonen
We introduce Multi-SimLex, a large-scale lexical resource and evaluation benchmark covering data sets for 12 typologically diverse languages, including major languages (e.g., Mandarin Chinese, Spanish, Russian) as well as less-resourced ones (e.g., Welsh, Kiswahili).
no code implementations • 31 Jan 2022 • Olga Majewska, Evgeniia Razumovskaia, Edoardo Maria Ponti, Ivan Vulić, Anna Korhonen
Through this process we annotate a new large-scale dataset for training and evaluation of multilingual and cross-lingual ToD systems.
no code implementations • 17 Apr 2021 • Evgeniia Razumovskaia, Goran Glavaš, Olga Majewska, Edoardo M. Ponti, Anna Korhonen, Ivan Vulić
We find that the most critical factor preventing the creation of truly multilingual ToD systems is the lack of datasets in most languages for both training and evaluation.
no code implementations • ACL 2021 • Olga Majewska, Ivan Vulić, Goran Glavaš, Edoardo M. Ponti, Anna Korhonen
We investigate whether injecting explicit information on verbs' semantic-syntactic behaviour improves the performance of LM-pretrained Transformers in event extraction tasks, downstream tasks for which accurate verb processing is paramount.
1 code implementation • COLING 2020 • Olga Majewska, Ivan Vulić, Diana McCarthy, Anna Korhonen
We present the first evaluation of the applicability of a spatial arrangement method (SpAM) to a typologically diverse language sample, and its potential to produce semantic evaluation resources to support multilingual NLP, with a focus on verb semantics.
1 code implementation • EMNLP (DeeLIO) 2020 • Anne Lauscher, Olga Majewska, Leonardo F. R. Ribeiro, Iryna Gurevych, Nikolai Rozanov, Goran Glavaš
Following the major success of neural language models (LMs) such as BERT or GPT-2 on a variety of language understanding tasks, recent work has focused on injecting (structured) knowledge from external resources into these models.
1 code implementation • EMNLP 2020 • Edoardo Maria Ponti, Goran Glavaš, Olga Majewska, Qianchu Liu, Ivan Vulić, Anna Korhonen
In order to simulate human language capacity, natural language processing systems must be able to reason about the dynamics of everyday situations, including their possible causes and effects.
Ranked #3 on Cross-Lingual Transfer on XCOPA (using extra training data)
no code implementations • LREC 2020 • Olga Majewska, Diana McCarthy, Jasper van den Bosch, Nikolaus Kriegeskorte, Ivan Vulić, Anna Korhonen
We present a novel methodology for fast bottom-up creation of large-scale semantic similarity resources to support development and evaluation of NLP systems.
no code implementations • 10 Mar 2020 • Ivan Vulić, Simon Baker, Edoardo Maria Ponti, Ulla Petti, Ira Leviant, Kelly Wing, Olga Majewska, Eden Bar, Matt Malone, Thierry Poibeau, Roi Reichart, Anna Korhonen
We introduce Multi-SimLex, a large-scale lexical resource and evaluation benchmark covering datasets for 12 typologically diverse languages, including major languages (e.g., Mandarin Chinese, Spanish, Russian) as well as less-resourced ones (e.g., Welsh, Kiswahili).