no code implementations • CL (ACL) 2020 • Ivan Vulić, Simon Baker, Edoardo Maria Ponti, Ulla Petti, Ira Leviant, Kelly Wing, Olga Majewska, Eden Bar, Matt Malone, Thierry Poibeau, Roi Reichart, Anna Korhonen
We introduce Multi-SimLex, a large-scale lexical resource and evaluation benchmark covering datasets for 12 typologically diverse languages, including major languages (e.g., Mandarin Chinese, Spanish, Russian) as well as less-resourced ones (e.g., Welsh, Kiswahili).
no code implementations • Findings (EMNLP) 2021 • Alan Ansell, Edoardo Maria Ponti, Jonas Pfeiffer, Sebastian Ruder, Goran Glavaš, Ivan Vulić, Anna Korhonen
While offering (1) improved fine-tuning efficiency (by a factor of around 50 in our experiments), (2) a smaller parameter budget, and (3) increased language coverage, MAD-G remains competitive with more expensive methods for language-specific adapter training across the board.
no code implementations • 19 Oct 2023 • Nico Daheim, Thomas Möllenhoff, Edoardo Maria Ponti, Iryna Gurevych, Mohammad Emtiyaz Khan
Models trained on different datasets can be merged by weighted averaging of their parameters, but why does this work, and when can it fail?
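The weighted-averaging idea above can be illustrated with a minimal sketch. All names here (`merge_parameters`, `params_a`, `weight_a`) are hypothetical and use plain floats instead of tensors; this is not the paper's implementation.

```python
def merge_parameters(params_a, params_b, weight_a=0.5):
    """Merge two same-architecture parameter dicts by weighted averaging.

    Each dict maps a parameter name to its value (a scalar here for
    clarity; in practice these would be weight tensors).
    """
    assert params_a.keys() == params_b.keys(), "models must share an architecture"
    weight_b = 1.0 - weight_a
    return {name: weight_a * params_a[name] + weight_b * params_b[name]
            for name in params_a}

# Toy example with scalar "parameters":
merged = merge_parameters({"w": 1.0, "b": 0.0}, {"w": 3.0, "b": 2.0}, weight_a=0.25)
# merged == {"w": 2.5, "b": 1.5}
```

The interpolation weight controls how much each source model contributes; the paper's question is precisely under which conditions such an average preserves the abilities of both models.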
1 code implementation • 2 Jun 2023 • Alan Ansell, Edoardo Maria Ponti, Anna Korhonen, Ivan Vulić
Specifically, we use a two-phase distillation approach, termed BiStil: (i) the first phase distils a general bilingual model from the MMT, while (ii) the second, task-specific phase sparsely fine-tunes the bilingual "student" model using a task-tuned variant of the original MMT as its "teacher".
no code implementations • 22 Feb 2023 • Jonas Pfeiffer, Sebastian Ruder, Ivan Vulić, Edoardo Maria Ponti
Modular deep learning has emerged as a promising solution to these challenges.
no code implementations • 30 Apr 2022 • Ivan Vulić, Goran Glavaš, Fangyu Liu, Nigel Collier, Edoardo Maria Ponti, Anna Korhonen
In this work, we probe SEs for the amount of cross-lingual lexical knowledge stored in their parameters, and compare them against the original multilingual LMs.
no code implementations • 31 Jan 2022 • Olga Majewska, Evgeniia Razumovskaia, Edoardo Maria Ponti, Ivan Vulić, Anna Korhonen
Through this process we annotate a new large-scale dataset for training and evaluation of multilingual and cross-lingual ToD systems.
2 code implementations • 27 Jan 2022 • Emanuele Bugliarello, Fangyu Liu, Jonas Pfeiffer, Siva Reddy, Desmond Elliott, Edoardo Maria Ponti, Ivan Vulić
Our benchmark enables the evaluation of multilingual multimodal models for transfer learning, not only in a zero-shot setting, but also in newly defined few-shot learning setups.
2 code implementations • ACL 2022 • Alan Ansell, Edoardo Maria Ponti, Anna Korhonen, Ivan Vulić
Both these masks can then be composed with the pretrained model.
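As a rough sketch of the composition step described above: a sparse fine-tuning can be stored as a dict of parameter deltas covering only the few parameters that changed, and composing two such fine-tunings (e.g., a language one and a task one) with the pretrained model amounts to applying both delta sets on top of the base weights. All names here are illustrative, not the paper's API.

```python
def compose(base_params, *sparse_deltas):
    """Apply one or more sparse fine-tunings (dicts of parameter deltas)
    on top of a pretrained model's parameters (scalars here for clarity)."""
    params = dict(base_params)
    for deltas in sparse_deltas:
        for name, delta in deltas.items():
            params[name] = params[name] + delta
    return params

base = {"w1": 1.0, "w2": -2.0, "w3": 0.5}
language_delta = {"w1": 0.1}   # sparse: touches only one parameter
task_delta = {"w3": -0.5}      # sparse: touches a different parameter
adapted = compose(base, language_delta, task_delta)
# adapted == {"w1": 1.1, "w2": -2.0, "w3": 0.0}
```

Because each delta set is sparse, the two fine-tunings mostly touch disjoint parameters, which is what makes this kind of composition cheap and modular.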
2 code implementations • EMNLP 2021 • Fangyu Liu, Emanuele Bugliarello, Edoardo Maria Ponti, Siva Reddy, Nigel Collier, Desmond Elliott
The design of widespread vision-and-language datasets and pre-trained encoders directly adopts, or draws inspiration from, the concepts and images of ImageNet.
Ranked #1 on Zero-Shot Cross-Lingual Transfer on MaRVL
no code implementations • IJCNLP 2019 • Edoardo Maria Ponti, Ivan Vulić, Ryan Cotterell, Roi Reichart, Anna Korhonen
Motivated by this question, we aim at constructing an informative prior over neural weights, in order to adapt quickly to held-out languages in the task of character-level language modeling.
no code implementations • ACL 2021 • Ivan Vulić, Edoardo Maria Ponti, Anna Korhonen, Goran Glavaš
Inspired by prior work on semantic specialization of static word embedding (WE) models, we show that it is possible to expose and enrich lexical knowledge from the LMs, that is, to specialize them to serve as effective and universal "decontextualized" word encoders even when fed input words "in isolation" (i.e., without any context).
1 code implementation • 23 Jul 2021 • Edoardo Maria Ponti, Julia Kreutzer, Ivan Vulić, Siva Reddy
To remedy this, we propose a new technique that integrates both steps of the traditional pipeline (translation and classification) into a single model, by treating the intermediate translations as a latent random variable.
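The latent-variable formulation above can be sketched formally (the notation here is illustrative, not necessarily the paper's): treating the translation z of a source sentence x as a latent random variable means the end-to-end model marginalizes over possible translations:

```latex
p(y \mid x) \;=\; \sum_{z} p_{\text{cls}}(y \mid z)\, p_{\text{mt}}(z \mid x)
```

where p_mt is the translation model, p_cls the downstream classifier, and y the label; in practice the sum over translations is intractable and is approximated, e.g., by sampling a few candidate translations.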
1 code implementation • 2 Jun 2021 • Edoardo Maria Ponti, Rahul Aralikatte, Disha Shrivastava, Siva Reddy, Anders Søgaard
In fact, under a decision-theoretic framework, MAML can be interpreted as minimising the expected risk across training languages (with a uniform prior), which is known as the Bayes criterion.
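The Bayes-criterion reading above can be stated compactly (notation illustrative): with training languages ℓ ∈ {1, …, L} and a uniform prior p(ℓ) = 1/L, minimising the expected risk amounts to

```latex
\theta^{\star}
  = \arg\min_{\theta}\; \mathbb{E}_{\ell \sim p(\ell)}\!\big[\mathcal{L}_{\ell}(\theta)\big]
  = \arg\min_{\theta}\; \frac{1}{L} \sum_{\ell=1}^{L} \mathcal{L}_{\ell}(\theta),
```

where 𝓛_ℓ is the training loss on language ℓ. Replacing the uniform prior with a non-uniform one over languages is what motivates the alternative criteria explored in such work.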
1 code implementation • 10 Feb 2021 • Shijie Wu, Edoardo Maria Ponti, Ryan Cotterell
As the main contribution of our work, we implement the phonological generative system as a neural model differentiable end-to-end, rather than as a set of rules or constraints.
no code implementations • EMNLP 2020 • Ivan Vulić, Edoardo Maria Ponti, Robert Litschko, Goran Glavaš, Anna Korhonen
The success of large pretrained language models (LMs) such as BERT and RoBERTa has sparked interest in probing their representations, in order to unveil what types of knowledge they implicitly capture.
1 code implementation • EMNLP 2020 • Edoardo Maria Ponti, Goran Glavaš, Olga Majewska, Qianchu Liu, Ivan Vulić, Anna Korhonen
In order to simulate human language capacity, natural language processing systems must be able to reason about the dynamics of everyday situations, including their possible causes and effects.
Ranked #3 on Cross-Lingual Transfer on XCOPA (using extra training data)
no code implementations • Findings of the Association for Computational Linguistics 2020 • Diana Rodríguez Luna, Edoardo Maria Ponti, Dieuwke Hupkes, Elia Bruni
In previous work, artificial agents were shown to achieve almost perfect accuracy in referential games where they have to communicate to identify images.
no code implementations • IJCNLP 2019 • Edoardo Maria Ponti, Ivan Vulić, Goran Glavaš, Roi Reichart, Anna Korhonen
Semantic specialization integrates structured linguistic knowledge from external resources (such as lexical relations in WordNet) into pretrained distributional vectors in the form of constraints.
1 code implementation • COLING 2020 • Anne Lauscher, Ivan Vulić, Edoardo Maria Ponti, Anna Korhonen, Goran Glavaš
In this work, we complement such distributional knowledge with external lexical knowledge, that is, we integrate the discrete knowledge on word-level semantic similarity into pretraining.
no code implementations • WS 2019 • Aishwarya Kamath, Jonas Pfeiffer, Edoardo Maria Ponti, Goran Glavaš, Ivan Vulić
Semantic specialization methods fine-tune distributional word vectors using lexical knowledge from external resources (e.g., WordNet) to accentuate a particular relation between words.
no code implementations • EMNLP 2018 • Daniela Gerz, Ivan Vulić, Edoardo Maria Ponti, Roi Reichart, Anna Korhonen
A key challenge in cross-lingual NLP is developing general language-independent architectures that are equally applicable to any language.
1 code implementation • EMNLP 2018 • Edoardo Maria Ponti, Ivan Vulić, Goran Glavaš, Nikola Mrkšić, Anna Korhonen
Our adversarial post-specialization method propagates the external lexical knowledge to the full distributional space.
no code implementations • CL 2019 • Edoardo Maria Ponti, Helen O'Horan, Yevgeni Berzak, Ivan Vulić, Roi Reichart, Thierry Poibeau, Ekaterina Shutova, Anna Korhonen
Linguistic typology aims to capture structural and semantic variation across the world's languages.
no code implementations • ACL 2018 • Edoardo Maria Ponti, Roi Reichart, Anna Korhonen, Ivan Vulić
The transfer or sharing of knowledge between languages is a potential solution to resource scarcity in NLP.
no code implementations • SEMEVAL 2017 • Edoardo Maria Ponti, Ivan Vulić, Anna Korhonen
Distributed representations of sentences have recently been developed to represent their meaning as real-valued vectors.
no code implementations • WS 2017 • Edoardo Maria Ponti, Anna Korhonen
Causal relations play a key role in information extraction and reasoning.
no code implementations • 3 Oct 2016 • Edoardo Maria Ponti, Elisabetta Jezek, Bernardo Magnini
Lexical sets contain the words filling an argument slot of a verb, and are in part determined by selectional preferences.
no code implementations • LREC 2016 • Edoardo Maria Ponti, Marco Passarotti
The Index Thomisticus Treebank is the largest available treebank for Latin; it contains Medieval Latin texts by Thomas Aquinas.