no code implementations • NAACL 2022 • Niccolò Campolungo, Tommaso Pasini, Denis Emelin, Roberto Navigli
Recent studies have shed light on a common pitfall of Neural Machine Translation (NMT) models: when disambiguating polysemous words, they tend to lapse into the senses that occur most frequently in the training corpus. In this paper, we first provide a novel approach for automatically creating high-precision sense-annotated parallel corpora, and then put forward a specifically tailored fine-tuning strategy for exploiting these sense annotations during training without introducing any additional requirement at inference time. The use of explicit senses proved beneficial in reducing the disambiguation bias of a baseline NMT model while, at the same time, leading our system to attain higher BLEU scores than its vanilla counterpart across 3 language pairs.
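A minimal sketch of how sense annotations could be injected into the source side of a parallel corpus before fine-tuning. The inline tag format, the `@@` separator, and the example sense labels are illustrative assumptions, not the annotation scheme or fine-tuning strategy described in the paper.

```python
# Hypothetical illustration: attach explicit sense labels to ambiguous source
# tokens of a parallel corpus before fine-tuning an NMT model. At inference
# time, plain (untagged) sources can still be fed in, so no extra requirement
# is introduced. Tag format and sense inventory are assumptions.
from typing import Dict, List, Tuple

def annotate_source(tokens: List[str], senses: Dict[int, str]) -> List[str]:
    """Append an explicit sense label to each annotated (polysemous) token."""
    return [
        f"{tok}@@{senses[i]}" if i in senses else tok  # e.g. "bank@@bank.n.01"
        for i, tok in enumerate(tokens)
    ]

def build_finetuning_pair(src: List[str],
                          tgt: List[str],
                          senses: Dict[int, str]) -> Tuple[str, str]:
    """Pair a sense-tagged source sentence with the untouched target sentence."""
    return " ".join(annotate_source(src, senses)), " ".join(tgt)

if __name__ == "__main__":
    src = ["She", "sat", "by", "the", "bank", "of", "the", "river"]
    tgt = ["Si", "sedette", "sulla", "riva", "del", "fiume"]
    print(build_finetuning_pair(src, tgt, {4: "bank.n.01"}))
```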
1 code implementation • EMNLP 2021 • Rexhina Blloshmi, Tommaso Pasini, Niccolò Campolungo, Somnath Banerjee, Roberto Navigli, Gabriella Pasi
With the advent of contextualized embeddings, attention towards neural ranking approaches for Information Retrieval increased considerably.
no code implementations • EMNLP 2020 • Bianca Scarlini, Tommaso Pasini, Roberto Navigli
Contextualized word embeddings have been employed effectively across several tasks in Natural Language Processing, as they have been shown to carry useful semantic information.
Ranked #12 on Word Sense Disambiguation (Supervised)
no code implementations • 8 May 2024 • Tommaso Pasini, Alejo López-Ávila, Husam Quteineh, Gerasimos Lampouras, Jinhua Du, Yubing Wang, Ze Li, Yusen Sun
We propose a novel fine-tuning approach that prepends the rhyming word at the start of each lyric. This allows the critical rhyming decision to be made before the model commits to the content of the lyric (as in reverse language modeling), while maintaining compatibility with the word order of regular PLMs, since the lyric itself is still generated in left-to-right order.
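A minimal sketch of the preprocessing step described above, assuming the rhyming (final) word is simply copied to the front of each lyric line; the `<rhyme>` separator token is an assumption for illustration, not necessarily the format used in the paper.

```python
# Hypothetical data transformation: move a copy of each lyric line's last
# (rhyming) word to the front as a rhyme cue, while the lyric itself stays
# in left-to-right order for standard PLM fine-tuning.
def prepend_rhyme(lyric_line: str, separator: str = "<rhyme>") -> str:
    """Prepend the line's final word as an explicit rhyme cue."""
    words = lyric_line.strip().split()
    if not words:
        return lyric_line
    return f"{words[-1]} {separator} {' '.join(words)}"

if __name__ == "__main__":
    print(prepend_rhyme("and the shadows start to fall"))
    # -> "fall <rhyme> and the shadows start to fall"
```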
1 code implementation • ACL 2022 • Ilias Chalkidis, Tommaso Pasini, Sheng Zhang, Letizia Tomada, Sebastian Felix Schwemer, Anders Søgaard
We present a benchmark suite of four datasets for evaluating the fairness of pre-trained language models and the techniques used to fine-tune them for downstream tasks.
no code implementations • NAACL 2021 • Iacer Calixto, Alessandro Raganato, Tommaso Pasini
Further adding extra languages led to improvements in most tasks up to a certain point, but overall we found it non-trivial to scale improvements in model transferability by training on an ever-increasing number of Wikipedia languages.
1 code implementation • NAACL 2021 • Edoardo Barba, Tommaso Pasini, Roberto Navigli
By means of an extensive array of experiments, we show that ESC unleashes the full potential of our model, leading it to outdo all of its competitors and to set a new state of the art on the English WSD task.
Ranked #5 on Word Sense Disambiguation (Supervised)
1 code implementation • EMNLP 2020 • Alessandro Raganato, Tommaso Pasini, Jose Camacho-Collados, Mohammad Taher Pilehvar
The ability to correctly model distinct meanings of a word is crucial for the effectiveness of semantic representation techniques.
no code implementations • ACL 2020 • Tommaso Pasini, Federico Scozzafava, Bianca Scarlini
Knowing the Most Frequent Sense (MFS) of a word has been shown to help Word Sense Disambiguation (WSD) models significantly.
no code implementations • LREC 2020 • Bianca Scarlini, Tommaso Pasini, Roberto Navigli
This limits the reach of deep-learning approaches, which today underpin virtually every NLP task and are hungry for data.
no code implementations • ACL 2019 • Bianca Scarlini, Tommaso Pasini, Roberto Navigli
The well-known problem of knowledge acquisition is one of the biggest issues in Word Sense Disambiguation (WSD), where annotated data are still scarce in English and almost absent in other languages.
no code implementations • SEMEVAL 2018 • Jose Camacho-Collados, Claudio Delli Bovi, Luis Espinosa-Anke, Sergio Oramas, Tommaso Pasini, Enrico Santus, Vered Shwartz, Roberto Navigli, Horacio Saggion
This paper describes the SemEval 2018 Shared Task on Hypernym Discovery.
no code implementations • 12 May 2018 • Tommaso Pasini, Francesco Maria Elia, Roberto Navigli
We release to the community six large-scale sense-annotated datasets in multiple languages to pave the way for supervised multilingual Word Sense Disambiguation.
no code implementations • LREC 2020 • Tommaso Pasini, Jose Camacho-Collados
Large sense-annotated datasets are increasingly necessary for training deep supervised systems in Word Sense Disambiguation.
no code implementations • EMNLP 2017 • Tommaso Pasini, Roberto Navigli
Annotating large numbers of sentences with senses is the most labor-intensive requirement of current Word Sense Disambiguation systems.