no code implementations • EAMT 2022 • Gabriele Sarti, Arianna Bisazza
Neural machine translation (NMT) systems are nowadays essential components of professional translation workflows.
no code implementations • WMT (EMNLP) 2020 • Prajit Dhar, Arianna Bisazza, Gertjan van Noord
This paper describes our submission for the English-Tamil news translation task of WMT-2020.
1 code implementation • NAACL (PrivateNLP) 2021 • Sohyung Kim, Arianna Bisazza, Fatih Turkmen
We study the problem of domain adaptation in Neural Machine Translation (NMT) when domain-specific data cannot be shared due to confidentiality or copyright issues.
no code implementations • CL (ACL) 2022 • Ahmet Üstün, Arianna Bisazza, Gosse Bouma, Gertjan van Noord
To address this, we propose a novel language adaptation approach by introducing contextual language adapters to a multilingual parser.
no code implementations • LREC 2022 • Prajit Dhar, Arianna Bisazza, Gertjan van Noord
We conduct our evaluation on four typologically diverse target morphologically rich languages (MRLs), and find that PT-Inflect surpasses NMT systems trained only on parallel data.
no code implementations • ACL (WAT) 2021 • Prajit Dhar, Arianna Bisazza, Gertjan van Noord
Dravidian languages, such as Kannada and Tamil, are notoriously difficult for state-of-the-art neural models to translate.
1 code implementation • ACL (GEM) 2021 • Chunliu Wang, Rik van Noord, Arianna Bisazza, Johan Bos
We present an end-to-end neural approach to generate English sentences from formal meaning representations, Discourse Representation Structures (DRSs).
no code implementations • 25 Mar 2024 • Gaofei Shen, Michaela Watkins, Afra Alishahi, Arianna Bisazza, Grzegorz Chrupała
Interpretability research has shown that self-supervised Spoken Language Models (SLMs) encode a wide variety of features of human speech, ranging from the acoustic, phonetic, phonological, syntactic, and semantic levels to speaker characteristics.
1 code implementation • 16 Oct 2023 • Jirui Qi, Raquel Fernández, Arianna Bisazza
Finally, we conduct a case study on CLC when new factual associations are inserted into the PLMs via model editing.
2 code implementations • 2 Oct 2023 • Gabriele Sarti, Grzegorz Chrupała, Malvina Nissim, Arianna Bisazza
Establishing whether language models can use contextual information in a human-plausible way is important to ensure their trustworthiness in real-world settings.
1 code implementation • 30 May 2023 • Gaofei Shen, Afra Alishahi, Arianna Bisazza, Grzegorz Chrupała
Understanding which information is encoded in deep models of spoken and written language has been the focus of much research in recent years, as it is crucial for debugging and improving these architectures.
1 code implementation • 28 Feb 2023 • Lukas Edman, Gabriele Sarti, Antonio Toral, Gertjan van Noord, Arianna Bisazza
Pretrained character-level and byte-level language models have been shown to be competitive with popular subword models across a range of Natural Language Processing (NLP) tasks.
2 code implementations • 27 Feb 2023 • Gabriele Sarti, Nils Feldhus, Ludwig Sickert, Oskar van der Wal, Malvina Nissim, Arianna Bisazza
Past work in natural language processing interpretability focused mainly on popular classification tasks while largely overlooking generation settings, partly due to a lack of dedicated tools.
1 code implementation • 30 Jan 2023 • Yuchen Lian, Arianna Bisazza, Tessa Verhoef
Artificial learners often behave differently from human learners in the context of neural agent-based simulations of language emergence and change.
1 code implementation • 24 May 2022 • Ahmet Üstün, Arianna Bisazza, Gosse Bouma, Gertjan van Noord, Sebastian Ruder
Massively multilingual models are promising for transfer learning across tasks and languages.
1 code implementation • 24 May 2022 • Gabriele Sarti, Arianna Bisazza, Ana Guerberof Arenas, Antonio Toral
We publicly release the complete dataset including all collected behavioral data, to foster new research on the translation capabilities of NMT systems for typologically diverse languages.
1 code implementation • ACL 2021 • Chunliu Wang, Rik van Noord, Arianna Bisazza, Johan Bos
Even with DRSs based on English, we obtain good results for Chinese.
2 code implementations • 13 Jul 2021 • Arianna Bisazza, Ahmet Üstün, Stephan Sportel
Identifying factors that make certain languages harder to model than others is essential to reach language equality in future Natural Language Processing technologies.
no code implementations • EMNLP 2021 • Yuchen Lian, Arianna Bisazza, Tessa Verhoef
Natural languages display a trade-off among different strategies to convey syntactic structure, such as word order or inflection.
1 code implementation • EMNLP 2020 • Ahmet Üstün, Arianna Bisazza, Gosse Bouma, Gertjan van Noord
The resulting parser, UDapter, outperforms strong monolingual and multilingual baselines on the majority of both high-resource and low-resource (zero-shot) languages, showing the success of the proposed adaptation approach.
no code implementations • NoDaLiDa 2021 • Prajit Dhar, Arianna Bisazza
It is now established that modern neural language models can be successfully trained on multiple languages simultaneously without changes to the underlying architecture.
2 code implementations • 19 Dec 2019 • Wietse de Vries, Andreas van Cranenburgh, Arianna Bisazza, Tommaso Caselli, Gertjan van Noord, Malvina Nissim
The transformer-based pre-trained language model BERT has helped to improve state-of-the-art performance on many natural language processing (NLP) tasks.
no code implementations • WS 2019 • Ke Tran, Arianna Bisazza
We investigate whether off-the-shelf deep bidirectional sentence representations trained on a massively multilingual corpus (multilingual BERT) enable the development of an unsupervised universal dependency parser.
no code implementations • WS 2018 • Prajit Dhar, Arianna Bisazza
Recent work has shown that neural models can be successfully trained on multiple languages simultaneously.
no code implementations • EMNLP 2018 • Arianna Bisazza, Clara Tump
Neural sequence-to-sequence models have proven very effective for machine translation, but at the expense of model interpretability.
1 code implementation • EMNLP 2018 • Ke Tran, Arianna Bisazza, Christof Monz
Recent work has shown that recurrent neural networks (RNNs) can implicitly capture and exploit hierarchical information when trained to solve common natural language processing tasks such as language modeling (Linzen et al., 2016) and neural machine translation (Shi et al., 2016).
1 code implementation • LREC 2018 • Marzieh Fadaee, Arianna Bisazza, Christof Monz
Neural Machine Translation (NMT) has been widely used in recent years with significant improvements for many language pairs.
1 code implementation • EMNLP 2017 • Marlies van der Wees, Arianna Bisazza, Christof Monz
Intelligent selection of training data has proven a successful technique to simultaneously increase training efficiency and translation performance for phrase-based machine translation (PBMT).
1 code implementation • ACL 2017 • Marzieh Fadaee, Arianna Bisazza, Christof Monz
The quality of a Neural Machine Translation system depends substantially on the availability of sizable parallel corpora.
1 code implementation • ACL 2017 • Marzieh Fadaee, Arianna Bisazza, Christof Monz
Distributed word representations are widely used for modeling words in NLP tasks.
no code implementations • COLING 2016 • Marlies van der Wees, Arianna Bisazza, Christof Monz
Finally, we find that male speakers are harder to translate and use more vulgar language than female speakers, and that vulgarity is often not preserved during translation.
no code implementations • WS 2016 • Marlies van der Wees, Arianna Bisazza, Christof Monz
A major challenge for statistical machine translation (SMT) of Arabic-to-English user-generated text is the prevalence of text written in Arabizi, or Romanized Arabic.
no code implementations • EMNLP 2016 • Luisa Bentivogli, Arianna Bisazza, Mauro Cettolo, Marcello Federico
Within the field of Statistical Machine Translation (SMT), the neural approach (NMT) has recently emerged as the first technology able to challenge the long-standing dominance of phrase-based approaches (PBMT).
2 code implementations • NAACL 2016 • Ke Tran, Arianna Bisazza, Christof Monz
In this paper, we propose the Recurrent Memory Network (RMN), a novel RNN architecture that not only amplifies the power of RNNs but also facilitates our understanding of their internal functioning, allowing us to discover underlying patterns in data.
no code implementations • 17 Feb 2015 • Arianna Bisazza, Marcello Federico
Word reordering is one of the most difficult aspects of statistical machine translation (SMT), and an important factor in its quality and efficiency.
no code implementations • TACL 2013 • Arianna Bisazza, Marcello Federico
Defining the reordering search space is a crucial issue in phrase-based SMT between distant languages.