no code implementations • EAMT 2022 • Gabriele Sarti, Arianna Bisazza
Neural machine translation (NMT) systems are nowadays essential components of professional translation workflows.
no code implementations • WMT (EMNLP) 2020 • Prajit Dhar, Arianna Bisazza, Gertjan van Noord
This paper describes our submission for the English-Tamil news translation task of WMT-2020.
1 code implementation • NAACL (PrivateNLP) 2021 • Sohyung Kim, Arianna Bisazza, Fatih Turkmen
We study the problem of domain adaptation in Neural Machine Translation (NMT) when domain-specific data cannot be shared due to confidentiality or copyright issues.
no code implementations • CL (ACL) 2022 • Ahmet Üstün, Arianna Bisazza, Gosse Bouma, Gertjan van Noord
To address this, we propose a novel language adaptation approach by introducing contextual language adapters to a multilingual parser.
no code implementations • LREC 2022 • Prajit Dhar, Arianna Bisazza, Gertjan van Noord
We conduct our evaluation on four typologically diverse target morphologically rich languages (MRLs), and find that PT-Inflect surpasses NMT systems trained only on parallel data.
no code implementations • ACL (WAT) 2021 • Prajit Dhar, Arianna Bisazza, Gertjan van Noord
Dravidian languages, such as Kannada and Tamil, are notoriously difficult for state-of-the-art neural models to translate.
1 code implementation • ACL (GEM) 2021 • Chunliu Wang, Rik van Noord, Arianna Bisazza, Johan Bos
We present an end-to-end neural approach to generate English sentences from formal meaning representations, Discourse Representation Structures (DRSs).
no code implementations • 25 Mar 2024 • Gaofei Shen, Michaela Watkins, Afra Alishahi, Arianna Bisazza, Grzegorz Chrupała
Interpretability research has shown that self-supervised Spoken Language Models (SLMs) encode a wide variety of features of human speech, ranging from the acoustic, phonetic, phonological, syntactic, and semantic levels to speaker characteristics.
1 code implementation • 16 Oct 2023 • Jirui Qi, Raquel Fernández, Arianna Bisazza
Finally, we conduct a case study on CLC when new factual associations are inserted into the PLMs via model editing.
2 code implementations • 2 Oct 2023 • Gabriele Sarti, Grzegorz Chrupała, Malvina Nissim, Arianna Bisazza
Establishing whether language models can use contextual information in a human-plausible way is important to ensure their trustworthiness in real-world settings.
1 code implementation • 30 May 2023 • Gaofei Shen, Afra Alishahi, Arianna Bisazza, Grzegorz Chrupała
Understanding which information is encoded in deep models of spoken and written language has been the focus of much research in recent years, as it is crucial for debugging and improving these architectures.
1 code implementation • 28 Feb 2023 • Lukas Edman, Gabriele Sarti, Antonio Toral, Gertjan van Noord, Arianna Bisazza
Pretrained character-level and byte-level language models have been shown to be competitive with popular subword models across a range of Natural Language Processing (NLP) tasks.
2 code implementations • 27 Feb 2023 • Gabriele Sarti, Nils Feldhus, Ludwig Sickert, Oskar van der Wal, Malvina Nissim, Arianna Bisazza
Past work in natural language processing interpretability focused mainly on popular classification tasks while largely overlooking generation settings, partly due to a lack of dedicated tools.
1 code implementation • 30 Jan 2023 • Yuchen Lian, Arianna Bisazza, Tessa Verhoef
Artificial learners often behave differently from human learners in the context of neural agent-based simulations of language emergence and change.
1 code implementation • 24 May 2022 • Ahmet Üstün, Arianna Bisazza, Gosse Bouma, Gertjan van Noord, Sebastian Ruder
Massively multilingual models are promising for transfer learning across tasks and languages.
1 code implementation • 24 May 2022 • Gabriele Sarti, Arianna Bisazza, Ana Guerberof Arenas, Antonio Toral
We publicly release the complete dataset including all collected behavioral data, to foster new research on the translation capabilities of NMT systems for typologically diverse languages.
1 code implementation • ACL 2021 • Chunliu Wang, Rik van Noord, Arianna Bisazza, Johan Bos
Even with DRSs based on English, we obtain good results for Chinese.
2 code implementations • 13 Jul 2021 • Arianna Bisazza, Ahmet Üstün, Stephan Sportel
Identifying factors that make certain languages harder to model than others is essential to reach language equality in future Natural Language Processing technologies.
no code implementations • EMNLP 2021 • Yuchen Lian, Arianna Bisazza, Tessa Verhoef
Natural languages display a trade-off among different strategies to convey syntactic structure, such as word order or inflection.
1 code implementation • EMNLP 2020 • Ahmet Üstün, Arianna Bisazza, Gosse Bouma, Gertjan van Noord
The resulting parser, UDapter, outperforms strong monolingual and multilingual baselines on the majority of both high-resource and low-resource (zero-shot) languages, showing the success of the proposed adaptation approach.
no code implementations • NoDaLiDa 2021 • Prajit Dhar, Arianna Bisazza
It is now established that modern neural language models can be successfully trained on multiple languages simultaneously without changes to the underlying architecture.
2 code implementations • 19 Dec 2019 • Wietse de Vries, Andreas van Cranenburgh, Arianna Bisazza, Tommaso Caselli, Gertjan van Noord, Malvina Nissim
The transformer-based pre-trained language model BERT has helped to improve state-of-the-art performance on many natural language processing (NLP) tasks.
no code implementations • WS 2019 • Ke Tran, Arianna Bisazza
We investigate whether off-the-shelf deep bidirectional sentence representations trained on a massively multilingual corpus (multilingual BERT) enable the development of an unsupervised universal dependency parser.
no code implementations • WS 2018 • Prajit Dhar, Arianna Bisazza
Recent work has shown that neural models can be successfully trained on multiple languages simultaneously.
no code implementations • EMNLP 2018 • Arianna Bisazza, Clara Tump
Neural sequence-to-sequence models have proven very effective for machine translation, but at the expense of model interpretability.
1 code implementation • EMNLP 2018 • Ke Tran, Arianna Bisazza, Christof Monz
Recent work has shown that recurrent neural networks (RNNs) can implicitly capture and exploit hierarchical information when trained to solve common natural language processing tasks such as language modeling (Linzen et al., 2016) and neural machine translation (Shi et al., 2016).
1 code implementation • LREC 2018 • Marzieh Fadaee, Arianna Bisazza, Christof Monz
Neural Machine Translation (NMT) has been widely used in recent years with significant improvements for many language pairs.
1 code implementation • EMNLP 2017 • Marlies van der Wees, Arianna Bisazza, Christof Monz
Intelligent selection of training data has proven a successful technique to simultaneously increase training efficiency and translation performance for phrase-based machine translation (PBMT).
1 code implementation • ACL 2017 • Marzieh Fadaee, Arianna Bisazza, Christof Monz
The quality of a Neural Machine Translation system depends substantially on the availability of sizable parallel corpora.
1 code implementation • ACL 2017 • Marzieh Fadaee, Arianna Bisazza, Christof Monz
Distributed word representations are widely used for modeling words in NLP tasks.
no code implementations • COLING 2016 • Marlies van der Wees, Arianna Bisazza, Christof Monz
Finally, we find that male speakers are harder to translate and use more vulgar language than female speakers, and that vulgarity is often not preserved during translation.
no code implementations • WS 2016 • Marlies van der Wees, Arianna Bisazza, Christof Monz
A major challenge for statistical machine translation (SMT) of Arabic-to-English user-generated text is the prevalence of text written in Arabizi, or Romanized Arabic.
no code implementations • EMNLP 2016 • Luisa Bentivogli, Arianna Bisazza, Mauro Cettolo, Marcello Federico
Within the field of Statistical Machine Translation (SMT), the neural approach (NMT) has recently emerged as the first technology able to challenge the long-standing dominance of phrase-based approaches (PBMT).
2 code implementations • NAACL 2016 • Ke Tran, Arianna Bisazza, Christof Monz
In this paper, we propose the Recurrent Memory Network (RMN), a novel RNN architecture that not only amplifies the power of RNNs but also facilitates our understanding of their internal functioning, allowing us to discover underlying patterns in data.
no code implementations • 17 Feb 2015 • Arianna Bisazza, Marcello Federico
Word reordering is one of the most difficult aspects of statistical machine translation (SMT), and an important factor in its quality and efficiency.
no code implementations • TACL 2013 • Arianna Bisazza, Marcello Federico
Defining the reordering search space is a crucial issue in phrase-based SMT between distant languages.