Search Results for author: Antonio Toral

Found 68 papers, 26 papers with code

Using Wordnet to Improve Reordering in Hierarchical Phrase-Based Statistical Machine Translation

no code implementations GWC 2016 Arefeh Kazemi, Antonio Toral, Andy Way

We propose the use of WordNet synsets in a syntax-based reordering model for hierarchical statistical machine translation (HPB-SMT) to enable the model to generalize to phrases not seen in the training data but that have equivalent meaning.

Machine Translation Translation

Data Selection for Unsupervised Translation of German–Upper Sorbian

no code implementations WMT (EMNLP) 2020 Lukas Edman, Antonio Toral, Gertjan van Noord

This paper describes the methods behind the systems submitted by the University of Groningen for the WMT 2020 Unsupervised Machine Translation task for German–Upper Sorbian.

Translation Unsupervised Machine Translation

CREAMT: Creativity and narrative engagement of literary texts translated by translators and NMT

no code implementations EAMT 2022 Ana Guerberof Arenas, Antonio Toral

We present here the EU-funded project CREAMT that seeks to understand what is meant by creativity in different translation modalities, e.g. machine translation, post-editing or professional translation.

Machine Translation NMT +1

MaCoCu: Massive collection and curation of monolingual and bilingual data: focus on under-resourced languages

no code implementations EAMT 2022 Marta Bañón, Miquel Esplà-Gomis, Mikel L. Forcada, Cristian García-Romero, Taja Kuzman, Nikola Ljubešić, Rik van Noord, Leopoldo Pla Sempere, Gema Ramírez-Sánchez, Peter Rupnik, Vít Suchomel, Antonio Toral, Tobias van der Werff, Jaume Zaragoza

We introduce the project “MaCoCu: Massive collection and curation of monolingual and bilingual data: focus on under-resourced languages”, funded by the Connecting Europe Facility, which is aimed at building monolingual and parallel corpora for under-resourced European languages.

Low-Resource Unsupervised NMT: Diagnosing the Problem and Providing a Linguistically Motivated Solution

1 code implementation EAMT 2020 Lukas Edman, Antonio Toral, Gertjan van Noord

Unsupervised Machine Translation has been advancing our ability to translate without parallel data, but state-of-the-art methods assume an abundance of monolingual data.

NMT Translation +2

Unsupervised Translation of German–Lower Sorbian: Exploring Training and Novel Transfer Methods on a Low-Resource Language

no code implementations WMT (EMNLP) 2021 Lukas Edman, Ahmet Üstün, Antonio Toral, Gertjan van Noord

This paper describes the methods behind the systems submitted by the University of Groningen for the WMT 2021 Unsupervised Machine Translation task for German–Lower Sorbian (DE–DSB): a high-resource language to a low-resource one.

Decoder Translation +1

Multi-perspective Alignment for Increasing Naturalness in Neural Machine Translation

no code implementations 11 Dec 2024 Huiyuan Lai, Esther Ploeger, Rik van Noord, Antonio Toral

Neural machine translation (NMT) systems amplify lexical biases present in their training data, leading to artificially impoverished language in output translations.

Towards Tailored Recovery of Lexical Diversity in Literary Machine Translation

no code implementations 30 Aug 2024 Esther Ploeger, Huiyuan Lai, Rik van Noord, Antonio Toral

Thus, rather than aiming for the rigid increase of lexical diversity, we reframe the task as recovering what is lost in the machine translation process.

Diversity Machine Translation +1

Multilingual Multi-Figurative Language Detection

1 code implementation 31 May 2023 Huiyuan Lai, Antonio Toral, Malvina Nissim

Figures of speech help people express abstract concepts and evoke stronger emotions than literal expressions, thereby making texts more creative and engaging.

Language Modelling Sentence

Multidimensional Evaluation for Text Style Transfer Using ChatGPT

1 code implementation 26 Apr 2023 Huiyuan Lai, Antonio Toral, Malvina Nissim

We investigate the potential of ChatGPT as a multidimensional evaluator for the task of Text Style Transfer, alongside, and in comparison to, existing automatic metrics as well as human judgements.

Style Transfer Text Style Transfer

Are Character-level Translations Worth the Wait? Comparing ByT5 and mT5 for Machine Translation

1 code implementation 28 Feb 2023 Lukas Edman, Gabriele Sarti, Antonio Toral, Gertjan van Noord, Arianna Bisazza

Pretrained character-level and byte-level language models have been shown to be competitive with popular subword models across a range of Natural Language Processing (NLP) tasks.

Machine Translation NMT +1

Subword-Delimited Downsampling for Better Character-Level Translation

1 code implementation 2 Dec 2022 Lukas Edman, Antonio Toral, Gertjan van Noord

This new downsampling method not only outperforms existing downsampling methods, showing that downsampling characters can be done without sacrificing quality, but also leads to promising performance compared to subword models for translation.

Machine Translation Translation

Patching Leaks in the Charformer for Efficient Character-Level Generation

1 code implementation 27 May 2022 Lukas Edman, Antonio Toral, Gertjan van Noord

Character-based representations have important advantages over subword-based ones for morphologically rich languages.

Decoder NMT +1

DivEMT: Neural Machine Translation Post-Editing Effort Across Typologically Diverse Languages

1 code implementation 24 May 2022 Gabriele Sarti, Arianna Bisazza, Ana Guerberof Arenas, Antonio Toral

We publicly release the complete dataset including all collected behavioral data, to foster new research on the translation capabilities of NMT systems for typologically diverse languages.

Machine Translation NMT +1

The Importance of Context in Very Low Resource Language Modeling

no code implementations ICON 2021 Lukas Edman, Antonio Toral, Gertjan van Noord

This paper investigates very low resource language model pretraining, when less than 100 thousand sentences are available.

Language Modelling POS +1

Human Judgement as a Compass to Navigate Automatic Metrics for Formality Transfer

1 code implementation HumEval (ACL) 2022 Huiyuan Lai, Jiali Mao, Antonio Toral, Malvina Nissim

Although text style transfer has witnessed rapid development in recent years, there is as yet no established standard for evaluation, which is performed using several automatic metrics, lacking the possibility of always resorting to human judgement.

Navigate Style Transfer +1

Creativity in translation: machine translation as a constraint for literary texts

no code implementations 12 Apr 2022 Ana Guerberof Arenas, Antonio Toral

This article presents the results of a study involving the translation of a short story by Kurt Vonnegut from English to Catalan and Dutch using three modalities: machine-translation (MT), post-editing (PE) and translation without aid (HT).

Machine Translation Translation

Unsupervised Translation of German–Lower Sorbian: Exploring Training and Novel Transfer Methods on a Low-Resource Language

1 code implementation 24 Sep 2021 Lukas Edman, Ahmet Üstün, Antonio Toral, Gertjan van Noord

Lastly, we experiment with the order in which offline and online back-translation are used to train an unsupervised system, finding that using online back-translation first works better for DE→DSB by 2.76 BLEU.

Decoder Translation +1

Machine Translation of Novels in the Age of Transformer

1 code implementation 30 Nov 2020 Antonio Toral, Antoni Oliver, Pau Ribas Ballestín

In this chapter we build a machine translation (MT) system tailored to the literary domain, specifically to novels, based on the state-of-the-art architecture in neural MT (NMT), the Transformer (Vaswani et al., 2017), for the translation direction English-to-Catalan.

Machine Translation NMT +1

Character-level Representations Improve DRS-based Semantic Parsing Even in the Age of BERT

2 code implementations EMNLP 2020 Rik van Noord, Antonio Toral, Johan Bos

We combine character-level and contextual language model representations to improve performance on Discourse Representation Structure parsing.

DRS Parsing Language Modelling

Fine-grained Human Evaluation of Transformer and Recurrent Approaches to Neural Machine Translation for English-to-Chinese

1 code implementation EAMT 2020 Yuying Ye, Antonio Toral

This research presents a fine-grained human evaluation to compare the Transformer and recurrent approaches to neural machine translation (MT), on the translation direction English-to-Chinese.

Machine Translation NMT +1

Reassessing Claims of Human Parity and Super-Human Performance in Machine Translation at WMT 2019

1 code implementation EAMT 2020 Antonio Toral

We reassess the claims of human parity and super-human performance made at the news shared task of WMT 2019 for three translation directions: English-to-German, English-to-Russian and German-to-English.

Machine Translation Translation

A Set of Recommendations for Assessing Human-Machine Parity in Language Translation

1 code implementation 3 Apr 2020 Samuel Läubli, Sheila Castilho, Graham Neubig, Rico Sennrich, Qinlan Shen, Antonio Toral

The quality of machine translation has increased remarkably over the past years, to the degree that it was found to be indistinguishable from professional human translation in a number of empirical investigations.

Machine Translation Translation

Neural Machine Translation for English–Kazakh with Morphological Segmentation and Synthetic Data

no code implementations WS 2019 Antonio Toral, Lukas Edman, Galiya Yeshmagambetova, Jennifer Spenader

This paper presents the systems submitted by the University of Groningen to the English–Kazakh language pair (both translation directions) for the WMT 2019 news translation task.

Machine Translation Translation

Post-editese: an Exacerbated Translationese

1 code implementation WS 2019 Antonio Toral

We conduct a set of computational analyses in which we compare PE against HT on three different datasets that cover five translation directions with measures that address different translation universals and laws of translation: simplification, normalisation and interference.

Machine Translation Translation

The Effect of Translationese in Machine Translation Test Sets

1 code implementation WS 2019 Mike Zhang, Antonio Toral

The effect of translationese has been studied in the field of machine translation (MT), mostly with respect to training data.

Machine Translation Translation

Exploring Neural Methods for Parsing Discourse Representation Structures

1 code implementation TACL 2018 Rik van Noord, Lasha Abzianidze, Antonio Toral, Johan Bos

Neural methods have had several recent successes in semantic parsing, though they have yet to face the challenge of producing meaning representations based on formal semantics.

DRS Parsing

Attaining the Unattainable? Reassessing Claims of Human Parity in Neural Machine Translation

1 code implementation WS 2018 Antonio Toral, Sheila Castilho, Ke Hu, Andy Way

We reassess a recent study (Hassan et al., 2018) that claimed that machine translation (MT) has reached human parity for the translation of news from Chinese into English, using pairwise ranking and considering three variables that were not taken into account in that previous study: the language in which the source side of the test set was originally written, the translation proficiency of the evaluators, and the provision of inter-sentential context.

Machine Translation Translation

Quantitative Fine-Grained Human Evaluation of Machine Translation Systems: a Case Study on English to Croatian

1 code implementation 2 Feb 2018 Filip Klubička, Antonio Toral, Víctor M. Sánchez-Cartagena

This paper presents a quantitative fine-grained manual evaluation approach to comparing the performance of different machine translation (MT) systems.

Machine Translation Sentence +1

What Level of Quality can Neural Machine Translation Attain on Literary Text?

no code implementations 15 Jan 2018 Antonio Toral, Andy Way

Given the rise of a new approach to MT, Neural MT (NMT), and its promising performance on different text types, we assess the translation quality it can attain on what is perceived to be the greatest challenge for MT: literary text.

Machine Translation NMT +1

Fine-grained human evaluation of neural versus phrase-based machine translation

1 code implementation 14 Jun 2017 Filip Klubička, Antonio Toral, Víctor M. Sánchez-Cartagena

We compare three approaches to statistical machine translation (pure phrase-based, factored phrase-based and neural) by performing a fine-grained manual evaluation via error annotation of the systems' outputs.

Machine Translation Translation

Producing Monolingual and Parallel Web Corpora at the Same Time - SpiderLing and Bitextor's Love Affair

no code implementations LREC 2016 Nikola Ljubešić, Miquel Esplà-Gomis, Antonio Toral, Sergio Ortiz Rojas, Filip Klubička

This paper presents an approach for building large monolingual corpora and, at the same time, extracting parallel data by crawling the top-level domain of a given language of interest.

TweetMT: A Parallel Microblog Corpus

no code implementations LREC 2016 Iñaki San Vicente, Iñaki Alegría, Cristina España-Bonet, Pablo Gamallo, Hugo Gonçalo Oliveira, Eva Martínez Garcia, Antonio Toral, Arkaitz Zubiaga, Nora Aranberri

We introduce TweetMT, a parallel corpus of tweets in four language pairs that combine five languages (Spanish from/to Basque, Catalan, Galician and Portuguese), all of which have an official status in the Iberian Peninsula.

Machine Translation Translation

Enhancing Cross-border EU E-commerce through Machine Translation: Needed Language Resources, Challenges and Opportunities

no code implementations LREC 2016 Meritxell Fernández Barrera, Vladimir Popescu, Antonio Toral, Federico Gaspari, Khalid Choukri

This paper discusses the role that statistical machine translation (SMT) can play in the development of cross-border EU e-commerce, by highlighting extant obstacles and identifying relevant technologies to overcome them.

Machine Translation Translation

Towards a User-Friendly Platform for Building Language Resources based on Web Services

no code implementations LREC 2012 Marc Poch, Antonio Toral, Olivier Hamon, Valeria Quochi, Núria Bel

This paper presents the platform developed in the PANACEA project, a distributed factory that automates the stages involved in the acquisition, production, updating and maintenance of Language Resources required by Machine Translation and other Language Technologies.

Machine Translation Translation
