Search Results for author: David Vilar

Found 21 papers, 6 papers with code

The taraXÜ corpus of human-annotated machine translations

no code implementations LREC 2014 Eleftherios Avramidis, Aljoscha Burchardt, Sabine Hunsicker, Maja Popović, Cindy Tscherwinka, David Vilar, Hans Uszkoreit

Human translators are the key to evaluating machine translation (MT) quality, and also to answering the so far unanswered question of when and how to use MT in professional translation workflows.

General Classification Machine Translation +1

Sockeye: A Toolkit for Neural Machine Translation

16 code implementations 15 Dec 2017 Felix Hieber, Tobias Domhan, Michael Denkowski, David Vilar, Artem Sokolov, Ann Clifton, Matt Post

Written in Python and built on MXNet, the toolkit offers scalable training and inference for the three most prominent encoder-decoder architectures: attentional recurrent neural networks, self-attentional transformers, and fully convolutional networks.

Machine Translation NMT +1
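
As a usage illustration for the toolkit described above, here is a minimal sketch of a training-plus-translation run driven from Python. The file names (train.src, train.trg, dev.src, dev.trg, model_dir) are placeholders, and the flags follow the Sockeye command line of that era; verify them against `python -m sockeye.train --help` for your installed version.

```python
# Minimal sketch of driving Sockeye from Python. File names are
# placeholders; flags should be checked against the installed version.
import subprocess

# Train a model on pre-tokenized parallel data.
subprocess.run([
    "python", "-m", "sockeye.train",
    "--source", "train.src", "--target", "train.trg",
    "--validation-source", "dev.src", "--validation-target", "dev.trg",
    "--output", "model_dir",
], check=True)

# Translate new input with the trained model.
subprocess.run([
    "python", "-m", "sockeye.translate",
    "--models", "model_dir",
    "--input", "test.src", "--output", "test.hyp",
], check=True)
```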

Fast Lexically Constrained Decoding with Dynamic Beam Allocation for Neural Machine Translation

no code implementations NAACL 2018 Matt Post, David Vilar

The end-to-end nature of neural machine translation (NMT) removes many ways of manually guiding the translation process that were available in older paradigms.

Machine Translation NMT +1
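
The core idea behind the dynamic beam allocation of this paper can be sketched as follows: a fixed-size beam is partitioned into "banks", one per number of constraint tokens already satisfied, so hypotheses with different constraint coverage never compete for the same slots. This is an illustrative reconstruction, not the authors' code; the `allocate_banks` helper and its tie-breaking rule are assumptions.

```python
# Illustrative sketch of dynamic beam allocation: split a beam of size k
# into banks indexed by the number of constraints a hypothesis has met.

def allocate_banks(beam_size: int, num_constraints: int) -> list[int]:
    """Distribute beam_size slots over banks 0..num_constraints."""
    num_banks = num_constraints + 1
    slots, remainder = divmod(beam_size, num_banks)
    # Spread any remainder over the banks closest to full satisfaction,
    # mildly favoring hypotheses that have met more constraints (an
    # assumed tie-breaking choice, not necessarily the paper's).
    return [slots + (1 if i >= num_banks - remainder else 0)
            for i in range(num_banks)]

print(allocate_banks(beam_size=5, num_constraints=2))  # -> [1, 2, 2]
```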

Bandits Don't Follow Rules: Balancing Multi-Facet Machine Translation with Multi-Armed Bandits

no code implementations 13 Oct 2021 Julia Kreutzer, David Vilar, Artem Sokolov

Training data for machine translation (MT) is often sourced from a multitude of large corpora that are multi-faceted in nature, e.g., containing content from multiple domains or of different levels of quality or complexity.

Machine Translation Multi-Armed Bandits +1
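
For orientation, here is a minimal sketch of one standard multi-armed bandit algorithm (textbook EXP3, not the authors' implementation) applied to this data-selection setting: each corpus is an arm, and the reward would be some bounded measure of how useful the sampled batch turned out to be, e.g. a dev-loss improvement scaled to [0, 1].

```python
import math
import random

# Generic EXP3 sampler for choosing which corpus the next batch comes from.
class Exp3:
    def __init__(self, num_arms: int, gamma: float = 0.1):
        self.gamma = gamma
        self.weights = [1.0] * num_arms

    def probs(self) -> list[float]:
        total = sum(self.weights)
        k = len(self.weights)
        return [(1 - self.gamma) * w / total + self.gamma / k
                for w in self.weights]

    def pick(self) -> int:
        return random.choices(range(len(self.weights)), self.probs())[0]

    def update(self, arm: int, reward: float) -> None:
        # Importance-weighted reward keeps the estimate unbiased.
        p = self.probs()[arm]
        self.weights[arm] *= math.exp(
            self.gamma * reward / (p * len(self.weights)))

bandit = Exp3(num_arms=3)          # e.g. three corpora of varying quality
corpus = bandit.pick()             # sample the next batch from this corpus
bandit.update(corpus, reward=0.7)  # feed back the observed usefulness
```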

Scaling Up Influence Functions

2 code implementations 6 Dec 2021 Andrea Schioppa, Polina Zablotskaia, David Vilar, Artem Sokolov

We address efficient calculation of influence functions for tracking predictions back to the training data.

Image Classification
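
The quantity being scaled up here is the classic influence function, influence(z_train, z_test) = -∇L(z_test)ᵀ H⁻¹ ∇L(z_train). The toy NumPy sketch below forms the Hessian explicitly, which is exactly what becomes infeasible at scale; the paper's contribution is approximating the inverse-Hessian product cheaply (via Arnoldi iteration in a low-dimensional subspace). The random gradients and Hessian are placeholders.

```python
import numpy as np

# Toy influence-function computation with an explicit Hessian.
rng = np.random.default_rng(0)
d = 5
A = rng.normal(size=(d, d))
H = A @ A.T + d * np.eye(d)        # stand-in positive-definite Hessian
grad_train = rng.normal(size=d)    # gradient of loss at a training point
grad_test = rng.normal(size=d)     # gradient of loss at a test point

# influence(z_train, z_test) = -grad_test^T H^{-1} grad_train
influence = -grad_test @ np.linalg.solve(H, grad_train)
print(influence)
```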

Prompting PaLM for Translation: Assessing Strategies and Performance

no code implementations 16 Nov 2022 David Vilar, Markus Freitag, Colin Cherry, Jiaming Luo, Viresh Ratnakar, George Foster

Large language models (LLMs) that have been trained on multilingual but not parallel text exhibit a remarkable ability to translate between languages.

Language Modelling Machine Translation +1
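
Translations are typically elicited from such a model via a few-shot prompt. The sketch below shows one plausible template; the exact format and example-selection strategies are precisely what the paper assesses, so treat the `build_prompt` helper and the "French:/English:" layout as assumptions for illustration.

```python
# One plausible few-shot translation prompt layout (illustrative only).

def build_prompt(examples: list[tuple[str, str]], source: str) -> str:
    shots = "\n".join(
        f"French: {src}\nEnglish: {tgt}" for src, tgt in examples
    )
    return f"{shots}\nFrench: {source}\nEnglish:"

prompt = build_prompt(
    [("Bonjour le monde.", "Hello world.")],
    "Le chat dort.",
)
print(prompt)
```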

There's no Data Like Better Data: Using QE Metrics for MT Data Filtering

no code implementations 9 Nov 2023 Jan-Thorsten Peter, David Vilar, Daniel Deutsch, Mara Finkelstein, Juraj Juraska, Markus Freitag

Quality Estimation (QE), the evaluation of machine translation output without the need for explicit references, has seen large improvements in recent years with the use of neural metrics.

Machine Translation NMT +2
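
The filtering recipe itself is simple to sketch: score every candidate sentence pair with a reference-free QE metric and keep only the best-scoring fraction for training. In the sketch below, `qe_score` is a crude length-ratio stand-in for a neural QE metric (the paper studies which actual metrics work best), and `filter_by_qe` is an assumed helper name.

```python
# Sketch of QE-based data filtering with a placeholder scoring function.

def qe_score(source: str, translation: str) -> float:
    # Placeholder: a real system would call a neural QE metric here.
    # A length-ratio heuristic stands in for illustration.
    ratio = len(translation.split()) / max(len(source.split()), 1)
    return -abs(1.0 - ratio)

def filter_by_qe(pairs: list[tuple[str, str]], keep_fraction: float = 0.5):
    scored = sorted(pairs, key=lambda p: qe_score(*p), reverse=True)
    return scored[: int(len(scored) * keep_fraction)]

pairs = [("un exemple", "an example"),
         ("bonjour", "hello hello hello hello")]
print(filter_by_qe(pairs, keep_fraction=0.5))
```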

Controlling Machine Translation for Multiple Attributes with Additive Interventions

no code implementations EMNLP 2021 Andrea Schioppa, David Vilar, Artem Sokolov, Katja Filippova

Fine-grained control of machine translation (MT) outputs along multiple attributes is critical for many modern MT applications and is a requirement for gaining users’ trust.

Attribute Machine Translation +2
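
Roughly, the additive-intervention idea attaches a learned embedding to each controllable attribute, scales it by the requested intensity, and adds the result onto the source-side representations. The NumPy sketch below illustrates that arithmetic only; the shapes, names, and injection point are assumptions, not the paper's exact architecture.

```python
import numpy as np

# Additive interventions: scaled attribute embeddings added to encoder states.
d_model, seq_len = 8, 4
rng = np.random.default_rng(0)

encoder_states = rng.normal(size=(seq_len, d_model))
attribute_embeddings = {           # learned in practice; random stand-ins here
    "formality": rng.normal(size=d_model),
    "length": rng.normal(size=d_model),
}

controls = {"formality": 0.8, "length": -0.3}  # requested per-attribute intensities
intervention = sum(v * attribute_embeddings[k] for k, v in controls.items())
controlled_states = encoder_states + intervention  # broadcast over positions
print(controlled_states.shape)
```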

Sockeye 2: A Toolkit for Neural Machine Translation

1 code implementation EAMT 2020 Felix Hieber, Tobias Domhan, Michael Denkowski, David Vilar

We present Sockeye 2, a modernized and streamlined version of the Sockeye neural machine translation (NMT) toolkit.

Machine Translation NMT +1

A Statistical Extension of Byte-Pair Encoding

1 code implementation ACL (IWSLT) 2021 David Vilar, Marcello Federico

Sub-word segmentation is currently a standard preprocessing step for training neural machine translation (MT) systems and for other NLP tasks.

Data Compression Machine Translation +1
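
As background for the extension, here is a minimal sketch of standard byte-pair encoding, the baseline being extended: repeatedly merge the most frequent adjacent symbol pair in the vocabulary. This is textbook BPE for illustration, not the paper's statistical variant.

```python
from collections import Counter

# Minimal BPE merge learning over a word-frequency dictionary.
def learn_bpe(words: dict[str, int], num_merges: int) -> list[tuple[str, str]]:
    vocab = {tuple(w): c for w, c in words.items()}  # word as symbol tuple
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, count in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += count
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # most frequent adjacent pair
        merges.append(best)
        merged = {}
        for symbols, count in vocab.items():
            out, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            merged[tuple(out)] = count
        vocab = merged
    return merges

print(learn_bpe({"lower": 5, "lowest": 2}, num_merges=3))
```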

A Natural Diet: Towards Improving Naturalness of Machine Translation Output

no code implementations Findings (ACL) 2022 Markus Freitag, David Vilar, David Grangier, Colin Cherry, George Foster

In this work we propose a method for training MT systems to achieve a more natural style, i.e., mirroring the style of text originally written in the target language.

Machine Translation Sentence +1
