no code implementations • WMT (EMNLP) 2020 • Ricardo Rei, Craig Stewart, Ana C Farinha, Alon Lavie
We present the contribution of the Unbabel team to the WMT 2020 Shared Task on Metrics.
no code implementations • WMT (EMNLP) 2021 • Markus Freitag, Ricardo Rei, Nitika Mathur, Chi-kiu Lo, Craig Stewart, George Foster, Alon Lavie, Ondřej Bojar
Contrary to previous years’ editions, this year we acquired our own human ratings based on expert-based human evaluation via Multidimensional Quality Metrics (MQM).
no code implementations • WMT (EMNLP) 2021 • Chrysoula Zerva, Daan van Stigt, Ricardo Rei, Ana C Farinha, Pedro Ramos, José G. C. de Souza, Taisiya Glushkova, Miguel Vera, Fabio Kepler, André F. T. Martins
We present the joint contribution of IST and Unbabel to the WMT 2021 Shared Task on Quality Estimation.
1 code implementation • WMT (EMNLP) 2021 • Ricardo Rei, Ana C Farinha, Chrysoula Zerva, Daan van Stigt, Craig Stewart, Pedro Ramos, Taisiya Glushkova, André F. T. Martins, Alon Lavie
In this paper, we present the joint contribution of Unbabel and IST to the WMT 2021 Metrics Shared Task.
no code implementations • EAMT 2022 • Ricardo Rei, Ana C Farinha, José G.C. de Souza, Pedro G. Ramos, André F.T. Martins, Luisa Coheur, Alon Lavie
In recent years, several neural fine-tuned machine translation evaluation metrics such as COMET and BLEURT have been proposed.
no code implementations • EAMT 2022 • José G.C. de Souza, Ricardo Rei, Ana C. Farinha, Helena Moniz, André F. T. Martins
This paper presents QUARTZ, QUality-AwaRe machine Translation, a project led by Unbabel which aims at developing machine translation systems that are more robust and produce fewer critical errors.
no code implementations • 10 Oct 2024 • Sweta Agrawal, José G. C. de Souza, Ricardo Rei, António Farinhas, Gonçalo Faria, Patrick Fernandes, Nuno M Guerreiro, Andre Martins
Alignment with human preferences is an important step in developing accurate and safe large language models.
no code implementations • 30 Sep 2024 • Hippolyte Gisserot-Boukhlef, Ricardo Rei, Emmanuel Malherbe, Céline Hudelot, Pierre Colombo, Nuno M. Guerreiro
Neural metrics for machine translation (MT) evaluation have become increasingly prominent due to their superior correlation with human judgments compared to traditional lexical metrics.
no code implementations • 24 Sep 2024 • Pedro Henrique Martins, Patrick Fernandes, João Alves, Nuno M. Guerreiro, Ricardo Rei, Duarte M. Alves, José Pombal, Amin Farajian, Manuel Faysse, Mateusz Klimaszewski, Pierre Colombo, Barry Haddow, José G. C. de Souza, Alexandra Birch, André F. T. Martins
The quality of open-weight LLMs has seen significant improvement, yet they remain predominantly focused on English.
no code implementations • 27 Jun 2024 • Marcos Treviso, Nuno M. Guerreiro, Sweta Agrawal, Ricardo Rei, José Pombal, Tania Vaz, Helena Wu, Beatriz Silva, Daan van Stigt, André F. T. Martins
While machine translation (MT) systems are achieving increasingly strong performance on benchmarks, they often produce translations with errors and anomalies.
1 code implementation • 28 May 2024 • Gonçalo R. A. Faria, Sweta Agrawal, António Farinhas, Ricardo Rei, José G. C. de Souza, André F. T. Martins
An important challenge in machine translation (MT) is to generate high-quality and diverse translations.
no code implementations • 28 May 2024 • Sweta Agrawal, António Farinhas, Ricardo Rei, André F. T. Martins
Automatic metrics for evaluating translation quality are typically validated by measuring how well they correlate with human assessments.
1 code implementation • 13 Mar 2024 • Sweta Agrawal, Amin Farajian, Patrick Fernandes, Ricardo Rei, André F. T. Martins
Our findings show that augmenting neural learned metrics with contextual information helps improve correlation with human judgments in the reference-free scenario and when evaluating translations in out-of-English settings.
2 code implementations • 27 Feb 2024 • Duarte M. Alves, José Pombal, Nuno M. Guerreiro, Pedro H. Martins, João Alves, Amin Farajian, Ben Peters, Ricardo Rei, Patrick Fernandes, Sweta Agrawal, Pierre Colombo, José G. C. de Souza, André F. T. Martins
While general-purpose large language models (LLMs) demonstrate proficiency on multiple tasks within the domain of translation, approaches based on open LLMs are competitive only when specializing on a single task.
1 code implementation • 1 Feb 2024 • Manuel Faysse, Patrick Fernandes, Nuno M. Guerreiro, António Loison, Duarte M. Alves, Caio Corro, Nicolas Boizard, João Alves, Ricardo Rei, Pedro H. Martins, Antoni Bigata Casademunt, François Yvon, André F. T. Martins, Gautier Viaud, Céline Hudelot, Pierre Colombo
We introduce CroissantLLM, a 1.3B language model pretrained on a set of 3T English and French tokens, to bring to the research and industrial community a high-performance, fully open-sourced bilingual model that runs swiftly on consumer-grade local hardware.
no code implementations • 16 Nov 2023 • Jiayi Wang, David Ifeoluwa Adelani, Sweta Agrawal, Marek Masiak, Ricardo Rei, Eleftheria Briakou, Marine Carpuat, Xuanli He, Sofia Bourhim, Andiswa Bukula, Muhidin Mohamed, Temitayo Olatoye, Tosin Adewumi, Hamam Mokayed, Christine Mwase, Wangui Kimotho, Foutse Yuehgoh, Anuoluwapo Aremu, Jessica Ojo, Shamsuddeen Hassan Muhammad, Salomey Osei, Abdul-Hakeem Omotayo, Chiamaka Chukwuneke, Perez Ogayo, Oumaima Hourrane, Salma El Anigri, Lolwethu Ndolela, Thabiso Mangwana, Shafie Abdi Mohamed, Ayinde Hassan, Oluwabusayo Olufunke Awoyomi, Lama Alkhaled, sana al-azzawi, Naome A. Etori, Millicent Ochieng, Clemencia Siro, Samuel Njoroge, Eric Muchiri, Wangari Kimotho, Lyse Naomi Wamba Momo, Daud Abolade, Simbiat Ajao, Iyanuoluwa Shode, Ricky Macharm, Ruqayya Nasir Iro, Saheed S. Abdullahi, Stephen E. Moore, Bernard Opoku, Zainab Akinjobi, Abeeb Afolabi, Nnaemeka Obiefuna, Onyekachi Raphael Ogbu, Sam Brian, Verrah Akinyi Otiende, Chinedu Emmanuel Mbonu, Sakayo Toadoum Sari, Yao Lu, Pontus Stenetorp
Despite the recent progress on scaling multilingual machine translation (MT) to several under-resourced African languages, accurately measuring this progress remains challenging, since evaluation is often performed on n-gram matching metrics such as BLEU, which typically show a weaker correlation with human judgments.
no code implementations • 20 Oct 2023 • Duarte M. Alves, Nuno M. Guerreiro, João Alves, José Pombal, Ricardo Rei, José G. C. de Souza, Pierre Colombo, André F. T. Martins
Experiments on 10 language pairs show that our proposed approach recovers the original few-shot capabilities while keeping the added benefits of finetuning.
2 code implementations • 16 Oct 2023 • Nuno M. Guerreiro, Ricardo Rei, Daan van Stigt, Luisa Coheur, Pierre Colombo, André F. T. Martins
Widely used learned metrics for machine translation evaluation, such as COMET and BLEURT, estimate the quality of a translation hypothesis by providing a single sentence-level score.
1 code implementation • 21 Sep 2023 • Ricardo Rei, Nuno M. Guerreiro, José Pombal, Daan van Stigt, Marcos Treviso, Luisa Coheur, José G. C. de Souza, André F. T. Martins
Our team participated in all tasks: sentence- and word-level quality prediction (task 1) and fine-grained error span detection (task 2).
1 code implementation • 19 May 2023 • Ricardo Rei, Nuno M. Guerreiro, Marcos Treviso, Luisa Coheur, Alon Lavie, André F. T. Martins
Neural metrics for machine translation evaluation, such as COMET, exhibit significant improvements in their correlation with human judgments, as compared to traditional metrics based on lexical overlap, such as BLEU.
1 code implementation • 13 Sep 2022 • Ricardo Rei, Marcos Treviso, Nuno M. Guerreiro, Chrysoula Zerva, Ana C. Farinha, Christine Maroti, José G. C. de Souza, Taisiya Glushkova, Duarte M. Alves, Alon Lavie, Luisa Coheur, André F. T. Martins
We present the joint contribution of IST and Unbabel to the WMT 2022 Shared Task on Quality Estimation (QE).
no code implementations • 24 Jul 2022 • Isabel Dias, Ricardo Rei, Patrícia Pereira, Luisa Coheur
In this paper, we propose an end-to-end sentiment-aware conversational agent based on two models: a reply sentiment prediction model, which leverages the context of the dialogue to predict an appropriate sentiment for the agent to express in its reply; and a text generation model, which is conditioned on the predicted sentiment and the context of the dialogue, to produce a reply that is both context and sentiment appropriate.
1 code implementation • NAACL 2022 • Patrick Fernandes, António Farinhas, Ricardo Rei, José G. C. de Souza, Perez Ogayo, Graham Neubig, André F. T. Martins
Despite the progress in machine translation quality estimation and evaluation in recent years, decoding in neural machine translation (NMT) is mostly oblivious to this and centers around finding the most probable translation according to the model (MAP decoding), approximated with beam search.
1 code implementation • 13 Apr 2022 • Chrysoula Zerva, Taisiya Glushkova, Ricardo Rei, André F. T. Martins
Trainable evaluation metrics for machine translation (MT) exhibit strong correlation with human judgements, but they are often hard to interpret and might produce unreliable scores under noisy or out-of-domain data.
1 code implementation • 9 Mar 2022 • Vânia Mendonça, Ricardo Rei, Luisa Coheur, Alberto Sardinha
Moreover, since we do not know in advance which query strategy will be the most adequate for a certain language pair and set of Machine Translation models, we propose to dynamically combine multiple strategies using prediction with expert advice.
2 code implementations • Findings (EMNLP) 2021 • Taisiya Glushkova, Chrysoula Zerva, Ricardo Rei, André F. T. Martins
Several neural-based metrics have been recently proposed to evaluate machine translation quality.
no code implementations • ACL 2021 • Ricardo Rei, Ana C Farinha, Craig Stewart, Luisa Coheur, Alon Lavie
We present MT-Telescope, a visualization platform designed to facilitate comparative analysis of the output quality of two Machine Translation (MT) systems.
1 code implementation • ACL 2021 • Vânia Mendonça, Ricardo Rei, Luisa Coheur, Alberto Sardinha, Ana Lúcia Santos
In Machine Translation, assessing the quality of a large number of automatic translations can be challenging.
1 code implementation • EACL 2021 • Bruno Jardim, Ricardo Rei, Mariana S. C. Almeida
The segmentation of emails into functional zones (also dubbed email zoning) is a relevant preprocessing step for most NLP tasks that deal with emails.
1 code implementation • 29 Oct 2020 • Ricardo Rei, Craig Stewart, Catarina Farinha, Alon Lavie
Overall, our systems achieve strong results for all language pairs on previous test sets and in many cases set a new state-of-the-art.
1 code implementation • EMNLP 2020 • Ricardo Rei, Craig Stewart, Ana C Farinha, Alon Lavie
We present COMET, a neural framework for training multilingual machine translation evaluation models which obtains new state-of-the-art levels of correlation with human judgements.
1 code implementation • 27 Aug 2020 • Jose David Bermudez Castro, Ricardo Rei, Jose E. Ruiz, Pedro Achanccaray Diaz, Smith Arauco Canchumuni, Cristian Muñoz Villalobos, Felipe Borges Coelho, Leonardo Forero Mendoza, Marco Aurelio C. Pacheco
This work provides a fast detection system of COVID-19 characteristics in X-Ray images based on deep learning (DL) techniques.