no code implementations • AMTA 2022 • Nikita Teslenko Grygoryev, Mercedes Garcia Martinez, Francisco Casacuberta Nolla, Amando Estela Pastor, Manuel Herranz
To evaluate the quality of the NMT models, the models were first compared in a quantitative analysis using several standard automatic machine translation metrics, together with measurements of the time spent and the amount of text generated, with a view to practical use in the language industry.
no code implementations • LEGAL (LREC) 2022 • Victoria Arranz, Khalid Choukri, Montse Cuadros, Aitor García Pablos, Lucie Gianola, Cyril Grouin, Manuel Herranz, Patrick Paroubek, Pierre Zweigenbaum
This paper presents the outcomes of the MAPA project, a set of annotated corpora for 24 languages of the European Union and an open-source customisable toolkit able to detect and substitute sensitive information in text documents from any domain, using state-of-the-art, deep learning-based named entity recognition techniques.
no code implementations • EAMT 2020 • Ēriks Ajausks, Victoria Arranz, Laurent Bié, Aleix Cerdà-i-Cucó, Khalid Choukri, Montse Cuadros, Hans Degroote, Amando Estela, Thierry Etchegoyhen, Mercedes García-Martínez, Aitor García-Pablos, Manuel Herranz, Alejandro Kohan, Maite Melero, Mike Rosner, Roberts Rozis, Patrick Paroubek, Artūrs Vasiļevskis, Pierre Zweigenbaum
We describe the MAPA project, funded under the Connecting Europe Facility programme, whose goal is the development of an open-source de-identification toolkit for all official European Union languages.
no code implementations • EAMT 2022 • Eirini Kaldeli, Mercedes García-Martínez, Antoine Isaac, Paolo Sebastiano Scalia, Arne Stabenau, Iván Lena Almor, Carmen Grau Lacal, Martín Barroso Ordóñez, Amando Estela, Manuel Herranz
Europeana Translate is a project funded under the Connecting Europe Facility with the objective of taking advantage of state-of-the-art machine translation to increase the multilinguality of resources in the cultural heritage domain.
no code implementations • EAMT 2020 • Laurent Bié, Aleix Cerdà-i-Cucó, Hans Degroote, Amando Estela, Mercedes García-Martínez, Manuel Herranz, Alejandro Kohan, Maite Melero, Tony O’Dowd, Sinéad O’Gorman, Mārcis Pinnis, Roberts Rozis, Riccardo Superbo, Artūrs Vasiļevskis
The Neural Translation for the European Union (NTEU) project aims to build a neural engine farm with all European official language combinations for eTranslation, without the necessity to use a high-resourced language as a pivot.
no code implementations • EAMT 2020 • Miguel Domingo, Mercedes García-Martínez, Álvaro Peris, Alexandre Helle, Amando Estela, Laurent Bié, Francisco Casacuberta, Manuel Herranz
Adaptive neural machine translation systems, able to incrementally update the underlying models under an online learning regime, have proven useful for improving the efficiency of this workflow.
no code implementations • 14 Nov 2022 • Francisco Casacuberta, Alexandru Ceausu, Khalid Choukri, Miltos Deligiannis, Miguel Domingo, Mercedes García-Martínez, Manuel Herranz, Guillaume Jacquet, Vassilis Papavassiliou, Stelios Piperidis, Prokopis Prokopidis, Dimitris Roussis, Marwa Hadj Salah
This work presents the results of the machine translation (MT) task from the Covid-19 MLIA @ Eval initiative, a community effort to improve the generation of MT systems focused on the current Covid-19 crisis.
no code implementations • LREC 2020 • Mercedes García-Martínez, Manuel Herranz, Amando Estela, Ángela Franco, Laurent Bié
Eco is Pangeanic's customer portal for generic or specialized translation services (machine translation and post-editing, generic API MT and custom API MT).
no code implementations • WS 2019 • Sheila Castilho, Natália Resende, Federico Gaspari, Andy Way, Tony O’Dowd, Marek Mazur, Manuel Herranz, Alex Helle, Gema Ramírez-Sánchez, Víctor Sánchez-Cartagena, Mārcis Pinnis, Valters Šics
no code implementations • WS 2019 • Miguel Domingo, Mercedes García-Martínez, Álvaro Peris, Alexandre Helle, Amando Estela, Laurent Bié, Francisco Casacuberta, Manuel Herranz
A common use of machine translation in the industry is providing initial translation hypotheses, which are later supervised and post-edited by a human expert.
no code implementations • 20 Dec 2018 • Miguel Domingo, Mercedes García-Martínez, Alexandre Helle, Francisco Casacuberta, Manuel Herranz
Separating punctuation and splitting tokens into words or subwords has proven helpful for reducing vocabulary size and increasing the number of examples of each word, which improves translation quality.