no code implementations • IWSLT (EMNLP) 2018 • Víctor M. Sánchez-Cartagena
This paper presents Prompsit Language Engineering’s submission to the IWSLT 2018 Low Resource Machine Translation task.
no code implementations • WMT (EMNLP) 2020 • Miquel Esplà-Gomis, Víctor M. Sánchez-Cartagena, Jaume Zaragoza-Bernabeu, Felipe Sánchez-Martínez
This paper describes the joint submission of Universitat d’Alacant and Prompsit Language Engineering to the WMT 2020 shared task on parallel corpus filtering.
no code implementations • EAMT 2020 • Víctor M. Sánchez-Cartagena, Mikel L. Forcada, Felipe Sánchez-Martínez
Corpus-based approaches to machine translation (MT) struggle when parallel corpora for training are scarce, especially if the languages involved in the translation are highly inflected.
no code implementations • EAMT 2020 • Felipe Sánchez-Martínez, Víctor M. Sánchez-Cartagena, Juan Antonio Pérez-Ortiz, Mikel L. Forcada, Miquel Esplà-Gomis, Andrew Secker, Susie Coleman, Julie Wall
This paper describes our approach to creating a neural machine translation system that translates between English and Swahili (both directions) in the news domain, as well as the process we followed to crawl the necessary parallel corpora from the Internet.
2 code implementations • 11 Apr 2024 • Andrés Lou, Juan Antonio Pérez-Ortiz, Felipe Sánchez-Martínez, Víctor M. Sánchez-Cartagena
The Mayan languages comprise a language family with an ancient history, millions of speakers, and immense cultural value that nevertheless remains severely underrepresented in terms of resources and global exposure.
1 code implementation • 29 Jan 2024 • Víctor M. Sánchez-Cartagena, Miquel Esplà-Gomis, Juan Antonio Pérez-Ortiz, Felipe Sánchez-Martínez
When few parallel sentences are available to train a neural machine translation system, a common practice is to generate new synthetic training samples from them.
no code implementations • 29 Jan 2024 • Víctor M. Sánchez-Cartagena, Juan Antonio Pérez-Ortiz, Felipe Sánchez-Martínez
The study covers eight language pairs, different training corpus sizes, two architectures, and three types of annotation: dummy tags (with no linguistic information at all), part-of-speech tags, and morpho-syntactic description tags, which consist of part of speech and morphological features.
no code implementations • 29 Jan 2024 • Juan Ramón Rico-Juan, Víctor M. Sánchez-Cartagena, Jose J. Valero-Mas, Antonio Javier Gallego
Online Judge (OJ) systems are typically considered within programming-related courses as they yield fast and objective assessments of the code developed by the students.
Explainable Artificial Intelligence (XAI)
1 code implementation • 16 Jan 2024 • Miquel Esplà-Gomis, Víctor M. Sánchez-Cartagena, Juan Antonio Pérez-Ortiz, Felipe Sánchez-Martínez
The paper presents an automatic evaluation of these techniques on four language pairs that shows that our approach can successfully exploit monolingual texts in a TM-based CAT environment, increasing the amount of useful translation proposals, and that our neural model for estimating the post-editing effort enables the combination of translation proposals obtained from monolingual corpora and from TMs in the usual way.
1 code implementation • EMNLP 2021 • Víctor M. Sánchez-Cartagena, Miquel Esplà-Gomis, Juan Antonio Pérez-Ortiz, Felipe Sánchez-Martínez
Many DA approaches aim at expanding the support of the empirical data distribution by generating new sentence pairs that contain infrequent words, thus making it closer to the true data distribution of parallel sentences.
Data Augmentation • Low-Resource Neural Machine Translation
1 code implementation • 2 Feb 2018 • Filip Klubička, Antonio Toral, Víctor M. Sánchez-Cartagena
This paper presents a quantitative fine-grained manual evaluation approach to comparing the performance of different machine translation (MT) systems.
1 code implementation • 14 Jun 2017 • Filip Klubička, Antonio Toral, Víctor M. Sánchez-Cartagena
We compare three approaches to statistical machine translation (pure phrase-based, factored phrase-based and neural) by performing a fine-grained manual evaluation via error annotation of the systems' outputs.
1 code implementation • EACL 2017 • Antonio Toral, Víctor M. Sánchez-Cartagena
We aim to shed light on the strengths and weaknesses of the newly introduced neural machine translation paradigm.