Search Results for author: Mikel L. Forcada

Found 23 papers, 2 papers with code

A multi-source approach for Breton–French hybrid machine translation

no code implementations EAMT 2020 Víctor M. Sánchez-Cartagena, Mikel L. Forcada, Felipe Sánchez-Martínez

Corpus-based approaches to machine translation (MT) have difficulties when the amount of parallel corpora to use for training is scarce, especially if the languages involved in the translation are highly inflected.

Data Augmentation Machine Translation +2

MaCoCu: Massive collection and curation of monolingual and bilingual data: focus on under-resourced languages

no code implementations EAMT 2022 Marta Bañón, Miquel Esplà-Gomis, Mikel L. Forcada, Cristian García-Romero, Taja Kuzman, Nikola Ljubešić, Rik van Noord, Leopoldo Pla Sempere, Gema Ramírez-Sánchez, Peter Rupnik, Vít Suchomel, Antonio Toral, Tobias van der Werff, Jaume Zaragoza

We introduce the project “MaCoCu: Massive collection and curation of monolingual and bilingual data: focus on under-resourced languages”, funded by the Connecting Europe Facility, which is aimed at building monolingual and parallel corpora for under-resourced European languages.

An English-Swahili parallel corpus and its use for neural machine translation in the news domain

no code implementations EAMT 2020 Felipe Sánchez-Martínez, Víctor M. Sánchez-Cartagena, Juan Antonio Pérez-Ortiz, Mikel L. Forcada, Miquel Esplà-Gomis, Andrew Secker, Susie Coleman, Julie Wall

This paper describes our approach to create a neural machine translation system to translate between English and Swahili (both directions) in the news domain, as well as the process we followed to crawl the necessary parallel corpora from the Internet.

Machine Translation Translation

Findings of the WMT 2018 Shared Task on Parallel Corpus Filtering

no code implementations WS 2018 Philipp Koehn, Huda Khayrallah, Kenneth Heafield, Mikel L. Forcada

We posed the shared task of assigning sentence-level quality scores for a very noisy corpus of sentence pairs crawled from the web, with the goal of sub-selecting 1% and 10% of high-quality data to be used to train machine translation systems.

Machine Translation Outlier Detection +2

A Maturity Model for Public Administration as Open Translation Data Providers

no code implementations 7 Jul 2016 Núria Bel, Mikel L. Forcada, Asunción Gómez-Pérez

Any public administration that produces translation data can be a provider of useful reusable data to meet its own translation needs and the ones of other public organizations and private companies that work with texts of the same domain.

Machine Translation Management +1

A Light Sliding-Window Part-of-Speech Tagger for the Apertium Free/Open-Source Machine Translation Platform

no code implementations 18 Sep 2015 Gang Chen, Mikel L. Forcada

This paper describes a free/open-source implementation of the light sliding-window (LSW) part-of-speech tagger for the Apertium free/open-source machine translation platform.

Machine Translation Translation

Inferring Shallow-Transfer Machine Translation Rules from Small Parallel Corpora

no code implementations 15 Jan 2014 Felipe Sánchez-Martínez, Mikel L. Forcada

This paper describes a method for the automatic inference of structural transfer rules to be used in a shallow-transfer machine translation (MT) system from small parallel corpora.

Machine Translation Sentence +2
