Search Results for author: Olatz Perez-de-Viñaspre

Found 7 papers, 2 papers with code

Unsupervised Machine Translation in Real-World Scenarios

no code implementations LREC 2022 Ona de Gibert Bonet, Iakes Goenaga, Jordi Armengol-Estapé, Olatz Perez-de-Viñaspre, Carla Parra Escartín, Marina Sanchez, Mārcis Pinnis, Gorka Labaka, Maite Melero

In this work, we present the work that has been carried out in the MT4All CEF project and the resources that it has generated by leveraging recent research carried out in the field of unsupervised learning.

Translation Unsupervised Machine Translation

Ixamed’s submission description for WMT20 Biomedical shared task: benefits and limitations of using terminologies for domain adaptation

no code implementations WMT (EMNLP) 2020 Xabier Soto, Olatz Perez-de-Viñaspre, Gorka Labaka, Maite Oronoz

Regarding the techniques used, we build on the findings from our previous work on translating clinical texts into Basque, making use of clinical terminology to adapt the MT systems to the clinical domain.

Domain Adaptation Machine Translation

Comparing and combining tagging with different decoding algorithms for back-translation in NMT: learnings from a low resource scenario

no code implementations EAMT 2022 Xabier Soto, Olatz Perez-de-Viñaspre, Gorka Labaka, Maite Oronoz

Recently, diverse approaches have been proposed to get better automatic evaluation results of NMT models using back-translation, including the use of sampling instead of beam search as decoding algorithm for creating the synthetic corpus.

Machine Translation NMT

BasqueParl: A Bilingual Corpus of Basque Parliamentary Transcriptions

1 code implementation LREC 2022 Nayla Escribano, Jon Ander González, Julen Orbegozo-Terradillos, Ainara Larrondo-Ureta, Simón Peña-Fernández, Olatz Perez-de-Viñaspre, Rodrigo Agerri

Parliamentary transcripts provide a valuable resource to understand the reality and know about the most important facts that occur over time in our societies.

Does Corpus Quality Really Matter for Low-Resource Languages?

no code implementations 15 Mar 2022 Mikel Artetxe, Itziar Aldabe, Rodrigo Agerri, Olatz Perez-de-Viñaspre, Aitor Soroa

For instance, 66% of documents are rated as high-quality for EusCrawl, in contrast with <33% for both mC4 and CC100.

Representation Learning
