Search Results for author: Sergio Ortiz Rojas

Found 7 papers, 3 papers with code

Bicleaner AI: Bicleaner Goes Neural

1 code implementation LREC 2022 Jaume Zaragoza-Bernabeu, Gema Ramírez-Sánchez, Marta Bañón, Sergio Ortiz Rojas

This paper describes the experiments carried out during the development of the latest version of Bicleaner, named Bicleaner AI, a tool that aims at detecting noisy sentences in parallel corpora.

Binary Classification, Machine Translation +2

Human evaluation of web-crawled parallel corpora for machine translation

no code implementations HumEval (ACL) 2022 Gema Ramírez-Sánchez, Marta Bañón, Jaume Zaragoza-Bernabeu, Sergio Ortiz Rojas

Quality assessment has been an ongoing activity of the series of ParaCrawl efforts to crawl massive amounts of parallel data from multilingual websites for 29 languages.

Machine Translation, Translation

Producing Monolingual and Parallel Web Corpora at the Same Time - SpiderLing and Bitextor's Love Affair

no code implementations LREC 2016 Nikola Ljubešić, Miquel Esplà-Gomis, Antonio Toral, Sergio Ortiz Rojas, Filip Klubička

This paper presents an approach for building large monolingual corpora and, at the same time, extracting parallel data by crawling the top-level domain of a given language of interest.
