Findings of the WMT 2019 Shared Task on Parallel Corpus Filtering for Low-Resource Conditions

ACL 2019 Philipp KoehnFrancisco Guzm{\'a}nVishrav ChaudharyJuan Pino

Following the WMT 2018 Shared Task on Parallel Corpus Filtering, we posed the challenge of assigning sentence-level quality scores for very noisy corpora of sentence pairs crawled from the web, with the goal of sub-selecting 2{\%} and 10{\%} of the highest-quality data to be used to train machine translation systems. This year, the task tackled the low resource condition of Nepali-English and Sinhala-English... (read more)

PDF Abstract

Code


No code implementations yet. Submit your code now

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.