Building a Better Bitext for Structurally Different Languages through Self-training

WS 2017  ·  Jungyeul Park, Loïc Dugast, Jeen-Pyo Hong, Chang-Uk Shin, Jeong-Won Cha

We propose a novel method to bootstrap the construction of parallel corpora for new pairs of structurally different languages. We do so by combining the use of a pivot language and self-training. A pivot language enables the use of existing translation models to bootstrap the alignment, and a self-training procedure enables better alignment at both the document and the sentence level. We also propose several evaluation methods for the resulting alignment.
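The core idea — using a pivot language to project source sentences into a space where they can be matched against the target, then harvesting confident alignments to improve the next round — can be illustrated with a minimal, self-contained sketch. All names, the token-overlap similarity, and the lexicon-harvesting rule below are illustrative assumptions, not the paper's actual implementation:

```python
def pivot_translate(sentence, lexicon):
    """Map source tokens into the pivot language via a seed lexicon.

    Unknown tokens pass through unchanged.
    """
    return [lexicon.get(tok, tok) for tok in sentence.split()]

def overlap_score(a_tokens, b_tokens):
    """Dice-coefficient token overlap between two token lists."""
    a, b = set(a_tokens), set(b_tokens)
    if not a or not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))

def self_train_align(src_sents, tgt_sents, seed_lexicon,
                     rounds=2, threshold=0.5):
    """Iteratively align sentences; confident pairs extend the lexicon.

    Each round: (1) project source sentences through the pivot lexicon,
    (2) keep best-scoring target matches above the threshold,
    (3) self-training step: harvest new token pairs from confident
        alignments to grow the lexicon for the next round.
    """
    lexicon = dict(seed_lexicon)
    aligned = []
    for _ in range(rounds):
        aligned = []
        for s in src_sents:
            pivot = pivot_translate(s, lexicon)
            best = max(tgt_sents,
                       key=lambda t: overlap_score(pivot, t.split()))
            score = overlap_score(pivot, best.split())
            if score >= threshold:
                aligned.append((s, best, score))
        # Harvest: if exactly one pivot token and one target token are
        # unmatched in a confident pair, assume they translate each other.
        for s, t, _ in aligned:
            pivot = pivot_translate(s, lexicon)
            tgt_toks = t.split()
            residual_src = [tok for tok in pivot if tok not in tgt_toks]
            residual_tgt = [tok for tok in tgt_toks if tok not in pivot]
            if len(residual_src) == 1 and len(residual_tgt) == 1:
                # Record the original source token, not its pivot form.
                src_tok = s.split()[pivot.index(residual_src[0])]
                lexicon.setdefault(src_tok, residual_tgt[0])
    return aligned, lexicon
```

On a toy French–English pair with the seed lexicon `{"chat": "cat", "chien": "dog"}`, the first round aligns `"chat noir"` with `"black cat"` at a modest score, the harvesting step learns `noir → black`, and the second round re-scores the same pair at 1.0 — a small-scale analogue of how self-training can sharpen an initially weak pivot-based alignment.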



