A3-108 Machine Translation System for LoResMT Shared Task @MT Summit 2021 Conference

MTSummit 2021 · Saumitra Yadav, Manish Shrivastava

In this paper, we describe our submissions to the LoResMT Shared Task @MT Summit 2021 Conference. We built statistical translation systems in each direction for the English ⇐⇒ Marathi language pair. The paper outlines initial baseline experiments in which models are trained with various tokenization schemes. Using the best-performing tokenization scheme, we create synthetic data and train further statistical models on the augmented dataset. We also reorder English to match Marathi syntax and train another set of baseline and data-augmented models, again using various tokenization schemes. We report the configurations of the submitted systems and the results they produced.
