Unsupervised Machine Translation
28 papers with code • 9 benchmarks • 4 datasets
Unsupervised machine translation is the task of doing machine translation without any translation resources at training time.
(Image credit: Phrase-Based & Neural Unsupervised Machine Translation)
A common approach encodes sentences from both languages into a shared feature space; by learning to reconstruct in both languages from this shared space, a model effectively learns to translate without using any labeled data.
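As a rough illustration of that idea, the sketch below combines the two losses typically used in this setting: denoising autoencoding (reconstruct a sentence from a corrupted version of itself) and on-the-fly back-translation (translate with the current model, then learn to translate back). Everything here is a placeholder, not any paper's actual implementation: the tiny shared encoder, the per-language decoders, and the `add_noise` corruption are toy stand-ins for a real seq2seq model.

```python
import torch
import torch.nn as nn

# Toy stand-ins for a real seq2seq model: one shared encoder, one decoder
# per language. Sizes and module choices are illustrative only.
VOCAB, DIM = 1000, 64
shared_encoder = nn.Embedding(VOCAB, DIM)
decoders = nn.ModuleDict({"en": nn.Linear(DIM, VOCAB),
                          "fr": nn.Linear(DIM, VOCAB)})
loss_fn = nn.CrossEntropyLoss()

def add_noise(tokens):
    """Corrupt the input (here: randomly blank tokens) so reconstruction
    is non-trivial; real systems also drop and shuffle words."""
    keep = torch.rand(tokens.shape) > 0.1
    return tokens * keep  # token id 0 acts as a <blank> placeholder

def denoising_loss(tokens, lang):
    """Reconstruct a sentence in its own language from a noisy version."""
    hidden = shared_encoder(add_noise(tokens))   # map into the shared space
    logits = decoders[lang](hidden)
    return loss_fn(logits.view(-1, VOCAB), tokens.view(-1))

def back_translation_loss(tokens, src, tgt):
    """Translate src -> tgt with the current model to make a pseudo-parallel
    pair, then train to translate it back to the original sentence."""
    with torch.no_grad():  # the pseudo-target is treated as fixed data
        pseudo = decoders[tgt](shared_encoder(tokens)).argmax(-1)
    logits = decoders[src](shared_encoder(pseudo))
    return loss_fn(logits.view(-1, VOCAB), tokens.view(-1))

# One toy step: monolingual batches only, no parallel data anywhere.
en_batch = torch.randint(1, VOCAB, (8, 16))
fr_batch = torch.randint(1, VOCAB, (8, 16))
loss = (denoising_loss(en_batch, "en") + denoising_loss(fr_batch, "fr")
        + back_translation_loss(en_batch, "en", "fr")
        + back_translation_loss(fr_batch, "fr", "en"))
loss.backward()
```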
Machine translation systems achieve near human-level performance for some language pairs, yet their effectiveness relies heavily on the availability of large amounts of parallel sentences, which limits their applicability to the majority of language pairs.
Pre-training and fine-tuning, e.g., BERT, have achieved great success in language understanding by transferring knowledge from rich-resource pre-training tasks to low/zero-resource downstream tasks.
This paper demonstrates that multilingual denoising pre-training produces significant performance gains across a wide variety of machine translation (MT) tasks.
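As a concrete, hedged example of using such a multilingual denoising pre-trained model, the snippet below loads an mBART-50 translation checkpoint through the Hugging Face transformers library and translates one sentence. The checkpoint name, language codes, and API calls follow the library's documented usage; the input sentence is just an example.

```python
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

# Load an mBART checkpoint fine-tuned for many-to-many translation.
model = MBartForConditionalGeneration.from_pretrained(
    "facebook/mbart-large-50-many-to-many-mmt")
tokenizer = MBart50TokenizerFast.from_pretrained(
    "facebook/mbart-large-50-many-to-many-mmt")

tokenizer.src_lang = "en_XX"  # tell the tokenizer the source language
inputs = tokenizer("Unsupervised machine translation needs no parallel data.",
                   return_tensors="pt")

# Force the decoder to start generating in the target language (French here).
generated = model.generate(
    **inputs, forced_bos_token_id=tokenizer.lang_code_to_id["fr_XX"])
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```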
Across all style transfer tasks, our approach yields substantial gains over state-of-the-art non-generative baselines, including the strongest unsupervised machine translation techniques, which our approach generalizes.