Unsupervised Machine Translation

7 papers with code · Natural Language Processing
Subtask of Machine Translation

Unsupervised machine translation is the task of performing machine translation without any translation resources (parallel corpora or bilingual dictionaries) at training time, learning instead from monolingual corpora alone.

Greatest papers with code

Unsupervised Machine Translation Using Monolingual Corpora Only

ICLR 2018 facebookresearch/MUSE

Machine translation has recently achieved impressive performance thanks to advances in deep learning and the availability of large-scale parallel corpora. This paper instead maps sentences from monolingual corpora in two languages into a shared latent space; by learning to reconstruct in both languages from this shared feature space, the model effectively learns to translate without using any labeled data.

UNSUPERVISED MACHINE TRANSLATION
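A reconstruction objective of this kind is only informative if the input is corrupted first; otherwise the model can simply learn to copy. Denoising setups in this line of work typically combine word dropout with a local shuffle. A minimal sketch of such a noise function (the function name and parameters are illustrative, not the paper's code):

```python
import random

def add_noise(tokens, drop_prob=0.1, shuffle_k=3, seed=0):
    # Word dropout: each token is removed with probability `drop_prob`.
    rng = random.Random(seed)
    kept = [t for t in tokens if rng.random() > drop_prob]
    # Local shuffle: jitter each position by up to `shuffle_k` and re-sort,
    # so no token moves far from its original place.
    keyed = sorted(enumerate(kept), key=lambda p: p[0] + rng.uniform(0, shuffle_k))
    return [tok for _, tok in keyed]
```

The model is then trained to reconstruct the clean sentence from its noisy version, in both languages, from the shared feature space.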

Word Translation Without Parallel Data

ICLR 2018 facebookresearch/MUSE

State-of-the-art methods for learning cross-lingual word embeddings have relied on bilingual dictionaries or parallel corpora. This work instead aligns monolingual word embedding spaces without any cross-lingual supervision, and describes experiments on the low-resource English-Esperanto language pair, for which only a limited amount of parallel data exists, to show the potential impact of the method in fully unsupervised machine translation.

UNSUPERVISED MACHINE TRANSLATION WORD EMBEDDINGS
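Alignments of this kind are typically refined with an orthogonal Procrustes step: given a synthetic dictionary pairing source embeddings X with target embeddings Y row by row, the best orthogonal map has a closed form via SVD. A minimal NumPy sketch (the function name and shapes are illustrative, not MUSE's API):

```python
import numpy as np

def procrustes(X, Y):
    # Orthogonal Procrustes: find orthogonal W minimizing ||X W - Y||_F.
    # Closed form: if X^T Y = U S V^T (SVD), then W = U V^T.
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt
```

Because W is constrained to be orthogonal, the mapping preserves distances within the source embedding space while rotating it onto the target space.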

Phrase-Based & Neural Unsupervised Machine Translation

EMNLP 2018 facebookresearch/UnsupervisedMT

Machine translation systems achieve near human-level performance on some languages, yet their effectiveness strongly relies on the availability of large amounts of parallel sentences, which hinders their applicability to the majority of language pairs. On low-resource languages like English-Urdu and English-Romanian, our methods achieve even better results than semi-supervised and supervised approaches leveraging the paucity of available bitexts.

UNSUPERVISED MACHINE TRANSLATION

Unsupervised Neural Machine Translation

ICLR 2018 rsennrich/subword-nmt

In spite of the recent success of neural machine translation (NMT) in standard benchmarks, the lack of large parallel corpora poses a major practical problem for many language pairs. There have been several proposals to alleviate this issue with, for instance, triangulation and semi-supervised learning techniques, but they still require a strong cross-lingual signal.

UNSUPERVISED MACHINE TRANSLATION

Cross-lingual Language Model Pretraining

22 Jan 2019 facebookresearch/XLM

On unsupervised machine translation, we obtain 34.3 BLEU on WMT'16 German-English, improving the previous state of the art by more than 9 BLEU. On supervised machine translation, we obtain a new state of the art of 38.5 BLEU on WMT'16 Romanian-English, outperforming the previous best approach by more than 4 BLEU.

LANGUAGE MODELLING UNSUPERVISED MACHINE TRANSLATION

Unsupervised Statistical Machine Translation

EMNLP 2018 artetxem/vecmap

While modern machine translation has relied on large parallel corpora, a recent line of work has managed to train Neural Machine Translation (NMT) systems from monolingual corpora only (Artetxe et al., 2018c; Lample et al., 2018). Despite the potential of this approach for low-resource settings, existing systems are far behind their supervised counterparts, limiting their practical interest.

LANGUAGE MODELLING UNSUPERVISED MACHINE TRANSLATION

Unsupervised Neural Machine Translation with SMT as Posterior Regularization

14 Jan 2019 Imagist-Shuo/UNMT-SPR

With no real bilingual corpus available, unsupervised Neural Machine Translation (NMT) typically relies on pseudo-parallel data generated by back-translation for model training, and that data is noisy. To address this issue, the authors introduce phrase-based Statistical Machine Translation (SMT) models, which are robust to noisy data, as posterior regularizations to guide the training of unsupervised NMT models during the iterative back-translation process.

UNSUPERVISED MACHINE TRANSLATION
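The iterative back-translation process that several of the entries above rely on can be sketched as a simple loop: each direction's model generates pseudo-parallel data from monolingual text, which is then used to retrain the reverse direction. A toy Python sketch with stand-in `translate` and `train` callables (all names hypothetical, not any of the papers' code):

```python
def iterative_back_translation(mono_src, mono_tgt, translate, train, rounds=2):
    # Hypothetical stand-ins: `translate(model, sentence)` decodes with a
    # model (None = an initial word-by-word or SMT system), and
    # `train(pairs)` fits a model on (input, output) sentence pairs.
    s2t, t2s = None, None
    for _ in range(rounds):
        # Back-translate target monolingual text into the source language,
        # giving pseudo-parallel (source, target) pairs for the s2t model.
        pseudo_src = [translate(t2s, y) for y in mono_tgt]
        s2t = train(list(zip(pseudo_src, mono_tgt)))
        # Symmetrically, create (target, source) pairs for the t2s model.
        pseudo_tgt = [translate(s2t, x) for x in mono_src]
        t2s = train(list(zip(pseudo_tgt, mono_src)))
    return s2t, t2s
```

In practice the initial models come from unsupervised embedding alignment or SMT phrase tables, and each round's pseudo-data improves as the models do.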