Unsupervised Machine Translation

28 papers with code • 9 benchmarks • 4 datasets

Unsupervised machine translation is the task of doing machine translation without any translation resources at training time.

( Image credit: Phrase-Based & Neural Unsupervised Machine Translation )


Use these libraries to find Unsupervised Machine Translation models and implementations

Most implemented papers

Language Models are Few-Shot Learners

openai/gpt-3 NeurIPS 2020

By contrast, humans can generally perform a new language task from only a few examples or from simple instructions - something which current NLP systems still largely struggle to do.

Word Translation Without Parallel Data

facebookresearch/MUSE ICLR 2018

We finally describe experiments on the English-Esperanto low-resource language pair, on which there only exists a limited amount of parallel data, to show the potential impact of our method in fully unsupervised machine translation.

Unsupervised Machine Translation Using Monolingual Corpora Only

facebookresearch/MUSE ICLR 2018

By learning to reconstruct in both languages from this shared feature space, the model effectively learns to translate without using any labeled data.

Phrase-Based & Neural Unsupervised Machine Translation

facebookresearch/UnsupervisedMT EMNLP 2018

Machine translation systems achieve near human-level performance on some languages, yet their effectiveness strongly relies on the availability of large amounts of parallel sentences, which hinders their applicability to the majority of language pairs.

Cross-lingual Language Model Pretraining

huggingface/transformers NeurIPS 2019

On unsupervised machine translation, we obtain 34. 3 BLEU on WMT'16 German-English, improving the previous state of the art by more than 9 BLEU.

Unsupervised Translation of Programming Languages

facebookresearch/CodeGen NeurIPS 2020

We train our model on source code from open source GitHub projects, and show that it can translate functions between C++, Java, and Python with high accuracy.

MASS: Masked Sequence to Sequence Pre-training for Language Generation

microsoft/MASS 7 May 2019

Pre-training and fine-tuning, e. g., BERT, have achieved great success in language understanding by transferring knowledge from rich-resource pre-training task to the low/zero-resource downstream tasks.

Multilingual Denoising Pre-training for Neural Machine Translation

pytorch/fairseq 22 Jan 2020

This paper demonstrates that multilingual denoising pre-training produces significant performance gains across a wide variety of machine translation (MT) tasks.

A Probabilistic Formulation of Unsupervised Text Style Transfer

cindyxinyiwang/deep-latent-sequence-model ICLR 2020

Across all style transfer tasks, our approach yields substantial gains over state-of-the-art non-generative baselines, including the state-of-the-art unsupervised machine translation techniques that our approach generalizes.

Unsupervised Statistical Machine Translation

artetxem/vecmap EMNLP 2018

While modern machine translation has relied on large parallel corpora, a recent line of work has managed to train Neural Machine Translation (NMT) systems from monolingual corpora only (Artetxe et al., 2018c; Lample et al., 2018).