Understanding Back-Translation at Scale

EMNLP 2018 · Sergey Edunov • Myle Ott • Michael Auli • David Grangier

An effective method to improve neural machine translation with monolingual data is to augment the parallel training corpus with back-translations of target-language sentences. This work broadens the understanding of back-translation and investigates a number of methods for generating synthetic source sentences. We find that in all but resource-poor settings, back-translations obtained via sampling or noised beam outputs are most effective.
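As a concrete illustration of the "noised beam" variant, below is a minimal Python sketch of the noising scheme the paper applies to back-translated source sentences: delete tokens with probability 0.1, replace tokens with a filler token with probability 0.1, and locally permute tokens so that none moves more than three positions. The function name, filler symbol, and implementation details are illustrative assumptions, not the authors' code.

```python
import random

def noise_tokens(tokens, p_drop=0.1, p_blank=0.1, max_shift=3, filler="<BLANK>"):
    """Apply the three noise types used for noised-beam back-translation:
    token deletion, filler-token replacement, and a local permutation in
    which no token is displaced by more than `max_shift` positions.
    (Sketch only; parameter names and the filler symbol are assumptions.)"""
    # Delete each token with probability p_drop.
    kept = [t for t in tokens if random.random() >= p_drop]
    # Replace each surviving token with the filler token with probability p_blank.
    kept = [filler if random.random() < p_blank else t for t in kept]
    # Local shuffle: add uniform noise in [0, max_shift] to each index and sort,
    # which bounds how far any token can move.
    keys = [i + random.uniform(0, max_shift) for i in range(len(kept))]
    return [t for _, t in sorted(zip(keys, kept))]

# Example: noise a back-translated source sentence before adding the
# (noised synthetic source, original target) pair to the training corpus.
print(noise_tokens("the quick brown fox jumps over the lazy dog".split()))
```

The same pipeline covers the sampling variant by drawing synthetic sources from the reverse model with sampling instead of beam search, in which case no extra noising step is applied.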


Evaluation


Task                  Dataset                  Model                  Metric        Metric value   Global rank
Machine Translation   WMT2014 English-French   Transformer Big + BT   BLEU score    45.6           # 1
Machine Translation   WMT2014 English-German   Transformer Big + BT   BLEU score    35.0           # 1