An effective method to improve neural machine translation with monolingual data is to augment the parallel training corpus with back-translations of target-language sentences. This work broadens the understanding of back-translation and investigates a number of methods to generate synthetic source sentences. We find that in all but resource-poor settings, back-translations obtained via sampling or noised beam outputs are most effective.
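The augmentation described above can be illustrated with a minimal sketch. It assumes a hypothetical `back_translate` callable standing in for a trained target-to-source model, and it noises that model's output by dropping words, replacing words with a filler token, and shuffling words locally. The function names, the `<BLANK>` filler, and the default probabilities and shift range are illustrative assumptions, not details stated in the text above.

```python
import random


def add_noise(tokens, rng, p_drop=0.1, p_blank=0.1, max_shift=3, filler="<BLANK>"):
    """Noise a token list: drop words, blank words, then shuffle locally.

    All default values are illustrative assumptions.
    """
    # Drop each token independently with probability p_drop.
    kept = [t for t in tokens if rng.random() >= p_drop]
    # Replace each surviving token with a filler with probability p_blank.
    blanked = [filler if rng.random() < p_blank else t for t in kept]
    # Local shuffle: sort by position plus a random offset in [0, max_shift + 1),
    # so that no token moves more than max_shift positions.
    keys = [i + rng.random() * (max_shift + 1) for i in range(len(blanked))]
    order = sorted(range(len(blanked)), key=keys.__getitem__)
    return [blanked[i] for i in order]


def augment_with_back_translation(parallel_pairs, target_monolingual, back_translate, rng):
    """Return the parallel corpus plus (synthetic source, real target) pairs.

    `back_translate` is a hypothetical callable mapping a target-language
    sentence to a source-language sentence.
    """
    synthetic = []
    for target in target_monolingual:
        source_tokens = add_noise(back_translate(target).split(), rng)
        synthetic.append((" ".join(source_tokens), target))
    return list(parallel_pairs) + synthetic
```

In this sketch only the synthetic source side is noised; the target side stays as the original monolingual sentence, so the forward model still trains toward clean output.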
|Task|Dataset|Model|Metric name|Metric value|Global rank|
|---|---|---|---|---|---|
|Machine Translation|WMT2014 English-French|Transformer Big + BT|BLEU score|45.6|# 1|
|Machine Translation|WMT2014 English-German|Transformer Big + BT|BLEU score|35.0|# 1|