Edinburgh Neural Machine Translation Systems for WMT 16

WS 2016 · Rico Sennrich, Barry Haddow, Alexandra Birch

We participated in the WMT 2016 shared news translation task by building neural translation systems for four language pairs, each trained in both directions: English<->Czech, English<->German, English<->Romanian and English<->Russian. Our systems are based on an attentional encoder-decoder, using BPE subword segmentation for open-vocabulary translation with a fixed vocabulary. We experimented with using automatic back-translations of the monolingual News corpus as additional training data, pervasive dropout, and target-bidirectional models. All reported methods give substantial improvements, and we see improvements of 4.3--11.2 BLEU over our baseline systems. In the human evaluation, our systems were the (tied) best constrained system for 7 out of 8 translation directions in which we participated.
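The BPE subword segmentation mentioned above learns a vocabulary of subword units by repeatedly merging the most frequent adjacent symbol pair in the training corpus. The following is a minimal illustrative sketch of that merge-learning loop, not the authors' released implementation; the toy word-frequency dictionary and the `</w>` end-of-word marker follow the standard textbook presentation of the algorithm.

```python
from collections import Counter

def learn_bpe(word_freqs, num_merges):
    """Learn BPE merge operations from a {word: frequency} dict.

    Each word is split into characters plus an end-of-word marker;
    the most frequent adjacent symbol pair is merged at each step.
    """
    vocab = {tuple(word) + ("</w>",): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        # Count all adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Apply the new merge to every word in the vocabulary.
        new_vocab = {}
        for symbols, freq in vocab.items():
            out, i = [], 0
            while i < len(symbols):
                if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            new_vocab[tuple(out)] = freq
        vocab = new_vocab
    return merges

# Toy corpus: the learned merges build up frequent fragments like "es", "est".
merges = learn_bpe({"low": 5, "lower": 2, "newest": 6, "widest": 3}, 10)
```

At translation time, the learned merge list is applied greedily to segment any word, including words unseen in training, into known subword units, which is what allows open-vocabulary translation with a fixed vocabulary.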

| Task                | Dataset                   | Model                             | Metric     | Value | Global Rank |
|---------------------|---------------------------|-----------------------------------|------------|-------|-------------|
| Machine Translation | WMT2016 Czech-English     | Attentional encoder-decoder + BPE | BLEU score | 31.4  | # 1         |
| Machine Translation | WMT2016 English-Czech     | Attentional encoder-decoder + BPE | BLEU score | 25.8  | # 1         |
| Machine Translation | WMT2016 English-German    | Attentional encoder-decoder + BPE | BLEU score | 34.2  | # 2         |
| Machine Translation | WMT2016 English-Romanian  | BiGRU                             | BLEU score | 28.1  | # 13        |
| Machine Translation | WMT2016 English-Russian   | Attentional encoder-decoder + BPE | BLEU score | 26.0  | # 1         |
| Machine Translation | WMT2016 German-English    | Attentional encoder-decoder + BPE | BLEU score | 38.6  | # 3         |
| Machine Translation | WMT2016 Romanian-English  | Attentional encoder-decoder + BPE | BLEU score | 33.3  | # 5         |
| Machine Translation | WMT2016 Russian-English   | Attentional encoder-decoder + BPE | BLEU score | 28.0  | # 1         |