Edinburgh Neural Machine Translation Systems for WMT 16

WS 2016 · Rico Sennrich, Barry Haddow, Alexandra Birch

We participated in the WMT 2016 shared news translation task by building neural translation systems for four language pairs, each trained in both directions: English<->Czech, English<->German, English<->Romanian and English<->Russian. Our systems are based on an attentional encoder-decoder, using BPE subword segmentation for open-vocabulary translation with a fixed vocabulary. We experimented with using automatic back-translations of the monolingual News corpus as additional training data, pervasive dropout, and target-bidirectional models. All reported methods give substantial improvements, and we see improvements of 4.3--11.2 BLEU over our baseline systems. In the human evaluation, our systems were the (tied) best constrained system for 7 out of 8 translation directions in which we participated.
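The open-vocabulary component of these systems is byte-pair encoding (BPE): starting from a character-level segmentation of the training vocabulary, the most frequent adjacent symbol pair is repeatedly merged into a new symbol, and the learned merge operations are later applied to segment any word, including unseen ones, into known subword units. A minimal sketch of the merge-learning step is below; it assumes a word-frequency dictionary as input and an end-of-word marker `</w>`, and all function names are illustrative rather than taken from the authors' released tooling.

```python
import re
from collections import Counter

def get_pair_counts(vocab):
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for i in range(len(symbols) - 1):
            pairs[(symbols[i], symbols[i + 1])] += freq
    return pairs

def merge_pair(pair, vocab):
    """Replace every occurrence of the symbol pair with its concatenation."""
    bigram = re.escape(' '.join(pair))
    # Match the pair only at token boundaries (not inside other symbols).
    pattern = re.compile(r'(?<!\S)' + bigram + r'(?!\S)')
    return {pattern.sub(''.join(pair), word): freq
            for word, freq in vocab.items()}

def learn_bpe(word_freqs, num_merges):
    """Learn up to `num_merges` merge operations from a word-frequency dict."""
    # Initialise each word as space-separated characters plus an
    # end-of-word marker, so merges cannot cross word boundaries.
    vocab = {' '.join(w) + ' </w>': f for w, f in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pairs = get_pair_counts(vocab)
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # greedily pick most frequent pair
        vocab = merge_pair(best, vocab)
        merges.append(best)
    return merges

# Toy corpus: with these frequencies, frequent suffixes such as 'est'
# emerge as subword units within the first few merges.
merges = learn_bpe({'low': 5, 'lower': 2, 'newest': 6, 'widest': 3}, 10)
```

At translation time the learned merge list is applied in order to each word of the input, so the network operates over a fixed subword vocabulary while still being able to represent arbitrary words.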

| Task                | Dataset                   | Model                             | Metric     | Value | Global Rank |
|---------------------|---------------------------|-----------------------------------|------------|-------|-------------|
| Machine Translation | WMT2016 Czech-English     | Attentional encoder-decoder + BPE | BLEU score | 31.4  | #1          |
| Machine Translation | WMT2016 English-Czech     | Attentional encoder-decoder + BPE | BLEU score | 25.8  | #1          |
| Machine Translation | WMT2016 English-German    | Attentional encoder-decoder + BPE | BLEU score | 34.2  | #2          |
| Machine Translation | WMT2016 English-Romanian  | BiGRU                             | BLEU score | 28.1  | #13         |
| Machine Translation | WMT2016 English-Russian   | Attentional encoder-decoder + BPE | BLEU score | 26.0  | #1          |
| Machine Translation | WMT2016 German-English    | Attentional encoder-decoder + BPE | BLEU score | 38.6  | #2          |
| Machine Translation | WMT2016 Romanian-English  | Attentional encoder-decoder + BPE | BLEU score | 33.3  | #4          |
| Machine Translation | WMT2016 Russian-English   | Attentional encoder-decoder + BPE | BLEU score | 28.0  | #1          |

Methods