Edinburgh Neural Machine Translation Systems for WMT 16

WMT 2016 • Rico Sennrich • Barry Haddow • Alexandra Birch

We participated in the WMT 2016 shared news translation task by building neural translation systems for four language pairs, each trained in both directions: English↔Czech, English↔German, English↔Romanian and English↔Russian. Our systems are based on an attentional encoder-decoder, using BPE subword segmentation for open-vocabulary translation with a fixed vocabulary. We experimented with using automatic back-translations of the monolingual News corpus as additional training data, pervasive dropout, and target-bidirectional models.
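The BPE subword segmentation at the core of these systems learns its fixed vocabulary by repeatedly merging the most frequent pair of adjacent symbols in the training data; at translation time, the learned merges split any word, including unseen ones, into known subword units. Below is a minimal Python sketch of the merge-learning loop, following the algorithm from the authors' companion paper on subword units (Sennrich et al., 2016); the toy word counts and the merge budget are illustrative only, not the paper's actual settings:

```python
import re
import collections

def get_stats(vocab):
    """Count frequencies of adjacent symbol pairs, weighted by word frequency."""
    pairs = collections.defaultdict(int)
    for word, freq in vocab.items():
        symbols = word.split()
        for i in range(len(symbols) - 1):
            pairs[symbols[i], symbols[i + 1]] += freq
    return pairs

def merge_vocab(pair, v_in):
    """Replace every occurrence of `pair` with a single merged symbol."""
    bigram = re.escape(' '.join(pair))
    pattern = re.compile(r'(?<!\S)' + bigram + r'(?!\S)')
    return {pattern.sub(''.join(pair), word): freq for word, freq in v_in.items()}

# Toy corpus: words split into characters, with '</w>' marking the word end.
vocab = {'l o w </w>': 5, 'l o w e r </w>': 2,
         'n e w e s t </w>': 6, 'w i d e s t </w>': 3}
num_merges = 10  # illustrative; real systems learn far more merges
for _ in range(num_merges):
    pairs = get_stats(vocab)
    best = max(pairs, key=pairs.get)  # most frequent adjacent pair
    vocab = merge_vocab(best, vocab)
    print(best)
```

In practice the merge list is learned from the full training corpus with tens of thousands of operations; the ordered merge list itself is the segmentation model, applied greedily to both training and test text so the network only ever sees symbols from the fixed vocabulary.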

Evaluation


Task                 Dataset                    Model                              Metric      Value  Global rank
Machine Translation  WMT2016 Czech-English      Attentional encoder-decoder + BPE  BLEU score  31.4   #1
Machine Translation  WMT2016 English-Czech      Attentional encoder-decoder + BPE  BLEU score  25.8   #1
Machine Translation  WMT2016 English-German     Attentional encoder-decoder + BPE  BLEU score  34.2   #1
Machine Translation  WMT2016 English-Romanian   BiGRU                              BLEU score  28.1   #5
Machine Translation  WMT2016 English-Russian    Attentional encoder-decoder + BPE  BLEU score  26.0   #1
Machine Translation  WMT2016 German-English     Attentional encoder-decoder + BPE  BLEU score  38.6   #1
Machine Translation  WMT2016 Romanian-English   Attentional encoder-decoder + BPE  BLEU score  33.3   #2
Machine Translation  WMT2016 Russian-English    Attentional encoder-decoder + BPE  BLEU score  28.0   #1