Neural Machine Translation by Jointly Learning to Align and Translate

1 Sep 2014 • Dzmitry Bahdanau • Kyunghyun Cho • Yoshua Bengio

Neural machine translation is a recently proposed approach to machine translation. The models proposed recently for neural machine translation often belong to a family of encoder-decoders, in which an encoder encodes a source sentence into a fixed-length vector from which a decoder generates a translation. In this paper, we conjecture that the use of a fixed-length vector is a bottleneck in improving the performance of this basic encoder-decoder architecture, and propose to extend this by allowing a model to automatically (soft-)search for parts of a source sentence that are relevant to predicting a target word, without having to form these parts as a hard segment explicitly. Furthermore, qualitative analysis reveals that the (soft-)alignments found by the model agree well with our intuition.
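The (soft-)search described in the abstract can be illustrated with a minimal sketch of one decoding step. It assumes an additive scoring function of the form v · tanh(W s + U h_j), normalized with a softmax over source positions. The names `soft_align`, `W`, `U`, `v` and the toy values are hypothetical, chosen for illustration only, and are not taken from this page.

```python
import math

def softmax(scores):
    # Subtract the maximum for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(matrix, vector):
    # Plain matrix-vector product on nested lists.
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def soft_align(decoder_state, annotations, W, U, v):
    """Return (weights, context) for one decoding step.

    annotations: one vector per source position (encoder outputs).
    weights:     soft alignment over source positions, sums to 1.
    context:     weighted average of the annotations.
    """
    projected_state = matvec(W, decoder_state)
    scores = []
    for h in annotations:
        projected_h = matvec(U, h)
        hidden = [math.tanh(a + b) for a, b in zip(projected_state, projected_h)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))
    weights = softmax(scores)
    dim = len(annotations[0])
    context = [sum(w * h[k] for w, h in zip(weights, annotations))
               for k in range(dim)]
    return weights, context

# Toy values (assumed, not from the paper): three source positions, dimension 2.
annotations = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [0.5, -0.5]
W = [[1.0, 0.0], [0.0, 1.0]]
U = [[1.0, 0.0], [0.0, 1.0]]
v = [1.0, 1.0]

weights, context = soft_align(decoder_state, annotations, W, U, v)
```

Because the context vector is recomputed at every target word from all source annotations, the decoder no longer depends on a single fixed-length summary of the sentence, and no hard segmentation of the source is needed.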

Evaluation


| Task | Dataset | Model | Metric name | Metric value | Global rank |
|---|---|---|---|---|---|
| Machine Translation | IWSLT2015 German-English | Bi-GRU (MLE+SLE) | BLEU score | 28.53 | #13 |
| Machine Translation | WMT2014 English-French | RNN-search50* | BLEU score | 36.15 | #19 |