The Evolved Transformer

30 Jan 2019 • David R. So • Chen Liang • Quoc V. Le

Recent works have highlighted the strengths of the Transformer architecture for dealing with sequence tasks. At the same time, neural architecture search has advanced to the point where it can outperform human-designed models. The goal of this work is to use architecture search to find a better Transformer architecture.

Evaluation


Task                | Dataset                | Model                    | Metric name | Metric value | Global rank
--------------------|------------------------|--------------------------|-------------|--------------|------------
Machine Translation | WMT2014 English-French | Evolved Transformer Base | BLEU score  | 40.6         | #10
Machine Translation | WMT2014 English-French | Evolved Transformer Big  | BLEU score  | 41.3         | #7
Machine Translation | WMT2014 English-German | Evolved Transformer Base | BLEU score  | 28.4         | #11
Machine Translation | WMT2014 English-German | Evolved Transformer Big  | BLEU score  | 29.3         | #5