The Evolved Transformer

30 Jan 2019 · David R. So, Chen Liang, Quoc V. Le

Recent works have highlighted the strength of the Transformer architecture on sequence tasks while, at the same time, neural architecture search (NAS) has begun to outperform human-designed models. Our goal is to apply NAS to search for a better alternative to the Transformer...

Evaluation Results from the Paper


| Task | Dataset | Model | Metric | Value | Global rank |
|---|---|---|---|---|---|
| Machine Translation | WMT2014 English-French | Evolved Transformer Base | BLEU score | 40.6 | #13 |
| Machine Translation | WMT2014 English-French | Evolved Transformer Big | BLEU score | 41.3 | #11 |
| Machine Translation | WMT2014 English-German | Evolved Transformer Base | BLEU score | 28.4 | #16 |
| Machine Translation | WMT2014 English-German | Evolved Transformer Big | BLEU score | 29.3 | #8 |