no code implementations • WS 2019 • Hongyi Cui, Shohei Iida, Po-Hsuan Hung, Takehito Utsuro, Masaaki Nagata
Recently, the Transformer has become the state-of-the-art architecture in the field of neural machine translation (NMT).
no code implementations • ACL 2019 • Shohei Iida, Ryuichiro Kimura, Hongyi Cui, Po-Hsuan Hung, Takehito Utsuro, Masaaki Nagata
The first-hop attention is the scaled dot-product attention, the same attention mechanism used in the original Transformer.
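For reference, a minimal NumPy sketch of scaled dot-product attention as defined in the original Transformer, softmax(QK^T / sqrt(d_k))V; the function name and shapes are illustrative, not from the paper:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention (Vaswani et al., 2017).

    Illustrative shapes: Q (n_q, d_k), K (n_k, d_k), V (n_k, d_v).
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # query-key similarities, scaled
    scores -= scores.max(axis=-1, keepdims=True)    # shift for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax over keys
    return weights @ V                              # weighted sum of values
```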