1 code implementation • ACL 2019 • Elena Voita, David Talbot, Fedor Moiseev, Rico Sennrich, Ivan Titov
Multi-head self-attention is a key component of the Transformer, a state-of-the-art architecture for neural machine translation.
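To make the term concrete, below is a minimal sketch of multi-head self-attention (scaled dot-product attention split across several heads, as in the Transformer). It is illustrative only: the function name, weight shapes, and the use of NumPy with random weights are assumptions for the example, not the paper's implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(x, w_q, w_k, w_v, w_o, num_heads):
    """Scaled dot-product self-attention with several heads (illustrative sketch).

    x: (seq_len, d_model) token representations
    w_q, w_k, w_v, w_o: (d_model, d_model) projection matrices
    """
    seq_len, d_model = x.shape
    d_head = d_model // num_heads

    # Project inputs to queries, keys, values and split them into heads.
    def project(w):
        return (x @ w).reshape(seq_len, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = project(w_q), project(w_k), project(w_v)   # (heads, seq, d_head)

    # Each head computes its own attention distribution over all positions.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, seq, seq)
    attn = softmax(scores, axis=-1)
    heads = attn @ v                                      # (heads, seq, d_head)

    # Concatenate the heads and mix them with the output projection.
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ w_o

# Tiny usage example with random weights.
rng = np.random.default_rng(0)
d_model, seq_len, num_heads = 8, 5, 2
x = rng.standard_normal((seq_len, d_model))
w = [rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(4)]
out = multi_head_self_attention(x, *w, num_heads=num_heads)
print(out.shape)  # (5, 8)
```

Each head attends with its own distribution over positions, which is what allows individual heads to specialize in the roles the paper analyzes.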