In this paper, we propose a novel neural network model called RNN Encoder-Decoder that consists of two recurrent neural networks (RNNs): one encodes a source sequence into a fixed-length vector, and the other decodes that vector into a target sequence.
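The two-RNN structure above can be sketched in a few lines. This is a minimal, untrained toy illustration (vanilla `tanh` RNN cells, one-hot embeddings, greedy decoding, and randomly initialized weights are all assumptions for exposition), not the paper's actual GRU-based implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
H, V = 8, 5  # hypothetical toy sizes: hidden units, vocabulary

def rnn_step(x, h, Wx, Wh, b):
    # one vanilla RNN step: h' = tanh(Wx x + Wh h + b)
    return np.tanh(Wx @ x + Wh @ h + b)

# Encoder: read the source tokens and keep the final
# hidden state as a fixed-length summary vector c.
Wx_e, Wh_e, b_e = rng.normal(size=(H, V)), rng.normal(size=(H, H)), np.zeros(H)

def encode(src_ids):
    h = np.zeros(H)
    for t in src_ids:
        x = np.eye(V)[t]  # one-hot "embedding" (assumption for simplicity)
        h = rnn_step(x, h, Wx_e, Wh_e, b_e)
    return h  # the fixed-length context vector c

# Decoder: generate target tokens one at a time,
# conditioning every step on the context c.
Wx_d, Wh_d, b_d = rng.normal(size=(H, V)), rng.normal(size=(H, H)), np.zeros(H)
Wc = rng.normal(size=(H, H))  # projects c into the decoder state
Wo = rng.normal(size=(V, H))  # output projection to vocabulary logits

def decode(c, steps):
    h, y, out = np.zeros(H), 0, []
    for _ in range(steps):
        x = np.eye(V)[y]                              # feed back last token
        h = rnn_step(x, h + Wc @ c, Wx_d, Wh_d, b_d)  # inject context c
        y = int(np.argmax(Wo @ h))                    # greedy next token
        out.append(y)
    return out

c = encode([1, 2, 3])
print(decode(c, 4))
```

In practice both networks are trained jointly to maximize the conditional probability of the target sequence given the source; the sketch omits training entirely and only shows the information flow through the fixed-length bottleneck.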
We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers.
The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration.