We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers.
SOTA for Common Sense Reasoning on SWAG
In this paper, we propose a novel neural network model called RNN Encoder-Decoder that consists of two recurrent neural networks (RNNs).
#23 best model for Machine Translation on WMT2014 English-French
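The two-RNN structure described above (one RNN encoding the source sequence into a fixed-length vector, a second RNN decoding the target sequence from it) can be sketched as a toy NumPy implementation. All sizes, weights, and token ids here are hypothetical and untrained; this illustrates only the data flow, not the paper's GRU-based training setup:

```python
import numpy as np

rng = np.random.default_rng(0)
H, V = 8, 5  # toy hidden size and vocabulary size (assumptions, not from the paper)

# Separate parameters for the encoder RNN and the decoder RNN.
Wxh_e, Whh_e = rng.normal(0, 0.1, (V, H)), rng.normal(0, 0.1, (H, H))
Wxh_d, Whh_d = rng.normal(0, 0.1, (V, H)), rng.normal(0, 0.1, (H, H))
Why = rng.normal(0, 0.1, (H, V))  # decoder output projection

def one_hot(i):
    v = np.zeros(V)
    v[i] = 1.0
    return v

def encode(src_ids):
    # Encoder RNN: fold the whole source sequence into one fixed-length summary vector.
    h = np.zeros(H)
    for i in src_ids:
        h = np.tanh(one_hot(i) @ Wxh_e + h @ Whh_e)
    return h

def decode(h, n_steps, bos=0):
    # Decoder RNN: start from the encoder summary and emit one token id per step,
    # feeding each (greedily chosen) token back in as the next input.
    out, x = [], bos
    for _ in range(n_steps):
        h = np.tanh(one_hot(x) @ Wxh_d + h @ Whh_d)
        x = int(np.argmax(h @ Why))
        out.append(x)
    return out

summary = encode([1, 2, 3])
generated = decode(summary, 4)
print(generated)  # a length-4 sequence of token ids in [0, V)
```

With random weights the output tokens are meaningless; in the actual model both networks are trained jointly to maximize the likelihood of the target sequence given the source.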
The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration.
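The alternative this abstract introduces replaces recurrence and convolution with attention; its core operation is scaled dot-product attention, softmax(QKᵀ/√d_k)V. A minimal NumPy sketch (toy shapes, random inputs, no masking or multi-head projections):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    # Numerically stable softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
Q = rng.normal(size=(2, 4))  # 2 queries of dimension d_k = 4
K = rng.normal(size=(3, 4))  # 3 keys
V = rng.normal(size=(3, 4))  # 3 values
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # one output vector per query: (2, 4)
```

Each output row is a convex combination of the value rows, weighted by how strongly the corresponding query matches each key; the full architecture stacks many of these in parallel heads.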
Neural machine translation is a recently proposed approach to machine translation.
#13 best model for Machine Translation on IWSLT2015 German-English