We introduce a new language representation model called BERT, which stands for Bidirectional Encoder Representations from Transformers.
SOTA for Common Sense Reasoning on SWAG
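A minimal sketch of obtaining BERT's bidirectional contextual representations with the Hugging Face transformers library; the library and the "bert-base-uncased" checkpoint name are assumptions for illustration, not the paper's own reference implementation.

```python
# pip install transformers torch  (assumed environment)
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("BERT reads the whole sentence at once.", return_tensors="pt")
outputs = model(**inputs)

# One contextual vector per (sub)word token, conditioned on both left and right context.
print(outputs.last_hidden_state.shape)  # (batch, num_tokens, hidden_size)
```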
With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling.
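A small PyTorch sketch contrasting the two pretraining objectives mentioned above: an autoregressive language-modeling loss versus a denoising (masked-token) loss. The toy tensors and the 2-position mask are illustrative assumptions, not code from the paper.

```python
import torch
import torch.nn.functional as F

# Toy setup: a batch of token ids and stand-in per-position model outputs.
vocab_size, seq_len = 100, 8
tokens = torch.randint(0, vocab_size, (1, seq_len))
logits = torch.randn(1, seq_len, vocab_size)

# Autoregressive LM objective: predict token t+1 from positions up to t.
ar_loss = F.cross_entropy(
    logits[:, :-1].reshape(-1, vocab_size),  # predictions for positions 0..T-2
    tokens[:, 1:].reshape(-1),               # targets shifted left by one
)

# Denoising (masked-LM) objective: corrupt a few positions, predict the originals there.
mask = torch.zeros(1, seq_len, dtype=torch.bool)
mask[0, 2] = mask[0, 5] = True               # positions chosen for illustration
mlm_loss = F.cross_entropy(
    logits[mask],                            # predictions only at masked positions
    tokens[mask],                            # original tokens at those positions
)
```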
Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on task-specific datasets.
SOTA for Language Modelling on Text8 (using extra training data)
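A minimal sketch of autoregressive generation with a pretrained GPT-2 checkpoint via the Hugging Face transformers library; the library, the "gpt2" checkpoint name, and the prompt are assumptions for illustration.

```python
# pip install transformers torch  (assumed environment)
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "Natural language processing is"
inputs = tokenizer(prompt, return_tensors="pt")

# The model is trained only to predict the next token, yet can be prompted
# to perform downstream tasks without task-specific supervision.
output_ids = model.generate(inputs["input_ids"], max_new_tokens=20, do_sample=False)
print(tokenizer.decode(output_ids[0]))
```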
Language model pretraining has led to significant performance gains but careful comparison between different approaches is challenging.
We introduce a new type of deep contextualized word representation that models both (1) complex characteristics of word use (e.g., syntax and semantics), and (2) how these uses vary across linguistic contexts (i.e., to model polysemy).
#3 best model for Sentiment Analysis on SST-5 Fine-grained classification
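A minimal sketch of computing contextualized word vectors with the AllenNLP Elmo module; the option/weight file paths below are placeholders (the pretrained files are published separately), so this is an illustrative assumption rather than the paper's pipeline.

```python
from allennlp.modules.elmo import Elmo, batch_to_ids

options_file = "elmo_options.json"   # placeholder path to the published options file
weight_file = "elmo_weights.hdf5"    # placeholder path to the published weight file
elmo = Elmo(options_file, weight_file, num_output_representations=1)

sentences = [["The", "bank", "raised", "rates"],
             ["We", "sat", "on", "the", "river", "bank"]]
character_ids = batch_to_ids(sentences)
embeddings = elmo(character_ids)["elmo_representations"][0]
# "bank" receives a different vector in each sentence: the representation is contextual.
```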
This paper proposes to tackle open-domain question answering using Wikipedia as the unique knowledge source: the answer to any factoid question is a text span in a Wikipedia article.
#115 best model for Question Answering on SQuAD1.1
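A hedged sketch of the retrieve-then-read pipeline described above: retrieve candidate Wikipedia articles, then extract an answer span from them. The TF-IDF retriever below uses scikit-learn as a stand-in, the two sample articles are toy data, and `read_span` is a hypothetical placeholder for a trained reading-comprehension model.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

articles = [
    "Paris is the capital and most populous city of France.",
    "The mitochondrion is the powerhouse of the cell.",
]  # in practice: the full set of Wikipedia articles

vectorizer = TfidfVectorizer()
doc_matrix = vectorizer.fit_transform(articles)

def retrieve(question, k=1):
    # Stage 1: rank articles by TF-IDF similarity to the question.
    scores = cosine_similarity(vectorizer.transform([question]), doc_matrix)[0]
    return [articles[i] for i in scores.argsort()[::-1][:k]]

def read_span(question, article):
    # Stage 2 (hypothetical): a trained reader predicts the start/end of the
    # answer span inside the retrieved article.
    raise NotImplementedError

top_articles = retrieve("What is the capital of France?")
```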
Training large-scale question answering systems is complicated because training sources usually cover a small portion of the range of possible questions.
SOTA for Question Answering on WebQuestions
One long-term goal of machine learning research is to produce methods that are applicable to reasoning and natural language, in particular building an intelligent dialogue agent.
Common recurrent neural architectures scale poorly due to the intrinsic difficulty in parallelizing their state computations.
#10 best model for Machine Translation on WMT2014 English-German
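A simplified PyTorch sketch of a simple-recurrent-unit-style layer, illustrating why such architectures parallelize better than common RNNs: every matrix multiplication depends only on the inputs and can be computed for all time steps at once, leaving only a cheap elementwise recurrence. This is an illustrative sketch under assumed tensor shapes, not the paper's exact formulation.

```python
import torch

def sru_like_layer(x, W, Wf, Wr, bf, br):
    """Simplified SRU-style recurrence over an input of shape (T, B, d)."""
    x_tilde = x @ W                      # candidate values, computed in parallel over time
    f = torch.sigmoid(x @ Wf + bf)       # forget gates, parallel over time
    r = torch.sigmoid(x @ Wr + br)       # reset gates, parallel over time

    c = torch.zeros_like(x_tilde[0])
    outputs = []
    for t in range(x.shape[0]):          # only this lightweight elementwise loop is sequential
        c = f[t] * c + (1 - f[t]) * x_tilde[t]
        outputs.append(r[t] * torch.tanh(c) + (1 - r[t]) * x[t])
    return torch.stack(outputs)

# Example with hypothetical sizes: T=5 time steps, batch of 2, hidden size 4.
T, B, d = 5, 2, 4
x = torch.randn(T, B, d)
W, Wf, Wr = (torch.randn(d, d) for _ in range(3))
bf = br = torch.zeros(d)
out = sru_like_layer(x, W, Wf, Wr, bf, br)
```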