Assess whether a sentence is grammatical or ungrammatical.
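A grammaticality judgment like this is typically framed as binary classification over sentences, as in the CoLA benchmark. Below is a minimal sketch of scoring sentences with a pretrained classifier via the Hugging Face `transformers` pipeline; the checkpoint name is an assumption, and any CoLA-finetuned model would serve.

```python
# Minimal sketch: score sentences for grammatical acceptability with a
# CoLA-finetuned classifier. The checkpoint below is an assumption;
# substitute any model fine-tuned on the Corpus of Linguistic Acceptability.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="textattack/bert-base-uncased-CoLA",  # assumed checkpoint
)

for sentence in [
    "The cat sat on the mat.",   # grammatical
    "The cat sat mat the on.",   # ungrammatical
]:
    result = classifier(sentence)[0]
    # Label names (e.g. LABEL_0 vs. LABEL_1) are checkpoint-specific.
    print(f"{sentence!r} -> {result['label']} ({result['score']:.2f})")
```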
While large language models à la BERT are used ubiquitously in NLP, pretraining them is considered a luxury that only a few well-funded industry labs can afford.
Ranked #15 on Semantic Textual Similarity on MRPC
The Transformer is the backbone of modern NLP models.
Ranked #2 on Paraphrase Identification on Quora Question Pairs
While large-scale language models (LMs) are able to imitate the distribution of natural language well enough to generate realistic text, it is difficult to control which regions of the distribution they generate.
To remedy this, we propose BigBird, a sparse attention mechanism that reduces the quadratic dependency of full attention on sequence length to linear.
Ranked #1 on Text Classification on arXiv
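The quoted sentence compresses BigBird's key idea: full self-attention lets every token attend to every other (cost quadratic in sequence length), while BigBird keeps only a sliding window, a few global tokens, and a few random links per token, so the retained entries grow linearly. A minimal NumPy sketch of such a mask, with illustrative sizes, not the paper's blocked implementation:

```python
# Sketch of a BigBird-style sparse attention mask: sliding window +
# global tokens + random links. All sizes are illustrative assumptions.
import numpy as np

def sparse_attention_mask(seq_len, window=3, n_global=2, n_random=2, seed=0):
    rng = np.random.default_rng(seed)
    mask = np.zeros((seq_len, seq_len), dtype=bool)
    for i in range(seq_len):
        # Sliding window: each token attends to its local neighborhood.
        lo, hi = max(0, i - window), min(seq_len, i + window + 1)
        mask[i, lo:hi] = True
        # Random links: a few arbitrary positions per row.
        mask[i, rng.choice(seq_len, size=n_random, replace=False)] = True
    # Global tokens attend to, and are attended by, every position.
    mask[:n_global, :] = True
    mask[:, :n_global] = True
    return mask

m = sparse_attention_mask(16)
# Each row keeps O(window + n_random + n_global) entries, i.e. O(n) total.
print(f"{m.sum()} of {m.size} entries kept")
```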
As natural language processing methods are increasingly deployed in real-world scenarios such as healthcare, legal systems, and social science, it becomes necessary to recognize the role they potentially play in shaping social biases and stereotypes.
Humans read and write hundreds of billions of messages every day.
Ranked #18 on Natural Language Inference on RTE
Recent progress in pre-trained neural language models has significantly improved the performance of many natural language processing (NLP) tasks.
Ranked #1 on Question Answering on SQuAD1.1 dev
Tasks: Common Sense Reasoning, Coreference Resolution, Linguistic Acceptability, Named Entity Recognition, Natural Language Inference, Natural Language Understanding, Question Answering, Reading Comprehension, Semantic Textual Similarity, Sentiment Analysis, Word Sense Disambiguation
We conduct a thorough study to diagnose the behaviors of pre-trained language encoders (ELMo, BERT, and RoBERTa) when confronted with natural grammatical errors.
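One simple way to see what "confronted with natural grammatical errors" means in practice is to compare a masked LM's predictions on a clean sentence and on one with an injected agreement error; the sketch below is an illustration in the spirit of such a diagnosis, not the paper's protocol.

```python
# Sketch of one diagnostic probe (not the paper's protocol): compare a
# masked LM's top prediction on a clean sentence vs. one with an
# injected subject-verb agreement error.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
for text in [
    "The keys to the cabinet [MASK] on the table.",     # clean context
    "The keys to the cabinet is [MASK] on the table.",  # injected error
]:
    top = fill(text)[0]
    print(f"{text} -> {top['token_str']} ({top['score']:.2f})")
```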
Dot-product self-attention is known to be central and indispensable to state-of-the-art Transformer models.
Ranked #1 on Semantic Textual Similarity on MRPC Dev
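For reference, the dot-product self-attention in question computes softmax(QK^T / sqrt(d)) V over query, key, and value projections of the input; a minimal NumPy sketch with toy shapes chosen for illustration:

```python
# Minimal sketch of scaled dot-product self-attention,
# softmax(Q K^T / sqrt(d)) V, with toy shapes chosen for illustration.
import numpy as np

def self_attention(x, wq, wk, wv):
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])         # (n, n): quadratic in n
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v

rng = np.random.default_rng(0)
n, d = 5, 8  # assumed sequence length and model dimension
x = rng.normal(size=(n, d))
out = self_attention(x, *(rng.normal(size=(d, d)) for _ in range(3)))
print(out.shape)  # (5, 8)
```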