Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on task-specific datasets.
SOTA for Language Modelling on Text8 (using extra training data)
We show that generating English Wikipedia articles can be approached as a multi-document summarization of source documents.
Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP).
SOTA for Linguistic Acceptability on CoLA
COMMON SENSE REASONING, COREFERENCE RESOLUTION, DOCUMENT SUMMARIZATION, LINGUISTIC ACCEPTABILITY, MACHINE TRANSLATION, NATURAL LANGUAGE INFERENCE, QUESTION ANSWERING, SEMANTIC TEXTUAL SIMILARITY, SENTIMENT ANALYSIS, TEXT CLASSIFICATION, TRANSFER LEARNING, WORD SENSE DISAMBIGUATION
BERT (Devlin et al., 2018), a pre-trained Transformer (Vaswani et al., 2017) model, has achieved ground-breaking performance on multiple NLP tasks.
#3 best model for Document Summarization on CNN / Daily Mail (using extra training data)
This paper presents a new Unified pre-trained Language Model (UniLM) that can be fine-tuned for both natural language understanding and generation tasks.
SOTA for Text Summarization on GigaWord (using extra training data)
For abstractive summarization, we propose a new fine-tuning schedule which adopts different optimizers for the encoder and the decoder as a means of alleviating the mismatch between the two (the former is pretrained while the latter is not).
SOTA for Extractive Document Summarization on CNN / Daily Mail (using extra training data)
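The fine-tuning schedule mentioned above for abstractive summarization (separate optimizers for the pretrained encoder and the untrained decoder) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `ScheduledSGD` class is hypothetical, plain SGD is swapped in for brevity in place of whatever optimizer the paper uses, the warmup rule is the inverse-square-root schedule from Vaswani et al. (2017), and the learning rates and warmup lengths are placeholder values. The point is only that the encoder gets a smaller, more slowly warmed-up learning rate than the decoder.

```python
class ScheduledSGD:
    """Toy SGD optimizer that owns its parameters and its own warmup schedule.

    Hypothetical helper for illustration; not taken from any library.
    """

    def __init__(self, params, base_lr, warmup_steps):
        self.params = params            # dict: parameter name -> float value
        self.base_lr = base_lr
        self.warmup_steps = warmup_steps
        self.step_count = 0

    def current_lr(self):
        # Linear warmup followed by inverse-square-root decay.
        step = max(self.step_count, 1)
        return self.base_lr * min(step ** -0.5, step * self.warmup_steps ** -1.5)

    def step(self, grads):
        # Apply one gradient step using this optimizer's own learning rate.
        self.step_count += 1
        lr = self.current_lr()
        for name, grad in grads.items():
            self.params[name] -= lr * grad
        return lr


# Stand-ins for the two halves of the model (one scalar weight each).
encoder_params = {"w": 1.0}   # pretrained: should move slowly
decoder_params = {"w": 1.0}   # trained from scratch: may move faster

# Placeholder hyperparameters (assumed): smaller rate and longer warmup for the encoder.
encoder_opt = ScheduledSGD(encoder_params, base_lr=2e-3, warmup_steps=20000)
decoder_opt = ScheduledSGD(decoder_params, base_lr=0.1, warmup_steps=10000)

for _ in range(100):
    # Dummy constant gradients; a real loop would backpropagate a loss.
    encoder_opt.step({"w": 1.0})
    decoder_opt.step({"w": 1.0})
```

After the same number of updates with identical gradients, the decoder weight has moved much further than the encoder weight, which is the intended effect of keeping the two optimizers separate.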
In this paper we conceptualize extractive summarization as a sentence ranking task and propose a novel training algorithm which globally optimizes the ROUGE evaluation metric through a reinforcement learning objective.
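As a rough illustration of treating extraction as sentence ranking with a ROUGE-based reward, the sketch below keeps one logit per sentence, samples an extract, scores it against a reference, and applies a REINFORCE-style update. It is not the paper's model: the unigram-overlap `rouge1_f1` is a simplified stand-in for the full ROUGE metric, the independent per-sentence selection, the running-mean baseline, and all function names and hyperparameters are assumptions chosen to keep the example small.

```python
import math
import random
from collections import Counter


def rouge1_f1(candidate, reference):
    """Simplified ROUGE-1 F1: unigram overlap on whitespace tokens."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def reinforce_rank(sentences, reference, steps=500, lr=0.5, seed=0):
    """Learn a selection probability per sentence from a ROUGE reward."""
    rng = random.Random(seed)
    logits = [0.0] * len(sentences)
    baseline = 0.0  # running mean reward, used to reduce variance
    for _ in range(steps):
        probs = [sigmoid(z) for z in logits]
        # Sample an extract: each sentence is kept independently.
        picks = [1 if rng.random() < p else 0 for p in probs]
        summary = " ".join(s for s, y in zip(sentences, picks) if y)
        reward = rouge1_f1(summary, reference)
        advantage = reward - baseline
        baseline += 0.1 * (reward - baseline)
        # Gradient of the Bernoulli log-probability w.r.t. each logit is (y - p).
        for i, (y, p) in enumerate(zip(picks, probs)):
            logits[i] += lr * advantage * (y - p)
    return [sigmoid(z) for z in logits]


document = [
    "the council approved the new transit budget on monday",
    "local bakeries reported strong weekend sales",
    "the budget adds funding for two bus lines",
]
reference = "council approved transit budget adding funding for bus lines"
scores = reinforce_rank(document, reference)
ranking = sorted(range(len(document)), key=lambda i: scores[i], reverse=True)
```

Sentences can then be ranked by their learned probabilities, as `ranking` does here; because the reward is computed on the whole sampled extract, the update credits sentences for their contribution to the summary-level score and not for a per-sentence label.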
We introduce extreme summarization, a new single-document summarization task which does not favor extractive strategies and calls for an abstractive modeling approach.