Most current question answering datasets frame the task as reading comprehension where the question is about a paragraph or document and the answer often is a span in the document. The Machine Reading group at UCL also provides an overview of reading comprehension tasks.
|Trend||Dataset||Best Method||Paper title||Paper||Code||Compare|
With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling.
Language model pretraining has led to significant performance gains but careful comparison between different approaches is challenging.
SOTA for Question Answering on SQuAD2.0 dev (using extra training data)
Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on taskspecific datasets.
SOTA for Language Modelling on Text8 (using extra training data)
This paper proposes to tackle open- domain question answering using Wikipedia as the unique knowledge source: the answer to any factoid question is a text span in a Wikipedia article.
We show that training on the new data improves the accuracy of our Attention-Sum Reader model on the original CBT test data by a much larger margin than many recent attempts to improve the model architecture.
One long-term goal of machine learning research is to produce methods that are applicable to reasoning and natural language, in particular building an intelligent dialogue agent.
Experiments show that human performance is well above current state-of-the-art baseline systems, leaving plenty of room for the community to make improvements.
In this technical report, we adapt whole word masking in Chinese text, that masking the whole word instead of masking Chinese characters, which could bring another challenge in Masked Language Model (MLM) pre-training task.