A Vietnamese Dataset for Evaluating Machine Reading Comprehension

30 Sep 2020  ·  Kiet Van Nguyen, Duc-Vu Nguyen, Anh Gia-Tuan Nguyen, Ngan Luu-Thuy Nguyen

Over 97 million people worldwide speak Vietnamese as their native language. However, there has been little research on machine reading comprehension (MRC), the task of understanding a text and answering questions about it, for Vietnamese. Because benchmark datasets for Vietnamese are lacking, we present the Vietnamese Question Answering Dataset (UIT-ViQuAD), a new dataset for evaluating MRC models on Vietnamese, a low-resource language. The dataset comprises over 23,000 human-generated question-answer pairs based on 5,109 passages from 174 Vietnamese Wikipedia articles. In particular, we propose a new dataset creation process for Vietnamese MRC. Our in-depth analyses show that the dataset requires abilities beyond simple reasoning such as word matching, and demands both single-sentence and multiple-sentence inference. We also run experiments with state-of-the-art MRC methods developed for English and Chinese, which serve as the first experimental models on UIT-ViQuAD. We further estimate human performance on the dataset and compare it with the results of powerful machine learning models. The substantial gap between human performance and the best model performance indicates that there is room for improvement on UIT-ViQuAD in future research. Our dataset is freely available on our website to encourage the research community to overcome challenges in Vietnamese MRC.


Datasets


Introduced in the Paper:

UIT-ViQuAD

Used in the Paper:

SQuAD, NewsQA, CMRC, KorQuAD

Task                                      Dataset     Model          Metric Name  Metric Value  Global Rank
Vietnamese Machine Reading Comprehension  UIT-ViQuAD  ViReader       Avg F1       89.54         #1
Vietnamese Machine Reading Comprehension  UIT-ViQuAD  DrQA           Avg F1       63.44         #5
Vietnamese Machine Reading Comprehension  UIT-ViQuAD  QANet          Avg F1       68.06         #4
Vietnamese Machine Reading Comprehension  UIT-ViQuAD  mBERT          Avg F1       80.00         #3
Vietnamese Machine Reading Comprehension  UIT-ViQuAD  XLM-R (Large)  Avg F1       87.02         #2
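
The leaderboard reports Avg F1 but this page does not define it. As a rough illustration only, the sketch below assumes the SQuAD-style token-overlap F1, averaged over questions and taking the best score when several reference answers exist. The function names `token_f1` and `average_f1`, the lowercasing, and the whitespace tokenization are all assumptions made for this example; this is not the authors' official evaluation script, which may normalize Vietnamese text differently.

```python
from collections import Counter
from typing import List


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between one predicted answer and one reference answer.

    Assumption: tokens are lowercased, whitespace-separated words.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def average_f1(predictions: List[str], references: List[List[str]]) -> float:
    """Mean of per-question best F1, expressed as a percentage."""
    scores = [
        max(token_f1(pred, ref) for ref in refs)
        for pred, refs in zip(predictions, references)
    ]
    return 100.0 * sum(scores) / len(scores)


if __name__ == "__main__":
    preds = ["thành phố Hà Nội", "Huế"]
    refs = [["Hà Nội"], ["Đà Nẵng"]]
    print(average_f1(preds, refs))
```

Under these assumptions, a prediction that contains the reference plus extra words is penalized through precision, while a prediction sharing no tokens with any reference scores zero.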

Methods