NorQuAD: Norwegian Question Answering Dataset

3 May 2023  ·  Sardana Ivanova, Fredrik Aas Andreassen, Matias Jentoft, Sondre Wold, Lilja Øvrelid ·

In this paper we present NorQuAD: the first Norwegian question answering dataset for machine reading comprehension. The dataset consists of 4,752 manually created question-answer pairs. We here detail the data collection procedure and present statistics of the dataset. We also benchmark several multilingual and Norwegian monolingual language models on the dataset and compare them against human performance. The dataset will be made freely available.

PDF Abstract

Datasets


Introduced in the Paper:

NorQuAD

Used in the Paper:

SQuAD JaQuAD

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here