MultiQ is a multi-hop QA dataset for Russian, suitable for general open-domain question answering, information retrieval, and reading comprehension tasks.
Motivation
Question answering is an essential task in natural language processing and information retrieval. However, certain areas of QA remain challenging for modern approaches. One of them is multi-hop QA, which is traditionally considered to lie at the intersection of graph methods, knowledge representation, and state-of-the-art language modeling.
Multi-hop reasoning has been the least addressed QA direction for Russian. The task is represented by the MuSeRC dataset (Fenogenova et al., 2020) and only a few dozen questions in SberQUAD (Efimov et al., 2020) and RuBQ (Rybin et al., 2021). In response, we have developed a semi-automatic pipeline for multi-hop dataset generation based on Wikidata.
An example in English for illustration purposes:
```
{
    'support_text': "Gerard McBurney (b. June 20, 1954, Cambridge) is a British arranger, musicologist, television and radio presenter, teacher, and writer. He was born in the family of American archaeologist Charles McBurney and secretary Anna Frances Edmonston, who combined English, Scottish and Irish roots. Gerard's brother Simon McBurney is an English actor, writer, and director. He studied at Cambridge and the Moscow State Conservatory with Edison Denisov and Roman Ledenev.",
    'main_text': "Simon Montague McBurney (born August 25, 1957, Cambridge) is an English actor, screenwriter, and director. Biography. Father is an American archaeologist who worked in the UK. Simon graduated from Cambridge with a degree in English Literature. After his father's death (1979) he moved to France, where he studied theater at the Jacques Lecoq Institute. In 1983 he created the theater company \"Complicity\". Actively works as an actor in film and television, and acts as a playwright and screenwriter.",
    'question': "Where was Gerard McBurney's brother born?",
    'bridge_answers': [{'label': 'passage', 'length': 14, 'offset': 300, 'segment': 'Simon McBurney'}],
    'main_answers': [{'label': 'passage', 'length': 9, 'offset': 47, 'segment': 'Cambridge'}],
    'episode': [15],
    'perturbation': 'multiq'
}
```
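The answer entries in the example carry an `offset` and a `length` alongside the `segment` string. The sketch below shows one way to recover a span from these values. It assumes that `offset` is a zero-based character index into the text the answer belongs to (`main_text` for `main_answers`), which matches the English illustration above but is not confirmed for the released data. The helper names `extract_spans` and `spans_match` are hypothetical and not part of any dataset tooling.

```python
from typing import Dict, List


def extract_spans(text: str, answers: List[Dict]) -> List[str]:
    """Slice each answer out of `text` using its character offset and length."""
    return [text[a["offset"]: a["offset"] + a["length"]] for a in answers]


def spans_match(text: str, answers: List[Dict]) -> bool:
    """Return True if every sliced span equals the stored `segment` string."""
    spans = extract_spans(text, answers)
    return all(span == a["segment"] for span, a in zip(spans, answers))


# Opening sentence of `main_text` from the illustration above.
main_text = (
    "Simon Montague McBurney (born August 25, 1957, Cambridge) "
    "is an English actor, screenwriter, and director."
)
main_answers = [
    {"label": "passage", "length": 9, "offset": 47, "segment": "Cambridge"}
]

print(extract_spans(main_text, main_answers))  # ['Cambridge']
print(spans_match(main_text, main_answers))    # True
```

A check like `spans_match` can serve as a quick sanity test when loading records, since a mismatch would indicate that the offset convention assumed here does not hold for that record.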
Data Fields
Data Splits
The dataset consists of a training set with labeled examples and a test set in two configurations:
- raw data: includes the original data with no additional sampling
- episodes: data is split into evaluation episodes and includes several perturbations of test for robustness evaluation

Test and train data sets are disjoint with respect to individual questions, but may include overlaps in support and main texts.
Test Perturbations
Each training episode in the dataset corresponds to seven test variations: the original test data and six adversarial test sets, obtained by applying the following text perturbations to the original test set: