MultiQ

Introduced by Taktasheva et al. in TAPE: Assessing Few-shot Russian Language Understanding

MultiQ is a multi-hop QA dataset for Russian, suitable for general open-domain question answering, information retrieval, and reading comprehension tasks.

Motivation

Question answering has long been an essential task in natural language processing and information retrieval. However, some areas of QA remain challenging for modern approaches. One of them is multi-hop QA, which is traditionally considered an intersection of graph methods, knowledge representation, and state-of-the-art language modeling.

Multi-hop reasoning has been the least addressed QA direction for Russian. The task is represented by the MuSeRC dataset (Fenogenova et al., 2020) and only a few dozen questions in SberQUAD (Efimov et al., 2020) and RuBQ (Rybin et al., 2021). In response, we have developed a semi-automatic pipeline for multi-hop dataset generation based on Wikidata.

An example in English for illustration purposes:

```
{
    'support_text': "Gerard McBurney (b. June 20, 1954, Cambridge) is a British arranger, musicologist, television and radio presenter, teacher, and writer. He was born in the family of American archaeologist Charles McBurney and secretary Anna Frances Edmonston, who combined English, Scottish and Irish roots. Gerard's brother Simon McBurney is an English actor, writer, and director. He studied at Cambridge and the Moscow State Conservatory with Edison Denisov and Roman Ledenev.",
    'main_text': "Simon Montague McBurney (born August 25, 1957, Cambridge) is an English actor, screenwriter, and director. Biography. Father is an American archaeologist who worked in the UK. Simon graduated from Cambridge with a degree in English Literature. After his father's death (1979) he moved to France, where he studied theater at the Jacques Lecoq Institute. In 1983 he created the theater company \"Complicity\". Actively works as an actor in film and television, and acts as a playwright and screenwriter.",
    'question': "Where was Gerard McBurney's brother born?",
    'bridge_answers': [{'label': 'passage', 'length': 14, 'offset': 300, 'segment': 'Simon McBurney'}],
    'main_answers': [{'label': 'passage', 'length': 9, 'offset': 47, 'segment': 'Cambridge'}],
    'episode': [15],
    'perturbation': 'multiq'
}
```

Data Fields

  • question: a string containing the question text
  • support_text: a string containing the first text passage relating to the question
  • main_text: a string containing the main answer text
  • bridge_answers: a list of entities required to hop from the support text to the main text
  • main_answers: a list of answers to the question
  • perturbation: a string containing the name of the perturbation applied to text. If no perturbation was applied, the dataset name is used
  • episode: a list of episodes in which the instance is used. Only used for the train set
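The answer entries carry `offset`, `length`, and `segment` keys. A minimal sketch of how a span could be recovered from them is shown below, assuming that `offset` is a character index and that entries in `main_answers` index into `main_text`. This reading is an assumption, not a documented guarantee; the helper names `extract_span` and `span_matches` are hypothetical.

```python
def extract_span(text: str, answer: dict) -> str:
    """Slice the answer span out of a passage using its offset and length."""
    start = answer["offset"]
    return text[start:start + answer["length"]]


def span_matches(text: str, answer: dict) -> bool:
    """Check that the sliced span equals the stored `segment` string."""
    return extract_span(text, answer) == answer["segment"]


# First sentence of `main_text` from the illustrative English example.
main_text = (
    "Simon Montague McBurney (born August 25, 1957, Cambridge) "
    "is an English actor, screenwriter, and director."
)
main_answer = {"label": "passage", "length": 9, "offset": 47, "segment": "Cambridge"}

print(extract_span(main_text, main_answer))  # Cambridge
print(span_matches(main_text, main_answer))  # True
```

A check like `span_matches` can be useful as a sanity test before training, since perturbed or translated passages may shift character positions.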

Data Splits

The dataset consists of a training set with labeled examples and a test set in two configurations:

  • raw data: includes the original data with no additional sampling
  • episodes: data is split into evaluation episodes and includes several perturbations of the test set for robustness evaluation

Test and train data sets are disjoint with respect to individual questions, but may include overlaps in support and main texts.
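As a rough sketch of how the episodes configuration might be consumed, the snippet below selects the training instances that belong to one episode via the `episode` field. The records are toy data and `select_episode` is a hypothetical helper; loading the actual dataset is not shown.

```python
def select_episode(records: list, episode_id: int) -> list:
    """Return the training records whose `episode` list contains episode_id."""
    return [r for r in records if episode_id in r.get("episode", [])]


# Toy records that mimic only the fields needed here.
train_records = [
    {"question": "q1", "episode": [15]},
    {"question": "q2", "episode": [3, 15]},
    {"question": "q3", "episode": [7]},
]

episode_15 = select_episode(train_records, 15)
print([r["question"] for r in episode_15])  # ['q1', 'q2']
```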

Test Perturbations

Each training episode in the dataset corresponds to seven test variations: the original test data and six adversarial test sets, obtained by applying the following text perturbations to the original test set:

  • ButterFingers: randomly adds noise to data by mimicking spelling mistakes made by humans through character swaps based on their keyboard distance
  • Emojify: replaces the input words with the corresponding emojis, preserving their original meaning
  • EDAdelete: randomly deletes tokens in the text
  • EDAswap: randomly swaps tokens in the text
  • BackTranslation: generates variations of the context through back-translation (ru -> en -> ru)
  • AddSent: generates an extra sentence at the end of the text
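To make the two token-level perturbations concrete, here is a minimal sketch of random token deletion and random token swapping. It is not the implementation used to build the test sets: whitespace tokenization, the deletion probability `p`, the swap count `n_swaps`, and the function names are all assumptions chosen for illustration.

```python
import random


def eda_delete(text: str, p: float = 0.1, seed: int = 0) -> str:
    """Drop each whitespace token independently with probability p."""
    rng = random.Random(seed)
    tokens = text.split()
    if len(tokens) <= 1:
        return text
    kept = [t for t in tokens if rng.random() >= p]
    if not kept:
        # Never return an empty string: keep one random token.
        kept = [rng.choice(tokens)]
    return " ".join(kept)


def eda_swap(text: str, n_swaps: int = 1, seed: int = 0) -> str:
    """Swap the positions of two randomly chosen tokens, n_swaps times."""
    rng = random.Random(seed)
    tokens = text.split()
    if len(tokens) < 2:
        return text
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(tokens)), 2)
        tokens[i], tokens[j] = tokens[j], tokens[i]
    return " ".join(tokens)


question = "Where was the brother of Gerard McBurney born"
print(eda_delete(question, p=0.3, seed=1))
print(eda_swap(question, n_swaps=2, seed=1))
```

Fixing the seed keeps the perturbed text reproducible across runs. Note that answer offsets computed on the original passage would no longer be valid after either perturbation.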
