The CICEROv2 dataset can be found in the data directory. Each line of each file is a JSON object representing a single instance. The JSON objects have the following key-value pairs:
Key | Value |
---|---|
ID | Dialogue ID with dataset indicator. |
Dialogue | Utterances of the dialogue in a list. |
Target | Target utterance. |
Question | One of the five questions (inference types). |
Choices | Five possible answer choices in a list. One of the answers is human-written. The other four answers are machine-generated and selected through the Adversarial Filtering (AF) algorithm. |
Human Written Answer | Index of the human-written answer, given as a single-element list. Indices start from 0. |
Correct Answers | List of indices of all answers that the human annotators marked as plausible or speculatively correct. Includes the index of the human-written answer. |

An example instance is shown below.

    {
        "ID": "daily-dialogue-0404",
        "Dialogue": [
            "A: Dad , why are you taping the windows ?",
            "B: Honey , a typhoon is coming .",
            "A: Really ? Wow , I don't have to go to school tomorrow .",
            "B: Jenny , come and help , we need to prepare more food .",
            "A: OK . Dad ! I'm coming ."
        ],
        "Target": "Jenny , come and help , we need to prepare more food .",
        "Question": "What subsequent event happens or could happen following the target?",
        "Choices": [
            "Jenny and her father stockpile food for the coming days.",
            "The speaker and the listener go outside to purchase more food material for precaution.",
            "Jenny and her father give away all their food.",
            "Jenny and her father eat all the food in their refrigerator."
        ],
        "Correct Answers": [
            0,
            1
        ]
    }
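
Since each line of a data file holds one JSON object, the files can be read line by line. The following is a minimal Python sketch, not part of the official code: the helper names `parse_instance`, `read_instances`, and `correct_choices` are hypothetical, and any file path passed to `read_instances` is an assumption about how the data directory is laid out. The sample instance is abbreviated from the example above.

```python
import json
from typing import Iterator, List


def parse_instance(line: str) -> dict:
    """Parse one line of a data file into a single instance."""
    return json.loads(line)


def read_instances(path: str) -> Iterator[dict]:
    """Yield one instance per non-empty line of a data file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines, if any
                yield parse_instance(line)


def correct_choices(instance: dict) -> List[str]:
    """Map the 0-based indices in "Correct Answers" to their choice texts."""
    return [instance["Choices"][i] for i in instance["Correct Answers"]]


# One line of a data file, abbreviated from the example instance above.
sample_line = json.dumps({
    "ID": "daily-dialogue-0404",
    "Target": "Jenny , come and help , we need to prepare more food .",
    "Choices": [
        "Jenny and her father stockpile food for the coming days.",
        "The speaker and the listener go outside to purchase more food material for precaution.",
        "Jenny and her father give away all their food.",
        "Jenny and her father eat all the food in their refrigerator.",
    ],
    "Correct Answers": [0, 1],
})

instance = parse_instance(sample_line)
for choice in correct_choices(instance):
    print(choice)
```

To iterate over a whole file, pass its path to `read_instances`; the sketch uses plain key lookups, so an instance that lacks one of the keys accessed raises a `KeyError`.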