3 dataset results for text2text-generation AND Texts

SQuAD (Stanford Question Answering Dataset)

The Stanford Question Answering Dataset (SQuAD) is a collection of question-answer pairs derived from Wikipedia articles. In SQuAD, the correct answer to a question can be any sequence of tokens (a span) in the given passage. Because the questions and answers are written by humans through crowdsourcing, SQuAD is more diverse than some other question-answering datasets. SQuAD 1.1 contains 107,785 question-answer pairs on 536 articles. SQuAD 2.0, the latest version, combines the roughly 100,000 questions in SQuAD 1.1 with over 50,000 unanswerable questions written adversarially by crowdworkers to look similar to answerable ones.

1,918 PAPERS • 11 BENCHMARKS
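
The span-based answer format described in the SQuAD entry above can be illustrated with a minimal sketch. It assumes the answer is stored as its text plus a character offset into the passage, as in the released SQuAD JSON files; the class name SquadExample, its field names, and the sample passage are hypothetical and chosen only for illustration. An unanswerable question in the style of SQuAD 2.0 is modeled here by leaving the answer fields empty.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SquadExample:
    """Hypothetical container for one SQuAD-style question (illustration only)."""
    context: str                  # passage the question is asked about
    question: str
    answer_text: Optional[str]    # None models an unanswerable question
    answer_start: Optional[int]   # character offset of the answer in context

    def is_answerable(self) -> bool:
        return self.answer_text is not None and self.answer_start is not None

    def extract_answer(self) -> Optional[str]:
        """Recover the answer span by slicing the passage at the stored offset."""
        if not self.is_answerable():
            return None
        end = self.answer_start + len(self.answer_text)
        return self.context[self.answer_start:end]


# Made-up passage and questions, not taken from the dataset.
context = "The dataset was built from Wikipedia articles by crowdworkers."

answerable = SquadExample(
    context=context,
    question="What was the dataset built from?",
    answer_text="Wikipedia articles",
    answer_start=context.index("Wikipedia articles"),
)

unanswerable = SquadExample(
    context=context,
    question="Who funded the dataset?",
    answer_text=None,
    answer_start=None,
)

print(answerable.extract_answer())
print(unanswerable.is_answerable())
```

Slicing the passage at the stored offset and comparing the result with the stored answer text is a simple consistency check that the span really occurs in the passage.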

ChatHaruhi (ChatHaruhi: Reviving Anime Character in Reality via Large Language Model)

ChatHaruhi is a dataset of over 54k simulated dialogues covering 32 characters from Chinese and English TV shows and anime.

3 PAPERS • NO BENCHMARKS YET

ChatGPT Paraphrases

This is a dataset of paraphrases created by ChatGPT.

0 PAPERS • NO BENCHMARKS YET
