2 dataset results for Text Generation AND German

The WikiText language modeling dataset is a collection of over 100 million tokens extracted from the set of verified Good and Featured articles on Wikipedia. The dataset is available under the Creative Commons Attribution-ShareAlike License.

815 PAPERS • 3 BENCHMARKS

Opusparcus

Opusparcus is a paraphrase corpus for six European languages: German, English, Finnish, French, Russian, and Swedish. The paraphrases are extracted from the OpenSubtitles2016 corpus, which contains subtitles from movies and TV shows.

15 PAPERS • NO BENCHMARKS YET
