Opusparcus is a paraphrase corpus for six European languages: German, English, Finnish, French, Russian, and Swedish. The paraphrases are extracted from the OpenSubtitles2016 corpus, which contains subtitles from movies and TV shows.
15 PAPERS • NO BENCHMARKS YET
Ice Hockey News Dataset is a corpus of Finnish ice hockey news, edited to be suitable for training of end-to-end news generation methods, as well as demonstrate generation of text, which was judged by journalists to be relatively close to a viable product.
1 PAPER • NO BENCHMARKS YET