6 dataset results for Story Generation AND English

A creative writing task where the input is 4 random sentences and the output should be a coherent passage with 4 paragraphs that end in the 4 input sentences respectively. Such a task is open-ended and exploratory, and challenges creative thinking as well as high-level planning.
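The output constraint described above can be checked mechanically. The following is a minimal sketch, not an official evaluation script: the function name `check_passage` is hypothetical, and it assumes that paragraphs are separated by blank lines and that each input sentence must appear verbatim at the end of its paragraph.

```python
import re
from typing import Sequence


def check_passage(passage: str, sentences: Sequence[str]) -> bool:
    """Return True if the passage has one paragraph per input sentence
    and the i-th paragraph ends with the i-th sentence."""
    # Assumption: paragraphs are delimited by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", passage) if p.strip()]
    if len(paragraphs) != len(sentences):
        return False
    # Assumption: an exact, case-sensitive match of the closing sentence.
    return all(p.endswith(s.strip()) for p, s in zip(paragraphs, sentences))
```

A check like this only covers the structural constraint; the coherence of the passage still needs human or model-based judgment.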

18 PAPERS • NO BENCHMARKS YET

HANNA

HANNA (Human-ANnotated NArratives for ASG evaluation)

HANNA, a large annotated dataset of Human-ANnotated NArratives for Automatic Story Generation (ASG) evaluation, has been designed for the benchmarking of automatic metrics for ASG. HANNA contains 1,056 stories generated from 96 prompts from the WritingPrompts dataset. Each prompt is linked to a human story and to 10 stories generated by different ASG systems. Each story was annotated on six human criteria (Relevance, Coherence, Empathy, Surprise, Engagement and Complexity) by three raters. HANNA also contains the scores produced by 72 automatic metrics on each story.
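The counts in this description fit together arithmetically, which the sketch below makes explicit. It is an illustration only: the constant and function names are hypothetical, the rating values in any usage are made up, and the actual file layout of HANNA is not assumed here.

```python
from statistics import mean

# Figures taken from the description above.
CRITERIA = ("Relevance", "Coherence", "Empathy",
            "Surprise", "Engagement", "Complexity")
NUM_PROMPTS = 96
STORIES_PER_PROMPT = 1 + 10  # one human story plus 10 system-generated stories
NUM_RATERS = 3


def total_stories() -> int:
    """Number of stories implied by prompts x stories per prompt."""
    return NUM_PROMPTS * STORIES_PER_PROMPT


def total_ratings() -> int:
    """Number of individual human ratings (story x criterion x rater)."""
    return total_stories() * len(CRITERIA) * NUM_RATERS


def average_ratings(ratings_by_rater):
    """Average each criterion over raters.

    ratings_by_rater: one dict per rater mapping criterion name to a score
    (a hypothetical in-memory format, not the dataset's own schema).
    """
    return {c: mean(r[c] for r in ratings_by_rater) for c in CRITERIA}
```

Here `total_stories()` gives 96 × 11 = 1,056, matching the stated size, and `total_ratings()` gives 1,056 × 6 × 3 = 19,008 individual ratings.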

3 PAPERS • NO BENCHMARKS YET

MoviePlotEvents

MoviePlotEvents (CMU Movie Summary Corpus with Events)

A version of the CMU Movie Summary Corpus (http://www.cs.cmu.edu/~ark/personas/), originally scraped from Wikipedia plot summaries, with some cleaning applied and with sentences turned into events and sorted into "genres" (via LDA).

2 PAPERS • NO BENCHMARKS YET

Scifi TV Shows

Scifi TV Shows (Scifi TV Show Plot Summaries & Events)

A collection of long-running (80+ episodes) science fiction TV show synopses, scraped from Fandom.com wikis. Collected Nov 2017. Each episode is considered a "story".

1 PAPER • 1 BENCHMARK

TVRecap

TVRecap is a story generation dataset that requires generating detailed TV show episode recaps from a brief summary and a set of documents describing the characters involved. Unlike other story generation datasets, TVRecap contains stories that are authored by professional screenwriters and that feature complex interactions among multiple characters. Generating stories in TVRecap requires drawing relevant information from the lengthy provided documents about characters based on the brief summary. In addition, by swapping the input and output, TVRecap can serve as a challenging testbed for abstractive summarization.

1 PAPER • 4 BENCHMARKS

Visual Writing Prompts

1 PAPER • NO BENCHMARKS YET