CSI Screenplay Summarization Corpus

Introduced by Papalampidi et al. in Screenplay Summarization Using Latent Narrative Structure

The dataset contains gold-standard summary labels for 39 episodes of "CSI: Crime Scene Investigation" from seasons 1-5. For each episode, it provides the full-length screenplay together with human annotations marking which parts belong to the episode's summary. The annotations include:

  1. scene-level binary labels denoting whether the scene belongs to the summary of the episode
  2. aspect-based labels for the scenes that belong to the summary, i.e., which aspect of the summary the scene addresses (e.g., information about the victim, the crime scene, or the perpetrator)
  3. sentence-level binary labels, available for 10 episodes of the dataset, denoting which sentences of the screenplay belong to the summary
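
The three annotation layers above can be pictured as a small in-memory data model. The following is only a minimal sketch: the class names, field names, and aspect strings are hypothetical and are not the file format or schema of the released corpus.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Scene:
    """One screenplay scene with its annotations (hypothetical layout)."""
    sentences: List[str]
    # Annotation 1: does this scene belong to the episode summary?
    in_summary: bool
    # Annotation 2: aspects addressed; expected to be empty for non-summary scenes.
    aspects: List[str] = field(default_factory=list)
    # Annotation 3: per-sentence labels; None when the episode has none.
    sentence_labels: Optional[List[bool]] = None


@dataclass
class Episode:
    """An episode as an ordered list of annotated scenes (hypothetical layout)."""
    episode_id: str
    scenes: List[Scene]

    def summary_scenes(self) -> List[Scene]:
        # Scenes marked as part of the summary (annotation 1).
        return [s for s in self.scenes if s.in_summary]

    def has_sentence_labels(self) -> bool:
        # True only if every scene carries sentence-level labels (annotation 3).
        return all(s.sentence_labels is not None for s in self.scenes)
```

Keeping the sentence-level labels optional mirrors the fact that they exist for only 10 of the 39 episodes, so code consuming the data can check `has_sentence_labels()` before relying on them.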

License


  • Unknown
