CMU Movie Summary Corpus

Introduced by Bamman et al. in Learning Latent Personas of Film Characters

Dataset [46 M] and readme: 42,306 movie plot summaries extracted from Wikipedia + aligned metadata extracted from Freebase, including: Movie box office revenue, genre, release date, runtime, and language Character names and aligned information about the actors who portray them, including gender and estimated age at the time of the movie's release Supplement: Stanford CoreNLP-processed summaries [628 M]. All of the plot summaries from above, run through the Stanford CoreNLP pipeline (tagging, parsing, NER and coref).


Paper Code Results Date Stars

Dataset Loaders

No data loaders found. You can submit your data loader here.


Similar Datasets