Avoiding Spoilers in Fan Wikis of Episodic Fiction

20 Jun 2015 · Shawn M. Jones, Michael L. Nelson

A variety of fan-based wikis about episodic fiction (e.g., television shows, novels, movies) exist on the World Wide Web. These wikis provide a wealth of information about complex stories, but readers who are behind in their viewing run the risk of encountering "spoilers": information that gives away key plot points earlier than the show's writers intended. Enterprising readers might browse the wiki in a web archive so as to view a page as it existed prior to a specific episode date and thereby avoid spoilers. Unfortunately, because of how web archives choose the "best" page, it is still possible to see spoilers, especially in sparse archives. In this paper we discuss how to use Memento to avoid spoilers. Memento uses TimeGates to determine which archived page is the best one to return to the user, currently using a minimum distance heuristic. We quantify how this heuristic is inadequate for avoiding spoilers by analyzing data collected from fan wikis and the Internet Archive. We create an algorithm for calculating the probability of encountering a spoiler in a given wiki article. We conduct an experiment with 16 wiki sites for popular television shows. We find that 38% of those pages are unavailable in the Internet Archive. We find that when accessing fan wiki pages in the Internet Archive there is as much as a 66% chance of encountering a spoiler. Using sample access logs from the Internet Archive, we find that 19% of actual requests to the Wayback Machine for wikia.com pages ended in spoilers. We suggest that wikis use a different minimum distance heuristic, minpast, which treats the desired datetime as an upper bound.
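
The difference between the two heuristics can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `min_distance` and `minpast`, the sample capture dates, and the choice to treat a capture exactly at the desired datetime as acceptable are all assumptions made for illustration.

```python
from datetime import datetime
from typing import Optional, Sequence


def min_distance(captures: Sequence[datetime], desired: datetime) -> Optional[datetime]:
    """Return the capture closest in time to `desired`, before or after it."""
    if not captures:
        return None
    return min(captures, key=lambda c: abs(c - desired))


def minpast(captures: Sequence[datetime], desired: datetime) -> Optional[datetime]:
    """Return the latest capture at or before `desired` (assumed inclusive bound)."""
    past = [c for c in captures if c <= desired]
    return max(past) if past else None


# Hypothetical sparse archive: two captures of one wiki page.
captures = [datetime(2014, 1, 5), datetime(2014, 3, 20)]
# The reader wants the page as of 1 March 2014, before a later episode aired.
desired = datetime(2014, 3, 1)

print(min_distance(captures, desired))  # picks the capture made after the desired date
print(minpast(captures, desired))       # picks the capture made before the desired date
```

In this example the capture from 20 March is only 19 days away from the desired date, while the one from 5 January is 55 days away, so a pure minimum distance choice returns a page written after the desired date, which may contain spoilers. Bounding the choice by the desired datetime returns the older capture instead, and returns nothing when no earlier capture exists.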
