CrowdPlay: Crowdsourcing human demonstration data for offline learning in Atari games

Human demonstrations of video game play can serve as vital surrogate representations of real-world behaviors, access to which would facilitate rapid progress in several complex learning settings (e.g., behavior classification, imitation learning, offline RL). The ability to build such human demonstration datasets is compelling because it alleviates the pitfalls associated with building simulators or synthetic data collection pipelines, which are often impractical for real-world domains, either due to the cost of building and running them or due to a very large sim-to-real gap. Crowdsourcing has emerged as a standard tool for collecting useful datasets that require human participation. However, its applicability has been under-explored in the context of collecting human behaviors on sequential decision-making tasks. To this end, we present CrowdPlay -- a complete crowdsourcing pipeline for OpenAI Gym and Gym-like MDP environments, a large-scale publicly available crowdsourced dataset of human gameplay demonstrations on single- and multi-player Atari 2600 games, a collection of imitation and offline reinforcement learning benchmarks, and a detailed discussion of crowdsourcing and incentivization methodology. We hope that this will drive improvements in the design of algorithms that can account for the intricacies of the dataset and thereby enable a step toward effective learning in real-world settings.
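The core of any demonstration-collection pipeline for Gym-like environments is recording (observation, action, reward) transitions as an episode is played. The sketch below illustrates that idea only; it is not the CrowdPlay implementation. It assumes classic Gym (pre-0.26, 4-tuple step API) with the Atari extras installed, and the function name `record_episode` and the random-action stand-in for human input are hypothetical.

```python
# Minimal sketch of recording one episode of demonstration data
# from a Gym Atari environment (assumes `gym[atari]` and ROMs installed).
import gym


def record_episode(env_id: str = "Breakout-v4"):
    env = gym.make(env_id)
    obs = env.reset()
    trajectory = []  # list of (observation, action, reward, done) tuples
    done = False
    while not done:
        # In a crowdsourcing pipeline the action would come from a human
        # player (e.g. keyboard input streamed from a browser), not sampled.
        action = env.action_space.sample()
        next_obs, reward, done, info = env.step(action)
        trajectory.append((obs, action, reward, done))
        obs = next_obs
    env.close()
    return trajectory


if __name__ == "__main__":
    demo = record_episode()
    print(f"Recorded {len(demo)} transitions")
```

Such per-episode trajectories, aggregated across many crowdworkers and games, are the kind of raw material that imitation learning and offline RL benchmarks consume.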
