A Sleeping, Recovering Bandit Algorithm for Optimizing Recurring Notifications

Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining 2020 · Kevin P. Yancey, Burr Settles ·

Many online and mobile applications rely on daily emails and push notifications to increase and maintain user engagement. The multi-armed bandit approach provides a useful framework for optimizing the content of these notifications, but a number of complications (such as novelty effects and conditional eligibility) make conventional bandit algorithms unsuitable in practice. In this paper, we introduce the Recovering Difference Softmax Algorithm to address the particular challenges of this problem domain, and use it to successfully optimize millions of daily reminders for the online language-learning app Duolingo. This lead to a 0.5%. increase in total daily active users (DAUs) and a 2%, increase in new user retention over a strong baseline. We provide technical details of its design and deployment, and demonstrate its efficacy through both offline and online evaluation experiments.

PDF Abstract