Selfish optimization and collective learning in populations

15 Nov 2021  ·  Alex McAvoy, Yoichiro Mori, Joshua B. Plotkin

A selfish learner seeks to maximize their own success, disregarding others. When success is measured as payoff in a game played against another learner, mutual selfishness typically fails to produce the optimal outcome for a pair of individuals. However, learners often operate in populations, and each learner may have a limited duration of interaction with any other individual. Here, we compare selfish learning in stable pairs to selfish learning with stochastic encounters in a population. We study gradient-based optimization in repeated games like the prisoner's dilemma, which feature multiple Nash equilibria, many of which are suboptimal. We find that myopic, selfish learning, when distributed in a population via ephemeral encounters, can reverse the dynamics that occur in stable pairs. In particular, when there is flexibility in partner choice, selfish learning in large populations can produce optimal payoffs in repeated social dilemmas. This result holds for the entire population, not just for a small subset of individuals. Furthermore, as the population size grows, the timescale to reach the optimal population payoff remains finite in the number of learning steps per individual. While it is not universally true that interacting with many partners in a population improves outcomes, this form of collective learning achieves optimality for several important classes of social dilemmas. We conclude that naïve learning can be surprisingly effective in populations of individuals navigating conflicts of interest.
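
To make "gradient-based optimization in repeated games" concrete, the following is a minimal sketch of one selfish learning step in a stable pair. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the strategy space, so memory-one strategies, a discount factor `DELTA`, donation-game payoffs with benefit `B` and cost `C`, and finite-difference gradients are all choices made here, and the function names are hypothetical.

```python
import numpy as np

# Assumed parameters (not from the paper): donation game with benefit B,
# cost C, and continuation probability DELTA.
B, C, DELTA = 2.0, 1.0, 0.9

# Payoffs to the focal player in states (own, partner) = CC, CD, DC, DD.
U = np.array([B - C, -C, B, 0.0])

# The partner sees each state with the roles swapped: CD <-> DC.
SWAP = [0, 2, 1, 3]


def payoff(p, q, delta=DELTA):
    """Discounted payoff to a player using p against a partner using q.

    Each strategy is (p0, pCC, pCD, pDC, pDD): the initial cooperation
    probability, then the cooperation probability after each outcome,
    written from the player's own point of view.
    """
    p, q = np.asarray(p, float), np.asarray(q, float)
    px = p[1:]            # focal player's response in each state
    qy = q[1:][SWAP]      # partner's response, re-indexed to focal states
    # Row s of M is the distribution over next states given current state s.
    M = np.column_stack([px * qy, px * (1 - qy),
                         (1 - px) * qy, (1 - px) * (1 - qy)])
    v0 = np.array([p[0] * q[0], p[0] * (1 - q[0]),
                   (1 - p[0]) * q[0], (1 - p[0]) * (1 - q[0])])
    # Normalized discounted time spent in each state.
    occupancy = (1 - delta) * v0 @ np.linalg.inv(np.eye(4) - delta * M)
    return float(occupancy @ U)


def gradient(p, q, eps=1e-6):
    """Central finite-difference gradient of the focal payoff w.r.t. p."""
    p = np.asarray(p, float)
    g = np.zeros_like(p)
    for i in range(p.size):
        step = np.zeros_like(p)
        step[i] = eps
        g[i] = (payoff(p + step, q) - payoff(p - step, q)) / (2 * eps)
    return g


def selfish_step(p, q, lr=0.1):
    """One myopic update: ascend own payoff, ignoring the partner's payoff."""
    return np.clip(np.asarray(p, float) + lr * gradient(p, q), 0.0, 1.0)
```

In this sketch, a learner facing an unconditional cooperator has a non-positive gradient in every coordinate, so a selfish step can only lower its cooperation probabilities. That is the kind of pairwise dynamic the abstract contrasts with learning across stochastic encounters in a population; modeling those encounters would require further assumptions about partner choice that the abstract does not state.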
