23 Jul 2014 • Panayotis Mertikopoulos, William H. Sandholm
We investigate a class of reinforcement learning dynamics in which players adjust their strategies based on the cumulative payoffs of their actions over time. Specifically, each player plays a mixed strategy that maximizes the expected cumulative payoff minus a regularization term.
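To make the regularized maximization concrete, the following is a minimal sketch for a single player, under assumptions that are not stated in the text above: the regularizer is taken to be the negative entropy `sum(x_i * log(x_i))` scaled by `1/eta`, and time is discretized into steps. With that choice, the maximizer over the simplex has the well-known closed form of a softmax of the cumulative payoff scores. The names `entropic_choice`, `run_learning`, `payoff_fn`, and `eta` are hypothetical and chosen only for illustration.

```python
import math


def entropic_choice(scores, eta=1.0):
    """Mixed strategy maximizing <x, scores> - (1/eta) * sum(x_i * log(x_i)).

    Assumption: an entropic regularizer, for which the maximizer over the
    simplex is softmax(eta * scores).
    """
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(eta * (s - top)) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def run_learning(payoff_fn, n_actions, steps, eta=1.0):
    """Discrete-time sketch: accumulate payoffs, then re-solve the
    regularized problem at every step."""
    scores = [0.0] * n_actions           # cumulative payoff of each action
    strategy = entropic_choice(scores, eta)
    for _ in range(steps):
        payoffs = payoff_fn(strategy)    # payoff of each action this step
        scores = [s + p for s, p in zip(scores, payoffs)]
        strategy = entropic_choice(scores, eta)
    return strategy


# Illustrative payoffs (hypothetical): action 0 always pays 1, action 1 pays 0.
final_strategy = run_learning(lambda x: [1.0, 0.0], n_actions=2, steps=10)
```

In this toy run the strategy starts uniform and shifts its weight toward the action with the higher cumulative payoff. Other regularizers would change the choice map but not the overall accumulate-then-maximize structure.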