no code implementations • 20 Feb 2023 • Muhammed O. Sayin, Onur Unlu
We present two logit-Q learning dynamics combining the classical and independent log-linear learning updates with an on-policy value iteration update for efficient learning in stochastic games.
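As a rough illustration of the two ingredients named above, the sketch below pairs a logit (softmax) action choice over Q-value estimates, which is the response rule used in log-linear learning, with a simple on-policy value update. This is a minimal sketch under stated assumptions, not the authors' dynamics: the function names, the `temperature`, `step_size`, and `discount` parameters, and the temporal-difference form of the value update are all hypothetical choices made for illustration.

```python
import math
import random


def logit_choice_probs(q_values, temperature):
    """Softmax (logit) distribution over actions given Q-value estimates.

    Assumption: `temperature` > 0 controls exploration; it is an
    illustrative parameter, not a name taken from the paper.
    """
    m = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def sample_action(q_values, temperature, rng=random):
    """Draw an action index according to the logit distribution."""
    probs = logit_choice_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


def value_update(v, state, reward, next_state, step_size, discount):
    """Move v[state] toward reward + discount * v[next_state].

    Assumption: a temporal-difference-style step along the visited
    trajectory stands in for the on-policy value iteration update.
    """
    target = reward + discount * v[next_state]
    v[state] += step_size * (target - v[state])
    return v
```

A lower `temperature` concentrates probability on the action with the highest Q-value estimate, while a higher one spreads it more evenly across actions.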