Temperature Regret Matching for Imperfect-Information Games
Counterfactual regret minimization (CFR) methods are effective for solving two-player zero-sum extensive-form games with imperfect information. Regret matching (RM) plays a crucial role in CFR and its variants in approaching a Nash equilibrium. In this paper, we present Temperature Regret Matching (TRM), a novel RM algorithm that adopts a different strategy. In addition, when updating the external regret at each iteration, we consider not only the opponent's strategy at the current iteration but also the opponent's strategies from the last several iterations. Furthermore, we show theoretically that the TRM update converges to a Nash equilibrium. Competitive results in imperfect-information games support its effectiveness and efficiency.
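For readers unfamiliar with the baseline, the following is a minimal sketch of standard regret matching, the building block the abstract refers to. It is not the TRM update, which the abstract does not specify. The function names `regret_matching` and `update_regrets` are illustrative assumptions, as is the simplified single-decision setting with explicit per-action values.

```python
from typing import List


def regret_matching(cumulative_regrets: List[float]) -> List[float]:
    """Standard regret matching (not TRM): play each action with
    probability proportional to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in cumulative_regrets]
    total = sum(positive)
    n = len(cumulative_regrets)
    if total <= 0.0:
        # No action has positive regret: fall back to the uniform strategy.
        return [1.0 / n] * n
    return [p / total for p in positive]


def update_regrets(
    cumulative_regrets: List[float],
    action_values: List[float],
    strategy: List[float],
) -> List[float]:
    """Add each action's instantaneous regret: its value minus the
    expected value of the strategy that was actually played."""
    expected = sum(p * v for p, v in zip(strategy, action_values))
    return [r + (v - expected) for r, v in zip(cumulative_regrets, action_values)]
```

In CFR, an update of this kind is applied at every information set, with counterfactual values in place of `action_values`; the average of the strategies over iterations is what approaches a Nash equilibrium in two-player zero-sum games.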