no code implementations • 25 Aug 2022 • Yue Wu, Jesús A. De Loera
The Gröbner basis can be seen as a set of connecting moves (actions) of the game.
no code implementations • 12 Jun 2022 • Yue Wu, Jesús A. De Loera
GPI updates the policy of a single state by switching to an action that is mapped to the boundary of the value function polytope, followed by an immediate update of the value function.
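The update pattern in this snippet (change one state's action, then refresh the value function right away) can be illustrated with a minimal sketch. It is not the authors' algorithm: the geometric rule that maps actions to the boundary of the value function polytope is replaced here by a plain greedy one-step lookahead, and the array layout (`P[a, s, s']`, `R[s, a]`), the function names, and the two-state toy MDP are all assumptions made for illustration.

```python
import numpy as np


def evaluate(P, R, policy, gamma):
    """Exact policy evaluation: solve (I - gamma * P_pi) v = r_pi."""
    n = R.shape[0]
    idx = np.arange(n)
    P_pi = P[policy, idx]   # row s is the transition row P[policy[s], s, :]
    r_pi = R[idx, policy]   # reward collected in s under the policy's action
    return np.linalg.solve(np.eye(n) - gamma * P_pi, r_pi)


def single_state_sweep(P, R, gamma, policy):
    """Visit states one at a time; after each action switch, re-evaluate v."""
    policy = policy.copy()
    v = evaluate(P, R, policy, gamma)
    for s in range(R.shape[0]):
        # One-step lookahead values of every action in state s.
        q = R[s] + gamma * (P[:, s, :] @ v)
        best = int(np.argmax(q))
        if q[best] > q[policy[s]] + 1e-12:
            policy[s] = best
            # Immediate value update, so later states see the new policy.
            v = evaluate(P, R, policy, gamma)
    return policy, v


def solve(P, R, gamma, max_sweeps=100):
    """Repeat sweeps until no state changes its action."""
    policy = np.zeros(R.shape[0], dtype=int)
    v = evaluate(P, R, policy, gamma)
    for _ in range(max_sweeps):
        new_policy, v = single_state_sweep(P, R, gamma, policy)
        if np.array_equal(new_policy, policy):
            break
        policy = new_policy
    return policy, v


# Hypothetical toy MDP: action 0 stays put, action 1 moves to the other state.
P = np.array([
    [[1.0, 0.0], [0.0, 1.0]],   # action 0
    [[0.0, 1.0], [1.0, 0.0]],   # action 1
])
# Only staying in state 1 pays a reward.
R = np.array([[0.0, 0.0],
              [1.0, 0.0]])

policy, v = solve(P, R, gamma=0.9)
print(policy, v)
```

Re-solving the linear system after every switch is affordable only for small state spaces; it is used here to keep the "immediate update" step explicit.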