no code implementations • 13 Feb 2023 • Le Cong Dinh, Tri-Dung Nguyen, Alain Zemkoho, Long Tran-Thanh
We study online learning problems in which the learner has extra knowledge about the adversary's behaviour, i.e., game-theoretic settings where opponents typically follow some no-external-regret learning algorithm.
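As background for this setting, Hedge (multiplicative weights) is the textbook example of a no-external-regret algorithm. The sketch below is illustrative and not taken from the paper; the loss matrix is synthetic.

```python
import numpy as np

def hedge(loss_matrix, eta=0.1):
    """Multiplicative-weights (Hedge) update over T rounds.

    loss_matrix: (T, n) array with losses in [0, 1];
    loss_matrix[t, i] is the loss of action i at round t.
    Returns the sequence of mixed strategies played.
    """
    T, n = loss_matrix.shape
    weights = np.ones(n)
    plays = []
    for t in range(T):
        p = weights / weights.sum()               # current mixed strategy
        plays.append(p)
        weights *= np.exp(-eta * loss_matrix[t])  # down-weight lossy actions
    return np.array(plays)

# External regret: learner's cumulative loss minus that of the best
# fixed action in hindsight; for Hedge it grows only logarithmically in
# the number of actions and sublinearly in T.
rng = np.random.default_rng(0)
losses = rng.random((500, 3))
plays = hedge(losses)
learner_loss = (plays * losses).sum()
best_fixed = losses.sum(axis=0).min()
regret = learner_loss - best_fixed
```

The standard guarantee is regret at most $\ln(n)/\eta + \eta T/8$ for losses in $[0,1]$, which motivates studying opponents that follow such dynamics.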
no code implementations • 7 Oct 2021 • Le Cong Dinh, David Henry Mguni, Long Tran-Thanh, Jun Wang, Yaodong Yang
In this setting, we first demonstrate that MDP-Expert, an existing algorithm that works well with oblivious adversaries, can still be applied and achieves a policy regret bound of $\mathcal{O}(\sqrt{T \log(L)}+\tau^2\sqrt{T \log(|A|)})$, where $L$ is the size of the adversary's pure strategy set and $|A|$ denotes the size of the agent's action space.
1 code implementation • 13 Mar 2021 • Le Cong Dinh, Yaodong Yang, Stephen McAleer, Zheng Tian, Nicolas Perez Nieves, Oliver Slumbers, David Henry Mguni, Haitham Bou Ammar, Jun Wang
Solving strategic games with huge action spaces is a critical yet under-explored topic in economics, operations research and artificial intelligence.
no code implementations • 22 Jul 2020 • Le Cong Dinh, Nick Bishop, Long Tran-Thanh
We investigate a repeated two-player zero-sum game setting where the column player is also the designer of the system, and has full control over the design of the payoff matrix.
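A toy version of this design problem can be sketched as follows; the candidate matrices and the simulation-based evaluation are illustrative assumptions, not the paper's mechanism. The designer scores each candidate payoff matrix by simulating the repeated game with both sides running multiplicative-weights updates, whose time-averaged strategies approximate a minimax pair, and then picks the matrix worst for the row player.

```python
import numpy as np

def simulate_value(A, rounds=4000, eta=0.05):
    """Approximate the value of the zero-sum game with row-payoff matrix A
    by averaging multiplicative-weights self-play."""
    m, n = A.shape
    wr, wc = np.ones(m), np.ones(n)
    x_bar, y_bar = np.zeros(m), np.zeros(n)
    for _ in range(rounds):
        x, y = wr / wr.sum(), wc / wc.sum()
        x_bar += x; y_bar += y
        wr *= np.exp(eta * (A @ y))    # row player maximizes x^T A y
        wc *= np.exp(-eta * (x @ A))   # column player minimizes it
        wr /= wr.max(); wc /= wc.max() # renormalize to avoid overflow
    x_bar /= rounds; y_bar /= rounds
    return x_bar @ A @ y_bar

# Hypothetical candidate designs available to the column player:
candidates = [np.array([[1., -1.], [-1., 1.]]),   # matching pennies: value 0
              np.array([[2., -1.], [-1., 0.]])]   # value -1/4 for the row player
# The designer picks the payoff matrix that is worst for the row player.
design = min(candidates, key=simulate_value)
```

In the actual paper the design space and the row player's learning dynamics are studied formally; the point of the sketch is only that the matrix itself becomes a decision variable for the column player.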