no code implementations • 26 Jun 2023 • Yuwei Luo, Mohsen Bayati
This methodology enables us to formulate an instance-dependent frequentist regret bound, which incorporates the geometric information, for a broad class of base algorithms, including Greedy, OFUL, and Thompson sampling.
no code implementations • 6 Nov 2021 • Yuwei Luo, Varun Gupta, Mladen Kolar
Under the assumption that a sequence of stabilizing, but potentially sub-optimal controllers is available for all $t$, we present an algorithm that achieves the optimal dynamic regret of $\tilde{\mathcal{O}}\left(V_T^{2/5}T^{3/5}\right)$.
no code implementations • ICML 2020 • Sen Na, Yuwei Luo, Zhuoran Yang, Zhaoran Wang, Mladen Kolar
We consider the bipartite graph and formalize its representation learning problem as a statistical estimation problem of parameters in a semiparametric exponential family distribution.
no code implementations • 14 Dec 2019 • Yuwei Luo, Zhuoran Yang, Zhaoran Wang, Mladen Kolar
Multi-agent reinforcement learning has been successfully applied to a number of challenging problems.
Multi-agent Reinforcement Learning, reinforcement-learning