no code implementations • 9 Jun 2023 • Xiaotong Cheng, Setareh Maghsudi
We study a structured multi-agent multi-armed bandit (MAMAB) problem in a dynamic environment.
no code implementations • 30 May 2023 • Haniyeh Barghi, Xiaotong Cheng, Setareh Maghsudi
We present a novel approach to the multi-agent sparse contextual linear bandit problem, in which the feature vectors have a high dimension $d$ whereas the reward function depends on only a small subset of $s_0 \ll d$ features.
no code implementations • 28 Mar 2022 • Xiaotong Cheng, Setareh Maghsudi
One strategy, namely bandit gradient ascent with momentum, is an online convex optimization algorithm with bandit feedback.