no code implementations • 17 Jan 2024 • Zhirui Chen, P. N. Karthik, Yeow Meng Chee, Vincent Y. F. Tan
We study best arm identification (BAI) in linear bandits in the fixed-budget regime under differential privacy constraints, when the arm rewards are supported on the unit interval.
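The fixed-budget regime described above fixes the total number of arm pulls in advance and asks for the arm most likely to be best once the budget is exhausted. As a point of reference only, here is a minimal sketch of sequential halving, a standard (non-private) fixed-budget BAI baseline with Bernoulli rewards supported on [0, 1]; it is not the differentially private algorithm studied in the paper, and the function name and parameters are illustrative.

```python
import math
import random

def sequential_halving(means, budget, seed=0):
    """Generic fixed-budget BAI baseline: repeatedly sample all surviving
    arms equally, then eliminate the empirically worse half.
    Rewards are Bernoulli(means[a]), hence supported on [0, 1]."""
    rng = random.Random(seed)
    arms = list(range(len(means)))
    rounds = max(1, math.ceil(math.log2(len(arms))))
    for _ in range(rounds):
        if len(arms) == 1:
            break
        # Split the budget evenly across rounds and surviving arms.
        pulls = max(1, budget // (len(arms) * rounds))
        est = {}
        for a in arms:
            est[a] = sum(rng.random() < means[a] for _ in range(pulls)) / pulls
        # Keep the empirically better half.
        arms.sort(key=lambda a: est[a], reverse=True)
        arms = arms[: max(1, len(arms) // 2)]
    return arms[0]
```

A differentially private variant would additionally perturb the empirical means (e.g. with calibrated noise) before each elimination step; the sketch above omits that entirely.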
no code implementations • 20 Oct 2023 • P. N. Karthik, Vincent Y. F. Tan, Arpan Mukherjee, Ali Tajer
It is shown that under every policy, the state-action visitation proportions satisfy a specific approximate flow conservation constraint, and that under any asymptotically optimal policy, these proportions match the optimal proportions dictated by the lower bound.
no code implementations • 10 May 2023 • Kota Srinivas Reddy, P. N. Karthik, Nikhil Karamchandani, Jayakrishnan Nair
The pulled arm and its instantaneous reward are revealed to the learner, whose goal is to find the best arm by minimising the expected stopping time, subject to an upper bound on the error probability.
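The setup above is the fixed-confidence regime: the learner samples until it can declare the best arm with error probability at most a prescribed level. A minimal illustrative sketch, assuming Bernoulli rewards, uniform sampling, and a Hoeffding-style stopping rule; this is a generic elimination-style recipe, not the policies analysed in the paper, and all names and constants are illustrative.

```python
import math
import random

def fixed_confidence_bai(means, delta=0.05, seed=1, max_pulls=100000):
    """Pull every arm once per round; stop as soon as the empirically best
    arm's lower confidence bound clears every other arm's upper bound.
    Returns (declared best arm, total number of pulls)."""
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k
    sums = [0.0] * k
    t = 0
    while t < max_pulls:
        for a in range(k):
            sums[a] += float(rng.random() < means[a])
            counts[a] += 1
            t += 1
        est = [sums[a] / counts[a] for a in range(k)]
        # Anytime Hoeffding-style confidence radius (illustrative constants).
        rad = [math.sqrt(math.log(4 * k * counts[a] ** 2 / delta)
                         / (2 * counts[a])) for a in range(k)]
        best = max(range(k), key=lambda a: est[a])
        if all(est[best] - rad[best] >= est[a] + rad[a]
               for a in range(k) if a != best):
            return best, t
    return max(range(k), key=lambda a: sums[a] / counts[a]), t
```

Lower-bound-matching policies track instance-optimal sampling proportions rather than sampling uniformly; the uniform rule here only illustrates the stop-when-confident structure.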
no code implementations • 14 Oct 2022 • Zhirui Chen, P. N. Karthik, Vincent Y. F. Tan, Yeow Meng Chee
Furthermore, we show that for any algorithm whose upper bound on the expected stopping time matches the lower bound up to a multiplicative constant (an almost-optimal algorithm), the ratio of any two consecutive communication time instants must be bounded, a result that is of independent interest.
no code implementations • 19 Aug 2022 • Kota Srinivas Reddy, P. N. Karthik, Vincent Y. F. Tan
The local best arm at a client is the arm with the largest mean among the arms local to the client, whereas the global best arm is the arm with the largest average mean across all the clients.
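The distinction between local and global best arms in this federated setting is concrete enough to compute directly. A small sketch with hypothetical names, assuming `means[c][a]` holds the mean reward of arm `a` at client `c` and that the arm set is shared across clients:

```python
def local_best(means, client):
    """Arm with the largest mean at a single client."""
    row = means[client]
    return max(range(len(row)), key=lambda a: row[a])

def global_best(means):
    """Arm with the largest mean averaged across all clients."""
    n_arms = len(means[0])
    avg = [sum(row[a] for row in means) / len(means) for a in range(n_arms)]
    return max(range(n_arms), key=lambda a: avg[a])
```

Note that the global best arm need not be the local best arm at any client, which is what makes the federated identification problem non-trivial: for `means = [[0.9, 0.5, 0.1], [0.1, 0.6, 0.7]]`, the local bests are arms 0 and 2, while the global best is arm 1.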
no code implementations • 29 Mar 2022 • P. N. Karthik, Kota Srinivas Reddy, Vincent Y. F. Tan
For this problem, we derive the first known problem-instance-dependent asymptotic lower bound on the growth rate of the expected time required to find the index of the best arm, where the asymptotic regime is that of a vanishing error probability.
no code implementations • 8 May 2021 • P. N. Karthik, Rajesh Sundaresan
This paper studies the problem of finding an anomalous arm in a multi-armed bandit when (a) each arm is a finite-state Markov process, and (b) the arms are restless.
no code implementations • 13 May 2020 • P. N. Karthik, Rajesh Sundaresan
The state space is common across the arms, and the arms are independent of each other.
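"Restless" means each arm's Markov chain keeps evolving even when the arm is not pulled, so the learner only observes the state of the arm it selects at each time. A minimal simulation sketch of this observation model, with illustrative names and a round-robin policy; it is not the detection procedure of the paper.

```python
import random

def step(state, P, rng):
    """One transition of a finite-state Markov chain with row-stochastic P."""
    u, acc = rng.random(), 0.0
    for nxt, p in enumerate(P[state]):
        acc += p
        if u < acc:
            return nxt
    return len(P) - 1  # guard against floating-point rounding

def simulate_restless(transitions, horizon, policy, seed=0):
    """Restless bandit observation model: ALL arms' states evolve at every
    time step, but only the pulled arm's state is revealed to the learner.
    Returns the list of (pulled arm, observed state) pairs."""
    rng = random.Random(seed)
    states = [0] * len(transitions)
    observations = []
    for t in range(horizon):
        arm = policy(t)
        states = [step(s, P, rng) for s, P in zip(states, transitions)]
        observations.append((arm, states[arm]))
    return observations
```

The anomalous arm would be one whose transition matrix differs from the common one; detecting it is hard precisely because the unobserved arms' states drift between pulls.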