1 code implementation • 10 Apr 2024 • Ayush Sawarni, Nirjhar Das, Siddharth Barman, Gaurav Sinha
For our batch learning algorithm B-GLinCB, with $\Omega\left( \log{\log T} \right)$ batches, the regret scales as $\tilde{O}(\sqrt{T})$.
no code implementations • 8 May 2023 • Ayush Sawarni, Rahul Madhavan, Gaurav Sinha, Siddharth Barman
We study the causal bandit problem that entails identifying a near-optimal intervention from a specified set $A$ of (possibly non-atomic) interventions over a given causal graph.
no code implementations • 27 May 2022 • Siddharth Barman, Arindam Khan, Arnab Maiti, Ayush Sawarni
Since Nash social welfare (NSW) is known to satisfy fairness axioms, our approach complements the utilitarian considerations of average (cumulative) regret, wherein the algorithm is evaluated via the arithmetic mean of its expected rewards.
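To make the contrast between the two evaluation criteria concrete, here is a minimal sketch that scores the same sequence of per-round expected rewards in two ways: by its arithmetic mean (average regret) and by its geometric mean (a Nash-style regret). The function names, the example reward values, and the assumption that rewards lie in $[0, 1]$ are illustrative choices, not taken from the paper.

```python
import math


def average_regret(mu_star, expected_rewards):
    """Optimal mean minus the arithmetic mean of the expected rewards."""
    return mu_star - sum(expected_rewards) / len(expected_rewards)


def nash_regret(mu_star, expected_rewards):
    """Optimal mean minus the geometric mean of the expected rewards.

    Assumes every reward is non-negative (illustrative assumption).
    """
    if any(r < 0 for r in expected_rewards):
        raise ValueError("expected rewards must be non-negative")
    if any(r == 0 for r in expected_rewards):
        # A single zero-reward round drives the geometric mean to zero.
        return mu_star
    # Compute the geometric mean in log space for numerical stability.
    log_mean = sum(math.log(r) for r in expected_rewards) / len(expected_rewards)
    return mu_star - math.exp(log_mean)


# Hypothetical example: three good rounds and one very poor round.
mu_star = 0.9
rewards = [0.9, 0.9, 0.9, 0.01]

avg = average_regret(mu_star, rewards)
nash = nash_regret(mu_star, rewards)
print(f"average regret: {avg:.4f}")
print(f"Nash regret:    {nash:.4f}")
```

By the AM-GM inequality the geometric mean never exceeds the arithmetic mean, so the Nash-style regret is at least the average regret; a single poorly served round is penalized much more heavily under the geometric mean.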