Search Results for author: Abishek Sankararaman

Found 13 papers, 0 papers with code

Online Heavy-tailed Change-point detection

no code implementations • 15 Jun 2023 • Abishek Sankararaman, Balakrishnan Narayanaswamy

We derive guarantees on worst-case, finite-sample false-positive rate (FPR) over the family of all distributions with bounded second moment.

Change Point Detection

Predict-and-Critic: Accelerated End-to-End Predictive Control for Cloud Computing through Reinforcement Learning

no code implementations • 2 Dec 2022 • Kaustubh Sridhar, Vikramank Singh, Balakrishnan Narayanaswamy, Abishek Sankararaman

PnC jointly trains a prediction model and a terminal Q function that approximates cost-to-go over a long horizon, by back-propagating the cost of decisions through the optimization problem and from the future.

Cloud Computing, Model Predictive Control

Decentralized Competing Bandits in Non-Stationary Matching Markets

no code implementations • 31 May 2022 • Avishek Ghosh, Abishek Sankararaman, Kannan Ramchandran, Tara Javidi, Arya Mazumdar

We propose and analyze a decentralized and asynchronous learning algorithm, namely Decentralized Non-stationary Competing Bandits (DNCB), where the agents play (restrictive) successive-elimination-type learning algorithms to learn their preference over the arms.

Breaking the $\sqrt{T}$ Barrier: Instance-Independent Logarithmic Regret in Stochastic Contextual Linear Bandits

no code implementations • 19 May 2022 • Avishek Ghosh, Abishek Sankararaman

The (poly)logarithmic regret of LR-SCB stems from two crucial facts: (a) the application of a norm-adaptive algorithm to exploit the parameter estimation and (b) an analysis of the shifted linear contextual bandit algorithm, showing that shifting results in increasing regret.

Multi-Armed Bandits

Model Selection for Generic Contextual Bandits

no code implementations • 7 Jul 2021 • Avishek Ghosh, Abishek Sankararaman, Kannan Ramchandran

We consider the problem of model selection for the general stochastic contextual bandits under the realizability assumption.

Model Selection, Multi-Armed Bandits

Adaptive Clustering and Personalization in Multi-Agent Stochastic Linear Bandits

no code implementations • 15 Jun 2021 • Avishek Ghosh, Abishek Sankararaman, Kannan Ramchandran

We show that, for any agent, the regret scales as $\mathcal{O}(\sqrt{T/N})$ if the agent is in a 'well separated' cluster, or as $\mathcal{O}(T^{\frac{1}{2} + \varepsilon}/N^{\frac{1}{2} - \varepsilon})$ if its cluster is not well separated, where $\varepsilon$ is positive and arbitrarily close to $0$.

Clustering

Beyond $\log^2(T)$ Regret for Decentralized Bandits in Matching Markets

no code implementations • 12 Mar 2021 • Soumya Basu, Karthik Abinav Sankararaman, Abishek Sankararaman

We design decentralized algorithms for regret minimization in the two-sided matching market with one-sided bandit feedback that significantly improve upon prior work (Liu et al. 2020a, 2020b; Sankararaman et al. 2020).

Multi-Agent Low-Dimensional Linear Bandits

no code implementations • 2 Jul 2020 • Ronshee Chawla, Abishek Sankararaman, Sanjay Shakkottai

We study a multi-agent stochastic linear bandit with side information, parameterized by an unknown vector $\theta^* \in \mathbb{R}^d$.

Dominate or Delete: Decentralized Competing Bandits in Serial Dictatorship

no code implementations • 26 Jun 2020 • Abishek Sankararaman, Soumya Basu, Karthik Abinav Sankararaman

Online learning in a two-sided matching market, with demand-side agents continuously competing to be matched with the supply side (arms), abstracts the complex interactions under partial information on matching platforms (e.g., UpWork, TaskRabbit).

The Gossiping Insert-Eliminate Algorithm for Multi-Agent Bandits

no code implementations • 15 Jan 2020 • Ronshee Chawla, Abishek Sankararaman, Ayalvadi Ganesh, Sanjay Shakkottai

Agents use the communication medium to recommend only arm-IDs (not samples), and thus update the set of arms from which they play.

ComHapDet: A Spatial Community Detection Algorithm for Haplotype Assembly

no code implementations • 27 Nov 2019 • Abishek Sankararaman, Haris Vikalo, François Baccelli

Results: We propose a novel graphical representation of sequencing reads and pose the haplotype assembly problem as an instance of community detection on a spatial random graph.

Community Detection

Social Learning in Multi Agent Multi Armed Bandits

no code implementations • 4 Oct 2019 • Abishek Sankararaman, Ayalvadi Ganesh, Sanjay Shakkottai

Our setting consists of a large number of agents $n$ that collaboratively and simultaneously solve the same instance of a $K$-armed MAB to minimize the average cumulative regret over all agents.

Multi-Armed Bandits
