Search Results for author: Daniel Vial

Found 9 papers, 1 paper with code

Collaborative Multi-Agent Heterogeneous Multi-Armed Bandits

no code implementations • 30 May 2023 • Ronshee Chawla, Daniel Vial, Sanjay Shakkottai, R. Srikant

The study of collaborative multi-agent bandits has attracted significant attention recently.

Multi-Armed Bandits

Minimax Regret for Cascading Bandits

no code implementations • 23 Mar 2022 • Daniel Vial, Sujay Sanghavi, Sanjay Shakkottai, R. Srikant

Cascading bandits is a natural and popular model that frames the task of learning to rank from Bernoulli click feedback in a bandit setting.

Learning-To-Rank

Robust Multi-Agent Bandits Over Undirected Graphs

no code implementations • 28 Feb 2022 • Daniel Vial, Sanjay Shakkottai, R. Srikant

Thus, we generalize existing regret bounds beyond the complete graph (where $d_{\text{mal}}(i) = m$), and show the effect of malicious agents is entirely local (in the sense that only the $d_{\text{mal}}(i)$ malicious agents directly connected to $i$ affect its long-term regret).

Improved Algorithms for Misspecified Linear Markov Decision Processes

no code implementations • 12 Sep 2021 • Daniel Vial, Advait Parulekar, Sanjay Shakkottai, R. Srikant

(P1) Its regret after $K$ episodes scales as $K \max \{ \varepsilon_{\text{mis}}, \varepsilon_{\text{tol}} \}$, where $\varepsilon_{\text{mis}}$ is the degree of misspecification and $\varepsilon_{\text{tol}}$ is a user-specified error tolerance.

Multi-Armed Bandits

Regret Bounds for Stochastic Shortest Path Problems with Linear Function Approximation

no code implementations • 4 May 2021 • Daniel Vial, Advait Parulekar, Sanjay Shakkottai, R. Srikant

We propose an algorithm that uses linear function approximation (LFA) for stochastic shortest path (SSP).

One-bit feedback is sufficient for upper confidence bound policies

no code implementations • 4 Dec 2020 • Daniel Vial, Sanjay Shakkottai, R. Srikant

We consider a variant of the traditional multi-armed bandit problem in which each arm is only able to provide one-bit feedback during each pull based on its past history of rewards.

Robust Multi-Agent Multi-Armed Bandits

no code implementations • 7 Jul 2020 • Daniel Vial, Sanjay Shakkottai, R. Srikant

Recent works have shown that agents facing independent instances of a stochastic $K$-armed bandit can collaborate to decrease regret.

Distributed Computing, Multi-Armed Bandits

Empirical Policy Evaluation with Supergraphs

no code implementations • 18 Feb 2020 • Daniel Vial, Vijay Subramanian

We devise and analyze algorithms for the empirical policy evaluation problem in reinforcement learning.

Reinforcement Learning (RL)

On the role of clustering in Personalized PageRank estimation

1 code implementation • 4 Jun 2017 • Daniel Vial, Vijay Subramanian

We then show that the common underlying graph can be leveraged to efficiently and jointly estimate PPR for many pairs, rather than treating each pair separately using the primitive algorithm.

Social and Information Networks
