no code implementations • 15 Mar 2024 • Swetha Ganesh, Jiayu Chen, Gugan Thoppe, Vaneet Aggarwal
Federated Reinforcement Learning (FRL) allows multiple agents to collaboratively build a decision-making policy without sharing raw trajectories.
no code implementations • 17 Oct 2023 • Gugan Thoppe, L. A. Prashanth, Sanjay Bhat
To the best of our knowledge, our work is the first to provide lower and upper bounds for estimating any risk measure beyond the mean within a Markovian setting.
no code implementations • 4 Apr 2023 • Swetha Ganesh, Alexandre Reiffers-Masson, Gugan Thoppe
Our main result is that the proposed algorithm almost surely converges to the desired mean $\mu$. This makes ours the first asynchronous FL method to have an a.s. convergence guarantee in the presence of adversaries.
no code implementations • 30 Jan 2023 • Gal Dalal, Assaf Hallak, Gugan Thoppe, Shie Mannor, Gal Chechik
We prove that the resulting variance decays exponentially with the planning horizon as a function of the expansion policy.
no code implementations • 22 Aug 2022 • Eshwar S R, Shishir Kolathaya, Gugan Thoppe
This leads to a lot of wasteful interactions since, once the ranking is done, only the data associated with the top-ranked policies is used for subsequent learning.
no code implementations • 26 May 2022 • Aditya Gopalan, Gugan Thoppe
Q-learning and SARSA with $\epsilon$-greedy exploration are leading reinforcement learning methods.
no code implementations • 29 Oct 2021 • Swetha Ganesh, Rohan Deb, Gugan Thoppe, Amarjit Budhiraja
Stochastic Heavy Ball (SHB) and Nesterov's Accelerated Stochastic Gradient (ASG) are popular momentum methods in stochastic optimization.
no code implementations • NeurIPS 2021 • Gugan Thoppe, Bhumesh Kumar
In Multi-Agent Reinforcement Learning (MARL), multiple agents interact with a common environment, as well as with each other, to solve a shared problem in sequential decision-making.
no code implementations • 17 Sep 2020 • Konstantin Avrachenkov, Kishor Patil, Gugan Thoppe
We provide three novel schemes for online estimation of page change rates, all of which have extremely low running times per iteration.
no code implementations • 5 Apr 2020 • Konstantin Avrachenkov, Kishor Patil, Gugan Thoppe
Specifically, we provide two novel schemes for online estimation of page change rates.
no code implementations • 20 Nov 2019 • Gal Dalal, Balazs Szorenyi, Gugan Thoppe
Algorithms such as these have two iterates, $\theta_n$ and $w_n$, which are updated using two distinct stepsize sequences, $\alpha_n$ and $\beta_n$, respectively.
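The two-iterate structure described in this abstract can be illustrated with a minimal sketch. Everything beyond the symbols $\theta_n$, $w_n$, $\alpha_n$ and $\beta_n$ is an assumption made for illustration and is not taken from the paper: the toy linear drift terms, the Gaussian noise, the stepsize choices $\alpha_n = 1/n$ and $\beta_n = n^{-0.6}$ (which make $w_n$ the faster iterate), and the function name `two_timescale_sa`.

```python
import random


def two_timescale_sa(num_steps=20000, seed=0):
    """Toy two-timescale stochastic approximation (illustrative only).

    Hypothetical drift terms, chosen so the fixed point is theta = w = 1:
      slow iterate: theta_{n+1} = theta_n + alpha_n * (1 - w_n + noise)
      fast iterate: w_{n+1}     = w_n     + beta_n  * (theta_n - w_n + noise)
    """
    rng = random.Random(seed)
    theta, w = 0.0, 0.0
    for n in range(1, num_steps + 1):
        alpha_n = 1.0 / n        # assumed slow stepsize, drives theta_n
        beta_n = 1.0 / n ** 0.6  # assumed fast stepsize, drives w_n
        h = 1.0 - w + rng.gauss(0.0, 0.1)    # noisy drift for theta_n
        g = theta - w + rng.gauss(0.0, 0.1)  # noisy drift for w_n
        # Both iterates are updated simultaneously from the old values.
        theta, w = theta + alpha_n * h, w + beta_n * g
    return theta, w


if __name__ == "__main__":
    theta, w = two_timescale_sa()
    print(f"theta = {theta:.3f}, w = {w:.3f}")
```

Because $\beta_n$ decays more slowly than $\alpha_n$ in this sketch, $w_n$ tracks the current $\theta_n$ quickly while $\theta_n$ drifts slowly, so both iterates settle near 1 in this toy problem.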
no code implementations • 4 Apr 2017 • Gal Dalal, Balázs Szörényi, Gugan Thoppe, Shie Mannor
TD(0) is one of the most commonly used algorithms in reinforcement learning.
no code implementations • 15 Mar 2017 • Gal Dalal, Balazs Szorenyi, Gugan Thoppe, Shie Mannor
Using this, we provide a concentration bound, which is the first such result for a two-timescale SA.