Search Results for author: Gugan Thoppe

Found 13 papers, 0 papers with code

Global Convergence Guarantees for Federated Policy Gradient Methods with Adversaries

no code implementations • 15 Mar 2024 • Swetha Ganesh, Jiayu Chen, Gugan Thoppe, Vaneet Aggarwal

Federated Reinforcement Learning (FRL) allows multiple agents to collaboratively build a decision-making policy without sharing raw trajectories.

Decision Making • Policy Gradient Methods


Risk Estimation in a Markov Cost Process: Lower and Upper Bounds

no code implementations • 17 Oct 2023 • Gugan Thoppe, L. A. Prashanth, Sanjay Bhat

To the best of our knowledge, our work is the first to provide lower and upper bounds for estimating any risk measure beyond the mean within a Markovian setting.


Online Learning with Adversaries: A Differential-Inclusion Analysis

no code implementations • 4 Apr 2023 • Swetha Ganesh, Alexandre Reiffers-Masson, Gugan Thoppe

Our main result is that the proposed algorithm almost surely converges to the desired mean $\mu$. This makes ours the first asynchronous FL method to have an a.s. convergence guarantee in the presence of adversaries.

Federated Learning


SoftTreeMax: Exponential Variance Reduction in Policy Gradient via Tree Search

no code implementations • 30 Jan 2023 • Gal Dalal, Assaf Hallak, Gugan Thoppe, Shie Mannor, Gal Chechik

We prove that the resulting variance decays exponentially with the planning horizon as a function of the expansion policy.

Policy Gradient Methods


Improving Sample Efficiency in Evolutionary RL Using Off-Policy Ranking

no code implementations • 22 Aug 2022 • Eshwar S R, Shishir Kolathaya, Gugan Thoppe

This leads to a lot of wasteful interactions since, once the ranking is done, only the data associated with the top-ranked policies is used for subsequent learning.

Reinforcement Learning (RL)


Demystifying Approximate Value-based RL with $ε$-greedy Exploration: A Differential Inclusion View

no code implementations • 26 May 2022 • Aditya Gopalan, Gugan Thoppe

Q-learning and SARSA with $\epsilon$-greedy exploration are leading reinforcement learning methods.

Q-Learning • reinforcement-learning • +1


Does Momentum Help? A Sample Complexity Analysis

no code implementations • 29 Oct 2021 • Swetha Ganesh, Rohan Deb, Gugan Thoppe, Amarjit Budhiraja

Stochastic Heavy Ball (SHB) and Nesterov's Accelerated Stochastic Gradient (ASG) are popular momentum methods in stochastic optimization.
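
As a rough illustration of the kind of momentum method named above, here is a minimal sketch of a Stochastic Heavy Ball iteration on a toy one-dimensional quadratic. The objective, the constant stepsize and momentum values, the Gaussian noise model, and the function name `stochastic_heavy_ball` are all assumptions made for this example; they are not taken from the paper.

```python
import random

def stochastic_heavy_ball(grad, x0, step=0.05, momentum=0.9,
                          noise_std=0.1, iters=2000, seed=0):
    """Toy Stochastic Heavy Ball loop (illustrative parameters, not the paper's)."""
    rng = random.Random(seed)
    x_prev, x = x0, x0
    for _ in range(iters):
        # Noisy gradient: true gradient plus zero-mean Gaussian noise (assumed model).
        g = grad(x) + rng.gauss(0.0, noise_std)
        # Heavy-ball step: gradient step plus a momentum term (x - x_prev).
        x_next = x - step * g + momentum * (x - x_prev)
        x_prev, x = x, x_next
    return x

# Hypothetical objective f(x) = 0.5 * (x - 3)^2, whose gradient is x - 3.
x_final = stochastic_heavy_ball(lambda x: x - 3.0, x0=0.0)
print(x_final)  # ends near the minimizer x = 3, up to noise
```

With a constant stepsize the iterate does not converge exactly; it fluctuates in a small neighbourhood of the minimizer whose size depends on the stepsize, momentum, and noise level.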

Stochastic Optimization


A Law of Iterated Logarithm for Multi-Agent Reinforcement Learning

no code implementations • NeurIPS 2021 • Gugan Thoppe, Bhumesh Kumar

In Multi-Agent Reinforcement Learning (MARL), multiple agents interact with a common environment, as well as with each other, to solve a shared problem in sequential decision-making.

Decision Making • Multi-agent Reinforcement Learning • +2


Online Algorithms for Estimating Change Rates of Web Pages

no code implementations • 17 Sep 2020 • Konstantin Avrachenkov, Kishor Patil, Gugan Thoppe

We provide three novel schemes for online estimation of page change rates, all of which have extremely low running times per iteration.

Management


Change Rate Estimation and Optimal Freshness in Web Page Crawling

no code implementations • 5 Apr 2020 • Konstantin Avrachenkov, Kishor Patil, Gugan Thoppe

Specifically, we provide two novel schemes for online estimation of page change rates.


A Tale of Two-Timescale Reinforcement Learning with the Tightest Finite-Time Bound

no code implementations • 20 Nov 2019 • Gal Dalal, Balazs Szorenyi, Gugan Thoppe

Algorithms such as these have two iterates, $\theta_n$ and $w_n$, which are updated using two distinct stepsize sequences, $\alpha_n$ and $\beta_n$, respectively.
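
The two-iterate structure described above can be sketched on a toy problem. Everything specific here is an assumption for illustration: the linear drift functions, the Gaussian noise, the stepsize choices $\alpha_n = 1/(n+1)$ and $\beta_n = 1/(n+1)^{2/3}$ (so that $\theta_n$ is the slower iterate), and the function name `two_timescale_sa`. It is not the algorithm analysed in the paper.

```python
import random

def two_timescale_sa(iters=20000, noise_std=0.1, seed=0):
    """Toy two-timescale stochastic approximation (hypothetical linear example)."""
    rng = random.Random(seed)
    theta, w = 0.0, 0.0
    for n in range(iters):
        alpha_n = 1.0 / (n + 1)                 # stepsize for theta (slow, assumed)
        beta_n = 1.0 / (n + 1) ** (2.0 / 3.0)   # stepsize for w (fast, assumed)
        # Assumed drifts: w tracks 2 * theta, and theta moves until w equals 1.
        theta_drift = 1.0 - w
        w_drift = 2.0 * theta - w
        # Both iterates are updated from the same (theta_n, w_n) with noisy drifts.
        theta_next = theta + alpha_n * (theta_drift + rng.gauss(0.0, noise_std))
        w_next = w + beta_n * (w_drift + rng.gauss(0.0, noise_std))
        theta, w = theta_next, w_next
    return theta, w

theta_final, w_final = two_timescale_sa()
print(theta_final, w_final)  # the toy fixed point is theta = 0.5, w = 1.0
```

Because $\alpha_n / \beta_n \to 0$ in this sketch, $w_n$ sees $\theta_n$ as nearly constant and settles close to $2\theta_n$, while $\theta_n$ drifts slowly toward the point where that tracked value equals 1.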

reinforcement-learning • Reinforcement Learning (RL)


Finite Sample Analyses for TD(0) with Function Approximation

no code implementations • 4 Apr 2017 • Gal Dalal, Balázs Szörényi, Gugan Thoppe, Shie Mannor

TD(0) is one of the most commonly used algorithms in reinforcement learning.

reinforcement-learning • Reinforcement Learning (RL)


Finite Sample Analysis of Two-Timescale Stochastic Approximation with Applications to Reinforcement Learning

no code implementations • 15 Mar 2017 • Gal Dalal, Balazs Szorenyi, Gugan Thoppe, Shie Mannor

Using this, we provide a concentration bound, which is the first such result for a two-timescale SA.

reinforcement-learning • Reinforcement Learning (RL)

