Search Results for author: Ayush Sekhari

Found 21 papers, 3 papers with code

Offline Reinforcement Learning: Role of State Aggregation and Trajectory Data

no code implementations 25 Mar 2024 Zeyu Jia, Alexander Rakhlin, Ayush Sekhari, Chen-Yu Wei

We revisit the problem of offline reinforcement learning with value function realizability but without Bellman completeness.


Harnessing Density Ratios for Online Reinforcement Learning

no code implementations 18 Jan 2024 Philip Amortila, Dylan J. Foster, Nan Jiang, Ayush Sekhari, Tengyang Xie

The theories of offline and online reinforcement learning, despite having evolved in parallel, have begun to show signs of the possibility for a unification, with algorithms and analysis techniques for one setting often having natural counterparts in the other.

Offline RL reinforcement-learning

Offline Data Enhanced On-Policy Policy Gradient with Provable Guarantees

1 code implementation 14 Nov 2023 Yifei Zhou, Ayush Sekhari, Yuda Song, Wen Sun

In this work, we propose a new hybrid RL algorithm that combines an on-policy actor-critic method with offline data.

Offline RL

Contextual Bandits and Imitation Learning via Preference-Based Active Queries

no code implementations 24 Jul 2023 Ayush Sekhari, Karthik Sridharan, Wen Sun, Runzhe Wu

We consider the problem of contextual bandits and imitation learning, where the learner lacks direct knowledge of the executed action's reward.

Imitation Learning Multi-Armed Bandits

Ticketed Learning-Unlearning Schemes

no code implementations 27 Jun 2023 Badih Ghazi, Pritish Kamath, Ravi Kumar, Pasin Manurangsi, Ayush Sekhari, Chiyuan Zhang

Subsequently, given any subset of examples that wish to be unlearnt, the goal is to learn, without the knowledge of the original training dataset, a good predictor that is identical to the predictor that would have been produced when learning from scratch on the surviving examples.

Hidden Poison: Machine Unlearning Enables Camouflaged Poisoning Attacks

1 code implementation NeurIPS 2023 Jimmy Z. Di, Jack Douglas, Jayadev Acharya, Gautam Kamath, Ayush Sekhari

We introduce camouflaged data poisoning attacks, a new attack vector that arises in the context of machine unlearning and other settings when model retraining may be induced.

Data Poisoning Machine Unlearning

Hybrid RL: Using Both Offline and Online Data Can Make RL Efficient

1 code implementation 13 Oct 2022 Yuda Song, Yifei Zhou, Ayush Sekhari, J. Andrew Bagnell, Akshay Krishnamurthy, Wen Sun

We consider a hybrid reinforcement learning setting (Hybrid RL), in which an agent has access to an offline dataset and the ability to collect experience via real-world online interaction.

Montezuma's Revenge Q-Learning

From Gradient Flow on Population Loss to Learning with Stochastic Gradient Descent

no code implementations 13 Oct 2022 Satyen Kale, Jason D. Lee, Chris De Sa, Ayush Sekhari, Karthik Sridharan

When these potentials further satisfy certain self-bounding properties, we show that they can be used to provide a convergence guarantee for Gradient Descent (GD) and SGD (even when the paths of GF and GD/SGD are quite far apart).


On the Complexity of Adversarial Decision Making

no code implementations 27 Jun 2022 Dylan J. Foster, Alexander Rakhlin, Ayush Sekhari, Karthik Sridharan

A central problem in online learning and decision making -- from bandits to reinforcement learning -- is to understand what modeling assumptions lead to sample-efficient learning guarantees.

Decision Making reinforcement-learning

Computationally Efficient PAC RL in POMDPs with Latent Determinism and Conditional Embeddings

no code implementations 24 Jun 2022 Masatoshi Uehara, Ayush Sekhari, Jason D. Lee, Nathan Kallus, Wen Sun

We show our algorithm's computational and statistical complexities scale polynomially with respect to the horizon and the intrinsic dimension of the feature on the observation space.

Guarantees for Epsilon-Greedy Reinforcement Learning with Function Approximation

no code implementations 19 Jun 2022 Christoph Dann, Yishay Mansour, Mehryar Mohri, Ayush Sekhari, Karthik Sridharan

This paper presents a theoretical analysis of such policies and provides the first regret and sample-complexity bounds for reinforcement learning with myopic exploration.

reinforcement-learning Reinforcement Learning (RL)

SGD: The Role of Implicit Regularization, Batch-size and Multiple-epochs

no code implementations NeurIPS 2021 Satyen Kale, Ayush Sekhari, Karthik Sridharan

We show that there is an SCO problem such that GD with any step size and number of iterations can only learn at a suboptimal rate: at least $\widetilde{\Omega}(1/n^{5/12})$.

Agnostic Reinforcement Learning with Low-Rank MDPs and Rich Observations

no code implementations NeurIPS 2021 Christoph Dann, Yishay Mansour, Mehryar Mohri, Ayush Sekhari, Karthik Sridharan

In this work, we consider the more realistic setting of agnostic RL with rich observation spaces and a fixed class of policies $\Pi$ that may not contain any near-optimal policy.

reinforcement-learning Reinforcement Learning (RL)

Neural Active Learning with Performance Guarantees

no code implementations NeurIPS 2021 Pranjal Awasthi, Christoph Dann, Claudio Gentile, Ayush Sekhari, Zhilei Wang

We investigate the problem of active learning in the streaming setting in non-parametric regimes, where the labels are stochastically generated from a class of functions on which we make no assumptions whatsoever.

Active Learning Model Selection

Second-Order Information in Non-Convex Stochastic Optimization: Power and Limitations

no code implementations 24 Jun 2020 Yossi Arjevani, Yair Carmon, John C. Duchi, Dylan J. Foster, Ayush Sekhari, Karthik Sridharan

We design an algorithm which finds an $\epsilon$-approximate stationary point (with $\|\nabla F(x)\|\le \epsilon$) using $O(\epsilon^{-3})$ stochastic gradient and Hessian-vector products, matching guarantees that were previously available only under a stronger assumption of access to multiple queries with the same random seed.

Second-order methods Stochastic Optimization

Reinforcement Learning with Feedback Graphs

no code implementations NeurIPS 2020 Christoph Dann, Yishay Mansour, Mehryar Mohri, Ayush Sekhari, Karthik Sridharan

We study episodic reinforcement learning in Markov decision processes when the agent receives additional feedback per step in the form of several transition observations.

reinforcement-learning Reinforcement Learning (RL)

The Complexity of Making the Gradient Small in Stochastic Convex Optimization

no code implementations 13 Feb 2019 Dylan J. Foster, Ayush Sekhari, Ohad Shamir, Nathan Srebro, Karthik Sridharan, Blake Woodworth

Notably, we show that in the global oracle/statistical learning model, only logarithmic dependence on smoothness is required to find a near-stationary point, whereas polynomial dependence on smoothness is necessary in the local stochastic oracle model.

Stochastic Optimization

Uniform Convergence of Gradients for Non-Convex Learning and Optimization

no code implementations NeurIPS 2018 Dylan J. Foster, Ayush Sekhari, Karthik Sridharan

We investigate 1) the rate at which refined properties of the empirical risk---in particular, gradients---converge to their population counterparts in standard non-convex learning tasks, and 2) the consequences of this convergence for optimization.

A Brief Study of In-Domain Transfer and Learning from Fewer Samples using A Few Simple Priors

no code implementations 13 Jul 2017 Marc Pickett, Ayush Sekhari, James Davidson

Domain knowledge can often be encoded in the structure of a network, such as convolutional layers for vision, which has been shown to increase generalization and decrease sample complexity, or the number of samples required for successful learning.
