Search Results for author: Ayush Sekhari

Found 27 papers, 3 papers with code

GaussMark: A Practical Approach for Structural Watermarking of Language Models

no code implementations • 17 Jan 2025 • Adam Block, Ayush Sekhari, Alexander Rakhlin

In this work, we introduce a new scheme, GaussMark, that is simple and efficient to implement, has formal statistical guarantees on its efficacy, comes at no cost in generation latency, and embeds the watermark into the weights of the model itself, providing a structural watermark.
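
As a mental model only, the toy sketch below illustrates what a keyed structural perturbation of model weights can look like, paired with an assumed correlation-style detector. The function names, the detector, and all constants are my own illustration under those assumptions, not GaussMark's actual scheme or statistical test.

```python
import numpy as np

def watermark_weights(weights, key, sigma=1e-3):
    """Embed a keyed Gaussian perturbation into a weight tensor.

    Illustrative only: sigma trades watermark strength against model quality.
    """
    rng = np.random.default_rng(key)              # secret key seeds the noise
    return weights + sigma * rng.standard_normal(weights.shape)

def watermark_score(suspect, reference, key):
    """Toy detector: correlate the weight delta with the regenerated noise.

    Under the null (no watermark) the score is roughly N(0, 1); a watermarked
    model scores near sqrt(number of weights).
    """
    rng = np.random.default_rng(key)
    noise = rng.standard_normal(reference.shape).ravel()
    delta = (suspect - reference).ravel()
    corr = delta @ noise / (np.linalg.norm(delta) * np.linalg.norm(noise) + 1e-12)
    return corr * np.sqrt(delta.size)

# Usage on a toy weight matrix.
W = np.random.default_rng(7).standard_normal((64, 64))
W_marked = watermark_weights(W, key=42)
print(watermark_score(W_marked, W, key=42))   # ~64: watermark present
print(watermark_score(W + 1e-3, W, key=42))   # ~0: no watermark
```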

Random Latent Exploration for Deep Reinforcement Learning

no code implementations • 18 Jul 2024 • Srinath Mahankali, Zhang-Wei Hong, Ayush Sekhari, Alexander Rakhlin, Pulkit Agrawal

The ability to efficiently explore high-dimensional state spaces is essential for the practical success of deep Reinforcement Learning (RL).

Deep Reinforcement Learning, reinforcement-learning, +1

Langevin Dynamics: A Unified Perspective on Optimization via Lyapunov Potentials

no code implementations • 5 Jul 2024 • August Y. Chen, Ayush Sekhari, Karthik Sridharan

To our knowledge, the only strategy for showing global convergence of SGLD on the loss function is to show that SGLD can sample from a stationary distribution which assigns larger mass when the function is small (the Gibbs measure), and then to convert these guarantees to optimization results.
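
As a concrete reference point for the Gibbs-measure strategy described above, here is a minimal SGLD loop (a standard textbook form, assumed rather than taken from the paper): gradient steps plus Gaussian noise whose scale is set by the step size and an inverse temperature $\beta$, so the iterates approximately sample the Gibbs measure $\propto e^{-\beta F(x)}$, which for large $\beta$ concentrates near minimizers of $F$.

```python
import numpy as np

def sgld(grad_f, x0, eta=1e-3, beta=10.0, steps=10_000, seed=0):
    """Stochastic Gradient Langevin Dynamics.

    x_{t+1} = x_t - eta * grad F(x_t) + sqrt(2 * eta / beta) * xi_t,
    with xi_t ~ N(0, I). Sampling from exp(-beta * F) is what converts
    the stationary-distribution guarantee into an optimization guarantee.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        noise = rng.standard_normal(x.shape)
        x = x - eta * grad_f(x) + np.sqrt(2 * eta / beta) * noise
    return x

# Usage: a non-convex double-well potential F(x) = (x^2 - 1)^2.
grad = lambda x: 4 * x * (x ** 2 - 1)
print(sgld(grad, x0=np.array([3.0])))  # ends up near a minimizer at +/-1
```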

Machine Unlearning Fails to Remove Data Poisoning Attacks

no code implementations • 25 Jun 2024 • Martin Pawelczyk, Jimmy Z. Di, Yiwei Lu, Gautam Kamath, Ayush Sekhari, Seth Neel

We revisit the efficacy of several practical methods for approximate machine unlearning developed for large-scale deep learning.

Data Poisoning, Machine Unlearning

Computationally Efficient RL under Linear Bellman Completeness for Deterministic Dynamics

no code implementations • 17 Jun 2024 • Runzhe Wu, Ayush Sekhari, Akshay Krishnamurthy, Wen Sun

We study computationally and statistically efficient Reinforcement Learning algorithms for the linear Bellman Complete setting.

Offline Reinforcement Learning: Role of State Aggregation and Trajectory Data

no code implementations • 25 Mar 2024 • Zeyu Jia, Alexander Rakhlin, Ayush Sekhari, Chen-Yu Wei

We revisit the problem of offline reinforcement learning with value function realizability but without Bellman completeness.

reinforcement-learning
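
For readers outside the subfield, the two conditions contrasted above are standard (the notation here is generic, not necessarily the paper's): realizability asks only that the function class $\mathcal{F}$ contain the target value function, e.g. $Q^{\pi} \in \mathcal{F}$, while Bellman completeness is the much stronger closure requirement that $\mathcal{T} f \in \mathcal{F}$ for every $f \in \mathcal{F}$, where $(\mathcal{T} f)(s, a) = r(s, a) + \gamma \, \mathbb{E}_{s' \sim P(\cdot \mid s, a)} \max_{a'} f(s', a')$ is the Bellman operator.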

Harnessing Density Ratios for Online Reinforcement Learning

no code implementations • 18 Jan 2024 • Philip Amortila, Dylan J. Foster, Nan Jiang, Ayush Sekhari, Tengyang Xie

The theories of offline and online reinforcement learning, despite having evolved in parallel, have begun to show signs of a possible unification, with algorithms and analysis techniques for one setting often having natural counterparts in the other.

Offline RL, reinforcement-learning, +1

Offline Data Enhanced On-Policy Policy Gradient with Provable Guarantees

1 code implementation • 14 Nov 2023 • Yifei Zhou, Ayush Sekhari, Yuda Song, Wen Sun

In this work, we propose a new hybrid RL algorithm that combines an on-policy actor-critic method with offline data.

Offline RL

Contextual Bandits and Imitation Learning via Preference-Based Active Queries

no code implementations • 24 Jul 2023 • Ayush Sekhari, Karthik Sridharan, Wen Sun, Runzhe Wu

We consider the problem of contextual bandits and imitation learning, where the learner lacks direct knowledge of the executed action's reward.

Imitation Learning, Multi-Armed Bandits
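
The interaction protocol the abstract refers to is easy to picture; the toy sketch below is my own illustration (not the paper's algorithm): instead of observing a numeric reward, the learner queries a preference oracle that compares two candidate actions in the current context, and fits a logistic (Bradley-Terry-style) model on the binary comparisons.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5                                   # context-action feature dimension
theta_star = rng.standard_normal(d)     # hidden utility vector (oracle only)
theta_hat = np.zeros(d)                 # learner's estimate

def prefer(phi_a, phi_b):
    """Preference oracle: P(a preferred to b) = sigmoid(<theta*, phi_a - phi_b>)."""
    diff = theta_star @ (phi_a - phi_b)
    return rng.random() < 1.0 / (1.0 + np.exp(-diff))

lr = 0.5
for t in range(2000):
    phi_a, phi_b = rng.standard_normal(d), rng.standard_normal(d)  # two candidates
    y = prefer(phi_a, phi_b)                                       # binary feedback only
    x = phi_a - phi_b
    p = 1.0 / (1.0 + np.exp(-theta_hat @ x))
    theta_hat += lr * (y - p) * x / (t + 1) ** 0.5                 # online logistic step

cos = theta_hat @ theta_star / (np.linalg.norm(theta_hat) * np.linalg.norm(theta_star))
print(f"alignment with hidden utility: {cos:.2f}")   # should grow toward 1
```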

Ticketed Learning-Unlearning Schemes

no code implementations • 27 Jun 2023 • Badih Ghazi, Pritish Kamath, Ravi Kumar, Pasin Manurangsi, Ayush Sekhari, Chiyuan Zhang

Subsequently, given any subset of examples that are requested to be unlearnt, the goal is to learn, without knowledge of the original training dataset, a good predictor that is identical to the one that would have been produced by learning from scratch on the surviving examples.

Hidden Poison: Machine Unlearning Enables Camouflaged Poisoning Attacks

1 code implementation • NeurIPS 2023 • Jimmy Z. Di, Jack Douglas, Jayadev Acharya, Gautam Kamath, Ayush Sekhari

We introduce camouflaged data poisoning attacks, a new attack vector that arises in the context of machine unlearning and other settings when model retraining may be induced.

Data Poisoning, Machine Unlearning

Hybrid RL: Using Both Offline and Online Data Can Make RL Efficient

1 code implementation • 13 Oct 2022 • Yuda Song, Yifei Zhou, Ayush Sekhari, J. Andrew Bagnell, Akshay Krishnamurthy, Wen Sun

We consider a hybrid reinforcement learning setting (Hybrid RL), in which an agent has access to an offline dataset and the ability to collect experience via real-world online interaction.

Montezuma's Revenge, Q-Learning
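
Operationally, the hybrid setting above can be sketched as follows (a schematic illustration under my own assumptions, not the authors' released code): every update draws half of its minibatch from the fixed offline dataset and half from the online replay buffer, then performs a standard Q-learning step.

```python
import numpy as np

rng = np.random.default_rng(0)
N_ACTIONS = 2

def sample_hybrid_batch(offline_data, online_buffer, batch_size=64):
    """Half of every minibatch comes from the fixed offline dataset and half
    from the online replay buffer (an even split is a common heuristic)."""
    k = batch_size // 2
    off = [offline_data[i] for i in rng.integers(len(offline_data), size=k)]
    on = [online_buffer[i] for i in rng.integers(len(online_buffer), size=k)]
    return off + on

def q_update(Q, batch, alpha=0.1, gamma=0.99):
    """One tabular Q-learning sweep over a minibatch of (s, a, r, s') tuples."""
    for s, a, r, s2 in batch:
        target = r + gamma * max(Q.get((s2, b), 0.0) for b in range(N_ACTIONS))
        Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (target - Q.get((s, a), 0.0))
    return Q

# Toy usage: mix dummy offline and online transitions and take one step.
offline = [(0, 0, 1.0, 1), (1, 1, 0.0, 0)] * 40
online = [(0, 1, 0.5, 1), (1, 0, 0.2, 0)] * 40
Q = q_update({}, sample_hybrid_batch(offline, online))
print(sorted(Q.items()))
```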

From Gradient Flow on Population Loss to Learning with Stochastic Gradient Descent

no code implementations • 13 Oct 2022 • Satyen Kale, Jason D. Lee, Chris De Sa, Ayush Sekhari, Karthik Sridharan

When these potentials further satisfy certain self-bounding properties, we show that they can be used to provide a convergence guarantee for Gradient Descent (GD) and SGD (even when the paths of GF and GD/SGD are quite far apart).

Retrieval

On the Complexity of Adversarial Decision Making

no code implementations • 27 Jun 2022 • Dylan J. Foster, Alexander Rakhlin, Ayush Sekhari, Karthik Sridharan

A central problem in online learning and decision making -- from bandits to reinforcement learning -- is to understand what modeling assumptions lead to sample-efficient learning guarantees.

Decision Making, reinforcement-learning, +2

Computationally Efficient PAC RL in POMDPs with Latent Determinism and Conditional Embeddings

no code implementations • 24 Jun 2022 • Masatoshi Uehara, Ayush Sekhari, Jason D. Lee, Nathan Kallus, Wen Sun

We show our algorithm's computational and statistical complexities scale polynomially with respect to the horizon and the intrinsic dimension of the feature on the observation space.

Guarantees for Epsilon-Greedy Reinforcement Learning with Function Approximation

no code implementations • 19 Jun 2022 • Christoph Dann, Yishay Mansour, Mehryar Mohri, Ayush Sekhari, Karthik Sridharan

This paper presents a theoretical analysis of such policies and provides the first regret and sample-complexity bounds for reinforcement learning with myopic exploration.

reinforcement-learning, Reinforcement Learning, +1
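
For context, the myopic exploration strategy analyzed in this paper is the simplest one used in practice; a minimal sketch with a generic value approximator (illustrative, not the paper's construction):

```python
import numpy as np

def epsilon_greedy(q_values, epsilon, rng):
    """Myopic exploration: with probability epsilon act uniformly at random,
    otherwise act greedily with respect to the current value estimates."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))

# Usage: any function approximator mapping a state to per-action values works.
rng = np.random.default_rng(0)
q = np.array([0.1, 0.7, 0.3])          # e.g. output of a neural Q-network
actions = [epsilon_greedy(q, 0.1, rng) for _ in range(1000)]
print(np.bincount(actions) / 1000)     # mostly action 1, some exploration
```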

SGD: The Role of Implicit Regularization, Batch-size and Multiple-epochs

no code implementations • NeurIPS 2021 • Satyen Kale, Ayush Sekhari, Karthik Sridharan

We show that there is an SCO problem such that GD with any step size and number of iterations can only learn at a suboptimal rate: at least $\widetilde{\Omega}(1/n^{5/12})$.

Agnostic Reinforcement Learning with Low-Rank MDPs and Rich Observations

no code implementations • NeurIPS 2021 • Christoph Dann, Yishay Mansour, Mehryar Mohri, Ayush Sekhari, Karthik Sridharan

In this work, we consider the more realistic setting of agnostic RL with rich observation spaces and a fixed class of policies $\Pi$ that may not contain any near-optimal policy.

reinforcement-learning, Reinforcement Learning, +1

Neural Active Learning with Performance Guarantees

no code implementations • NeurIPS 2021 • Pranjal Awasthi, Christoph Dann, Claudio Gentile, Ayush Sekhari, Zhilei Wang

We investigate the problem of active learning in the streaming setting in non-parametric regimes, where the labels are stochastically generated from a class of functions on which we make no assumptions whatsoever.

Active Learning, Model Selection
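
A rough sketch of the streaming protocol studied above (a generic margin-based query rule of my own, not the paper's algorithm): points arrive one at a time, and the learner pays for a label only when its current predictor is uncertain.

```python
import numpy as np

rng = np.random.default_rng(0)
d, stream_len, tau = 3, 2000, 0.3
w_star = rng.standard_normal(d)         # unknown labeling function (toy: linear)
w = np.zeros(d)
queries = 0

for t in range(stream_len):
    x = rng.standard_normal(d)
    if abs(w @ x) < tau:                # uncertain: query the (noisy) label
        queries += 1
        y = np.sign(w_star @ x + 0.1 * rng.standard_normal())
        w += 0.1 * y * x                # perceptron-style update on queried points
    # confident points are predicted and skipped, saving label budget

print(f"queried {queries}/{stream_len} labels")
```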

Second-Order Information in Non-Convex Stochastic Optimization: Power and Limitations

no code implementations • 24 Jun 2020 • Yossi Arjevani, Yair Carmon, John C. Duchi, Dylan J. Foster, Ayush Sekhari, Karthik Sridharan

We design an algorithm which finds an $\epsilon$-approximate stationary point (with $\|\nabla F(x)\|\le \epsilon$) using $O(\epsilon^{-3})$ stochastic gradient and Hessian-vector products, matching guarantees that were previously available only under a stronger assumption of access to multiple queries with the same random seed.

Second-order methods, Stochastic Optimization
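
The Hessian-vector products in the abstract have a concrete, cheap form worth noting: they can be approximated with two gradient evaluations and without ever forming the Hessian, which is what makes $O(\epsilon^{-3})$ algorithms of this kind practical in high dimension. A finite-difference sketch of this standard technique (not code from the paper):

```python
import numpy as np

def hessian_vector_product(grad_f, x, v, eps=1e-5):
    """Approximate H(x) @ v with two gradient calls via central differences:

        H v ~= (grad_f(x + eps*v) - grad_f(x - eps*v)) / (2 * eps)

    Cost is two gradient evaluations, independent of the d x d Hessian.
    """
    return (grad_f(x + eps * v) - grad_f(x - eps * v)) / (2 * eps)

# Usage: F(x) = 0.5 * x^T A x has Hessian A, so the product should equal A @ v.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)); A = A + A.T
grad = lambda x: A @ x
x, v = rng.standard_normal(4), rng.standard_normal(4)
print(np.allclose(hessian_vector_product(grad, x, v), A @ v, atol=1e-4))  # True
```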

Reinforcement Learning with Feedback Graphs

no code implementations • NeurIPS 2020 • Christoph Dann, Yishay Mansour, Mehryar Mohri, Ayush Sekhari, Karthik Sridharan

We study episodic reinforcement learning in Markov decision processes when the agent receives additional feedback per step in the form of several transition observations.

reinforcement-learning, Reinforcement Learning, +1

The Complexity of Making the Gradient Small in Stochastic Convex Optimization

no code implementations • 13 Feb 2019 • Dylan J. Foster, Ayush Sekhari, Ohad Shamir, Nathan Srebro, Karthik Sridharan, Blake Woodworth

Notably, we show that in the global oracle/statistical learning model, only logarithmic dependence on smoothness is required to find a near-stationary point, whereas polynomial dependence on smoothness is necessary in the local stochastic oracle model.

Stochastic Optimization

Uniform Convergence of Gradients for Non-Convex Learning and Optimization

no code implementations • NeurIPS 2018 • Dylan J. Foster, Ayush Sekhari, Karthik Sridharan

We investigate 1) the rate at which refined properties of the empirical risk---in particular, gradients---converge to their population counterparts in standard non-convex learning tasks, and 2) the consequences of this convergence for optimization.

A Brief Study of In-Domain Transfer and Learning from Fewer Samples using A Few Simple Priors

no code implementations • 13 Jul 2017 • Marc Pickett, Ayush Sekhari, James Davidson

Domain knowledge can often be encoded in the structure of a network, such as convolutional layers for vision; this has been shown to increase generalization and decrease sample complexity, i.e., the number of samples required for successful learning.
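
The structural-prior claim is easy to quantify: a convolutional layer shares weights across positions, so it has orders of magnitude fewer parameters than a dense layer over the same input, one source of the reduced sample complexity mentioned above. A small back-of-the-envelope check (illustrative numbers, not from the paper):

```python
# Parameter counts on a 32x32 RGB input with 16 output channels / units per pixel.
h, w, c_in, c_out, k = 32, 32, 3, 16, 3

conv_params = c_out * (c_in * k * k + 1)             # 3x3 conv, shared weights
dense_params = (h * w * c_in + 1) * (h * w * c_out)  # fully connected equivalent

print(f"conv:  {conv_params:,}")    # 448
print(f"dense: {dense_params:,}")   # 50,348,032
```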
