Search Results for author: Karthik Sridharan

Found 49 papers, 3 papers with code

Online Learning with Unknown Constraints

no code implementations6 Mar 2024 Karthik Sridharan, Seung Won Wilson Yoo

We consider the problem of online learning where the sequence of actions played by the learner must adhere to an unknown safety constraint at every round.

regression

Contextual Bandits and Imitation Learning via Preference-Based Active Queries

no code implementations24 Jul 2023 Ayush Sekhari, Karthik Sridharan, Wen Sun, Runzhe Wu

We consider the problem of contextual bandits and imitation learning, where the learner lacks direct knowledge of the executed action's reward.

Imitation Learning Multi-Armed Bandits

From Gradient Flow on Population Loss to Learning with Stochastic Gradient Descent

no code implementations13 Oct 2022 Satyen Kale, Jason D. Lee, Chris De Sa, Ayush Sekhari, Karthik Sridharan

When these potentials further satisfy certain self-bounding properties, we show that they can be used to provide a convergence guarantee for Gradient Descent (GD) and SGD (even when the paths of GF and GD/SGD are quite far apart).

On the Complexity of Adversarial Decision Making

no code implementations27 Jun 2022 Dylan J. Foster, Alexander Rakhlin, Ayush Sekhari, Karthik Sridharan

A central problem in online learning and decision making -- from bandits to reinforcement learning -- is to understand what modeling assumptions lead to sample-efficient learning guarantees.

Decision Making reinforcement-learning +1

Guarantees for Epsilon-Greedy Reinforcement Learning with Function Approximation

no code implementations19 Jun 2022 Christoph Dann, Yishay Mansour, Mehryar Mohri, Ayush Sekhari, Karthik Sridharan

This paper presents a theoretical analysis of such policies and provides the first regret and sample-complexity bounds for reinforcement learning with myopic exploration.

Reinforcement Learning (RL)
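The myopic exploration strategy analyzed in this entry, epsilon-greedy, can be sketched in a few lines (a minimal illustration of the policy class, not the paper's algorithm; `q_values` and `epsilon` are placeholder inputs):

```python
import random

def epsilon_greedy_action(q_values, epsilon, rng=random):
    """Myopic exploration: with probability epsilon play a uniformly
    random action, otherwise act greedily w.r.t. current estimates."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    # Greedy action: index of the largest estimated value.
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

The point of the paper is that even this non-directed exploration admits regret and sample-complexity guarantees under suitable structural conditions.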

SGD: The Role of Implicit Regularization, Batch-size and Multiple-epochs

no code implementations NeurIPS 2021 Satyen Kale, Ayush Sekhari, Karthik Sridharan

We show that there is an SCO problem such that GD with any step size and number of iterations can only learn at a suboptimal rate: at least $\widetilde{\Omega}(1/n^{5/12})$.

Agnostic Reinforcement Learning with Low-Rank MDPs and Rich Observations

no code implementations NeurIPS 2021 Christoph Dann, Yishay Mansour, Mehryar Mohri, Ayush Sekhari, Karthik Sridharan

In this work, we consider the more realistic setting of agnostic RL with rich observation spaces and a fixed class of policies $\Pi$ that may not contain any near-optimal policy.

Reinforcement Learning (RL)

Online learning with dynamics: A minimax perspective

no code implementations NeurIPS 2020 Kush Bhatia, Karthik Sridharan

In this setting, we study the problem of minimizing policy regret and provide non-constructive upper bounds on the minimax rate for the problem.

counterfactual

Second-Order Information in Non-Convex Stochastic Optimization: Power and Limitations

no code implementations24 Jun 2020 Yossi Arjevani, Yair Carmon, John C. Duchi, Dylan J. Foster, Ayush Sekhari, Karthik Sridharan

We design an algorithm which finds an $\epsilon$-approximate stationary point (with $\|\nabla F(x)\|\le \epsilon$) using $O(\epsilon^{-3})$ stochastic gradient and Hessian-vector products, matching guarantees that were previously available only under a stronger assumption of access to multiple queries with the same random seed.

Second-order methods Stochastic Optimization
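The Hessian-vector-product oracle assumed in this line of work never requires forming the Hessian explicitly; a generic finite-difference sketch (the `grad_fn` interface is an assumption for illustration, not the paper's oracle model):

```python
def hessian_vector_product(grad_fn, x, v, eps=1e-5):
    """Central-difference Hessian-vector product:
    Hv ~ (grad F(x + eps*v) - grad F(x - eps*v)) / (2*eps).
    x and v are plain lists of floats; grad_fn maps a point to its gradient."""
    xp = [xi + eps * vi for xi, vi in zip(x, v)]
    xm = [xi - eps * vi for xi, vi in zip(x, v)]
    gp, gm = grad_fn(xp), grad_fn(xm)
    return [(a - b) / (2 * eps) for a, b in zip(gp, gm)]
```

For quadratics the central difference is exact up to floating-point error, which makes it a convenient stand-in when testing second-order stochastic methods.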

Reinforcement Learning with Feedback Graphs

no code implementations NeurIPS 2020 Christoph Dann, Yishay Mansour, Mehryar Mohri, Ayush Sekhari, Karthik Sridharan

We study episodic reinforcement learning in Markov decision processes when the agent receives additional feedback per step in the form of several transition observations.

Reinforcement Learning (RL)

Hypothesis Set Stability and Generalization

no code implementations NeurIPS 2019 Dylan J. Foster, Spencer Greenberg, Satyen Kale, Haipeng Luo, Mehryar Mohri, Karthik Sridharan

Our main result is a generalization bound for data-dependent hypothesis sets expressed in terms of a notion of hypothesis set stability and a notion of Rademacher complexity for data-dependent hypothesis sets that we introduce.

Distributed Learning with Sublinear Communication

no code implementations28 Feb 2019 Jayadev Acharya, Christopher De Sa, Dylan J. Foster, Karthik Sridharan

In distributed statistical learning, $N$ samples are split across $m$ machines and a learner wishes to use minimal communication to learn as well as if the examples were on a single machine.

Quantization

The Complexity of Making the Gradient Small in Stochastic Convex Optimization

no code implementations13 Feb 2019 Dylan J. Foster, Ayush Sekhari, Ohad Shamir, Nathan Srebro, Karthik Sridharan, Blake Woodworth

Notably, we show that in the global oracle/statistical learning model, only logarithmic dependence on smoothness is required to find a near-stationary point, whereas polynomial dependence on smoothness is necessary in the local stochastic oracle model.

Stochastic Optimization

Uniform Convergence of Gradients for Non-Convex Learning and Optimization

no code implementations NeurIPS 2018 Dylan J. Foster, Ayush Sekhari, Karthik Sridharan

We investigate 1) the rate at which refined properties of the empirical risk---in particular, gradients---converge to their population counterparts in standard non-convex learning tasks, and 2) the consequences of this convergence for optimization.

Optimization with Non-Differentiable Constraints with Applications to Fairness, Recall, Churn, and Other Goals

1 code implementation11 Sep 2018 Andrew Cotter, Heinrich Jiang, Serena Wang, Taman Narayan, Maya Gupta, Seungil You, Karthik Sridharan

This new formulation leads to an algorithm that produces a stochastic classifier by playing a two-player non-zero-sum game, solving for what we call a semi-coarse correlated equilibrium, which in turn corresponds to an approximately optimal and feasible solution to the constrained optimization problem.

Fairness
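The game-theoretic viewpoint behind this entry, a primal player minimizing a Lagrangian against a dual player maximizing it, can be illustrated on a toy problem (a sketch of ordinary Lagrangian gradient dynamics, not the paper's proxy-Lagrangian algorithm; step sizes and interfaces are assumptions):

```python
def lagrangian_game(f_grad, g, g_grad, x0, eta_x=0.05, eta_lam=0.05, steps=500):
    """Toy two-player dynamics for min f(x) s.t. g(x) <= 0.

    The x-player descends the Lagrangian f + lam * g; the lam-player
    ascends in the multiplier, projected to lam >= 0.  Returns the
    averaged iterate, the analogue of the stochastic (mixed) solution."""
    x, lam, avg = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        x = x - eta_x * (f_grad(x) + lam * g_grad(x))  # primal descent step
        lam = max(0.0, lam + eta_lam * g(x))           # projected dual ascent step
        avg += (x - avg) / t                           # running average of the play
    return avg
```

Averaging the primal player's iterates mirrors the paper's observation that the natural output of such a game is a distribution over models rather than a single deterministic one.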

Training Well-Generalizing Classifiers for Fairness Metrics and Other Data-Dependent Constraints

1 code implementation29 Jun 2018 Andrew Cotter, Maya Gupta, Heinrich Jiang, Nathan Srebro, Karthik Sridharan, Serena Wang, Blake Woodworth, Seungil You

Classifiers can be trained with data-dependent constraints to satisfy fairness goals, reduce churn, achieve a targeted false positive rate, or other policy goals.

Fairness

Two-Player Games for Efficient Non-Convex Constrained Optimization

1 code implementation17 Apr 2018 Andrew Cotter, Heinrich Jiang, Karthik Sridharan

For both the proxy-Lagrangian and Lagrangian formulations, however, we prove that this classifier, instead of having unbounded size, can be taken to be a distribution over no more than m+1 models (where m is the number of constraints).

Logistic Regression: The Importance of Being Improper

no code implementations25 Mar 2018 Dylan J. Foster, Satyen Kale, Haipeng Luo, Mehryar Mohri, Karthik Sridharan

Starting with the simple observation that the logistic loss is $1$-mixable, we design a new efficient improper learning algorithm for online logistic regression that circumvents the aforementioned lower bound with a regret bound exhibiting a doubly-exponential improvement in dependence on the predictor norm.

regression

Online Learning: Sufficient Statistics and the Burkholder Method

no code implementations20 Mar 2018 Dylan J. Foster, Alexander Rakhlin, Karthik Sridharan

We uncover a fairly general principle in online learning: If regret can be (approximately) expressed as a function of certain "sufficient statistics" for the data sequence, then there exists a special Burkholder function that 1) can be used algorithmically to achieve the regret bound and 2) only depends on these sufficient statistics, not the entire data sequence, so that the online strategy is only required to keep the sufficient statistics in memory.

Parameter-free online learning via model selection

no code implementations NeurIPS 2017 Dylan J. Foster, Satyen Kale, Mehryar Mohri, Karthik Sridharan

We introduce an efficient algorithmic framework for model selection in online learning, also known as parameter-free online learning.

Model Selection

Small-loss bounds for online learning with partial information

no code implementations9 Nov 2017 Thodoris Lykouris, Karthik Sridharan, Eva Tardos

We develop a black-box approach for such problems where the learner observes as feedback only losses of a subset of the actions that includes the selected action.

Multi-Armed Bandits

ZigZag: A new approach to adaptive online learning

no code implementations13 Apr 2017 Dylan J. Foster, Alexander Rakhlin, Karthik Sridharan

To develop a general theory of when this type of adaptive regret bound is achievable we establish a connection to the theory of decoupling inequalities for martingales in Banach spaces.

Inference in Sparse Graphs with Pairwise Measurements and Side Information

no code implementations8 Mar 2017 Dylan J. Foster, Daniel Reichman, Karthik Sridharan

For two-dimensional grids, our results improve over Globerson et al. (2015) by obtaining optimal recovery in the constant-height regime.

Learning Theory Tree Decomposition

A Tutorial on Online Supervised Learning with Applications to Node Classification in Social Networks

no code implementations31 Aug 2016 Alexander Rakhlin, Karthik Sridharan

We revisit the elegant observation of T. Cover '65 which, perhaps, is not as well-known to the broader community as it should be.

General Classification Node Classification

Learning in Games: Robustness of Fast Convergence

no code implementations NeurIPS 2016 Dylan J. Foster, Zhiyuan Li, Thodoris Lykouris, Karthik Sridharan, Eva Tardos

We show that learning algorithms satisfying a $\textit{low approximate regret}$ property experience fast convergence to approximate optimality in a large class of repeated games.

Private Causal Inference

no code implementations17 Dec 2015 Matt J. Kusner, Yu Sun, Karthik Sridharan, Kilian Q. Weinberger

Causal inference has the potential to have significant impact on medical research, prevention and control of diseases, and identifying factors that impact economic changes to name just a few.

Causal Inference

On Equivalence of Martingale Tail Bounds and Deterministic Regret Inequalities

no code implementations13 Oct 2015 Alexander Rakhlin, Karthik Sridharan

We study an equivalence of (i) deterministic pathwise statements appearing in the online learning literature (termed \emph{regret bounds}), (ii) high-probability tail bounds for the supremum of a collection of martingales (of a specific form arising from uniform laws of large numbers for martingales), and (iii) in-expectation bounds for the supremum.

Adaptive Online Learning

no code implementations NeurIPS 2015 Dylan J. Foster, Alexander Rakhlin, Karthik Sridharan

We propose a general framework for studying adaptive regret bounds in the online learning framework, including model selection bounds and data-dependent bounds.

Model Selection

Hierarchies of Relaxations for Online Prediction Problems with Evolving Constraints

no code implementations4 Mar 2015 Alexander Rakhlin, Karthik Sridharan

We study online prediction where regret of the algorithm is measured against a benchmark defined via evolving constraints.

Learning with Square Loss: Localization through Offset Rademacher Complexity

no code implementations21 Feb 2015 Tengyuan Liang, Alexander Rakhlin, Karthik Sridharan

We consider regression with square loss and general classes of functions without the boundedness assumption.

regression

Sequential Probability Assignment with Binary Alphabets and Large Classes of Experts

no code implementations29 Jan 2015 Alexander Rakhlin, Karthik Sridharan

We analyze the problem of sequential probability assignment for binary outcomes with side information and logarithmic loss, where regret (or redundancy) is measured with respect to a (possibly infinite) class of experts.

Online Nonparametric Regression with General Loss Functions

no code implementations26 Jan 2015 Alexander Rakhlin, Karthik Sridharan

This paper establishes minimax rates for online regression with arbitrary classes of functions and general losses.

regression

Online Optimization: Competing with Dynamic Comparators

no code implementations26 Jan 2015 Ali Jadbabaie, Alexander Rakhlin, Shahin Shahrampour, Karthik Sridharan

Recent literature on online learning has focused on developing adaptive algorithms that take advantage of a regularity of the sequence of observations, yet retain worst-case performance guarantees.

Online Nonparametric Regression

no code implementations11 Feb 2014 Alexander Rakhlin, Karthik Sridharan

The optimal rates are shown to exhibit a phase transition analogous to the i.i.d./statistical learning case, studied in (Rakhlin, Sridharan, Tsybakov 2013).

regression

Optimization, Learning, and Games with Predictable Sequences

no code implementations NeurIPS 2013 Alexander Rakhlin, Karthik Sridharan

We provide several applications of Optimistic Mirror Descent, an online learning algorithm based on the idea of predictable sequences.
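In the Euclidean case, Optimistic Mirror Descent reduces to a gradient step shifted by a guess of the next gradient; a minimal one-dimensional sketch using the last observed gradient as the predictable sequence (function names and the scalar setting are illustrative assumptions):

```python
def optimistic_gd(grads, eta, x0=0.0):
    """Euclidean Optimistic Mirror Descent sketch.

    Each round: play a point shifted by the prediction M_t of the next
    gradient (here, the previously observed gradient), observe the true
    gradient, then update the secondary iterate.  `grads` is a sequence
    of gradient callables g_t(x); returns the list of played points."""
    g_prev = x0        # secondary (mirror-descent) iterate
    last_grad = 0.0    # prediction M_t of the upcoming gradient
    played = []
    for grad_fn in grads:
        x = g_prev - eta * last_grad       # optimistic step using the prediction
        played.append(x)
        last_grad = grad_fn(x)             # observe the actual gradient
        g_prev = g_prev - eta * last_grad  # standard mirror-descent update
    return played
```

When the gradient sequence is slowly varying, the prediction is accurate and the regret improves over plain online gradient descent, which is the regularity the paper exploits.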

Empirical entropy, minimax regret and minimax risk

no code implementations6 Aug 2013 Alexander Rakhlin, Karthik Sridharan, Alexandre B. Tsybakov

Furthermore, for $p\in(0, 2)$, the excess risk rate matches the behavior of the minimax risk of function estimation in regression problems under the well-specified model.

regression

Online Learning with Predictable Sequences

no code implementations18 Aug 2012 Alexander Rakhlin, Karthik Sridharan

Variance and path-length bounds can be seen as particular examples of online learning with simple predictable sequences.

Model Selection Time Series +1

Better Mini-Batch Algorithms via Accelerated Gradient Methods

no code implementations NeurIPS 2011 Andrew Cotter, Ohad Shamir, Nati Srebro, Karthik Sridharan

Mini-batch algorithms have recently received significant attention as a way to speed-up stochastic convex optimization problems.
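A generic mini-batch gradient step with Nesterov-style acceleration looks as follows (an illustrative sketch under assumed interfaces and hyperparameters, not the paper's algorithm or analysis):

```python
import random

def minibatch_accelerated_gd(stochastic_grad, x0, eta, steps, momentum=0.9,
                             batch_size=32, rng=random):
    """One-dimensional mini-batch gradient method with Nesterov-style
    acceleration.  `stochastic_grad(x, rng)` returns one noisy gradient
    sample at x; each step averages `batch_size` samples before updating."""
    x, v = x0, 0.0
    for _ in range(steps):
        lookahead = x + momentum * v                      # Nesterov lookahead point
        g = sum(stochastic_grad(lookahead, rng)
                for _ in range(batch_size)) / batch_size  # averaged mini-batch gradient
        v = momentum * v - eta * g
        x = x + v
    return x
```

Averaging over the batch shrinks the gradient noise by roughly the square root of the batch size, which is what lets acceleration remain stable in the stochastic setting.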

Online Learning: Stochastic, Constrained, and Smoothed Adversaries

no code implementations NeurIPS 2011 Alexander Rakhlin, Karthik Sridharan, Ambuj Tewari

We define the minimax value of a game where the adversary is restricted in his moves, capturing stochastic and non-stochastic assumptions on data.

Learning Theory

On the Universality of Online Mirror Descent

no code implementations NeurIPS 2011 Nati Srebro, Karthik Sridharan, Ambuj Tewari

We show that for a general class of convex online learning problems, Mirror Descent can always achieve a (nearly) optimal regret guarantee.

Smoothness, Low Noise and Fast Rates

no code implementations NeurIPS 2010 Nathan Srebro, Karthik Sridharan, Ambuj Tewari

We establish an excess risk bound of $O(H R_n^2 + \sqrt{H L^*} R_n)$ for ERM with an $H$-smooth loss function and a hypothesis class with Rademacher complexity $R_n$, where $L^*$ is the best risk achievable by the hypothesis class.

Online Learning via Sequential Complexities

no code implementations6 Jun 2010 Alexander Rakhlin, Karthik Sridharan, Ambuj Tewari

We consider the problem of sequential prediction and provide tools to study the minimax value of the associated game.

Learning Theory

Learning Exponential Families in High-Dimensions: Strong Convexity and Sparsity

no code implementations31 Oct 2009 Sham M. Kakade, Ohad Shamir, Karthik Sridharan, Ambuj Tewari

The versatility of exponential families, along with their attendant convexity properties, make them a popular and effective statistical model.

Fast Rates for Regularized Objectives

no code implementations NeurIPS 2008 Karthik Sridharan, Shai Shalev-Shwartz, Nathan Srebro

We show that the empirical minimizer of a stochastic strongly convex objective, where the stochastic component is linear, converges to the population minimizer with rate $O(1/n)$.
