Search Results for author: Ambuj Tewari

Found 106 papers, 16 papers with code

Sample Efficient Myopic Exploration Through Multitask Reinforcement Learning with Diverse Tasks

no code implementations 3 Mar 2024 Ziping Xu, Zifan Xu, Runxuan Jiang, Peter Stone, Ambuj Tewari

Multitask Reinforcement Learning (MTRL) approaches have gained increasing attention for their wide applications in many important Reinforcement Learning (RL) tasks.

reinforcement-learning Reinforcement Learning (RL)

Optimal Thresholding Linear Bandit

no code implementations 11 Feb 2024 Eduardo Ochoa Rivera, Ambuj Tewari

We study a novel pure exploration problem: the $\epsilon$-Thresholding Bandit Problem (TBP) with fixed confidence in stochastic linear bandits.

The Complexity of Sequential Prediction in Dynamical Systems

no code implementations 9 Feb 2024 Vinod Raman, Unique Subedi, Ambuj Tewari

We study the problem of learning to predict the next state of a dynamical system when the underlying evolution function is unknown.

Learning Theory

A Primal-Dual Algorithm for Offline Constrained Reinforcement Learning with Low-Rank MDPs

no code implementations 7 Feb 2024 Kihyuk Hong, Ambuj Tewari

Our algorithm is the first computationally efficient algorithm in this setting that achieves sample complexity of $O(\epsilon^{-2})$ under a partial data coverage assumption.

Offline RL Reinforcement Learning (RL)

A Framework for Partially Observed Reward-States in RLHF

no code implementations 5 Feb 2024 Chinmaya Kausik, Mirco Mutti, Aldo Pacchiano, Ambuj Tewari

We show reductions from the two dominant forms of human feedback in RLHF, cardinal and dueling feedback, to PORRL.

reinforcement-learning

Apple Tasting: Combinatorial Dimensions and Minimax Rates

no code implementations 29 Oct 2023 Vinod Raman, Unique Subedi, Ananth Raman, Ambuj Tewari

In particular, we show that in the realizable setting, the expected number of mistakes of any learner, under apple tasting feedback, can be $\Theta(1), \Theta(\sqrt{T})$, or $\Theta(T)$.

Binary Classification

Sequence Length Independent Norm-Based Generalization Bounds for Transformers

1 code implementation 19 Oct 2023 Jacob Trauger, Ambuj Tewari

This paper provides norm-based generalization bounds for the Transformer architecture that do not depend on the input sequence length.

Generalization Bounds

Conformal Contextual Robust Optimization

no code implementations 16 Oct 2023 Yash Patel, Sahana Rayan, Ambuj Tewari

Data-driven approaches to predict-then-optimize decision-making problems seek to mitigate the risk of uncertainty region misspecification in safety-critical settings.

Conformal Prediction Decision Making

On the Computational Complexity of Private High-dimensional Model Selection

1 code implementation 11 Oct 2023 Saptarshi Roy, Zehua Wang, Ambuj Tewari

We consider the problem of model selection in a high-dimensional sparse linear regression model under privacy constraints.

Model Selection regression

Online Infinite-Dimensional Regression: Learning Linear Operators

no code implementations 8 Sep 2023 Vinod Raman, Unique Subedi, Ambuj Tewari

Finally, we prove that the impossibility result and the separation between uniform convergence and learnability also hold in the batch setting.

regression

On the Minimax Regret in Online Ranking with Top-k Feedback

no code implementations 5 Sep 2023 Mingyuan Zhang, Ambuj Tewari

In online ranking, a learning algorithm sequentially ranks a set of items and receives feedback on its ranking in the form of relevance scores.

A Combinatorial Characterization of Supervised Online Learnability

no code implementations 7 Jul 2023 Vinod Raman, Unique Subedi, Ambuj Tewari

We study the online learnability of hypothesis classes with respect to arbitrary, but bounded loss functions.

Learning Theory regression

A Primal-Dual-Critic Algorithm for Offline Constrained Reinforcement Learning

no code implementations 13 Jun 2023 Kihyuk Hong, Yuhang Li, Ambuj Tewari

Offline constrained reinforcement learning (RL) aims to learn a policy that maximizes the expected cumulative reward subject to constraints on expected cumulative cost using an existing dataset.

reinforcement-learning Reinforcement Learning (RL)

Online Learning with Set-Valued Feedback

no code implementations 9 Jun 2023 Vinod Raman, Unique Subedi, Ambuj Tewari

We study a variant of online multiclass classification where the learner predicts a single label but receives a set of labels as feedback.

Amortized Variational Inference with Coverage Guarantees

no code implementations 23 May 2023 Yash Patel, Declan McNamara, Jackson Loper, Jeffrey Regier, Ambuj Tewari

We prove lower bounds on the predictive efficiency of the regions produced by CANVI and explore how the quality of a posterior approximation relates to the predictive efficiency of prediction regions based on that approximation.

Variational Inference

Multiclass Online Learning and Uniform Convergence

no code implementations 30 Mar 2023 Steve Hanneke, Shay Moran, Vinod Raman, Unique Subedi, Ambuj Tewari

We argue that the best expert has regret at most Littlestone dimension relative to the best concept in the class.

Binary Classification

Quantum Learning Theory Beyond Batch Binary Classification

no code implementations 15 Feb 2023 Preetham Mohan, Ambuj Tewari

Arunachalam and de Wolf (2018) showed that the sample complexity of quantum batch learning of boolean functions, in the realizable and agnostic settings, has the same form and order as the corresponding classical sample complexities.

Binary Classification Classification +1

An Asymptotically Optimal Algorithm for the Convex Hull Membership Problem

no code implementations 3 Feb 2023 Gang Qiao, Ambuj Tewari

This work studies the pure-exploration setting for the convex hull membership (CHM) problem where one aims to efficiently and accurately determine if a given point lies in the convex hull of means of a finite set of distributions.

Understanding Best Subset Selection: A Tale of Two C(omplex)ities

no code implementations 16 Jan 2023 Saptarshi Roy, Ambuj Tewari, Ziwei Zhu

Furthermore, we show that a margin condition depending on similar margin quantity and complexity measures is also necessary for model consistency of BSS.

Model Selection Variable Selection +1

A Characterization of Multioutput Learnability

no code implementations 6 Jan 2023 Vinod Raman, Unique Subedi, Ambuj Tewari

This provides a complete characterization of the learnability of multilabel classification and multioutput regression in both batch and online settings.

regression

Offline Policy Evaluation and Optimization under Confounding

no code implementations 29 Nov 2022 Chinmaya Kausik, Yangyi Lu, Kevin Tan, Maggie Makar, Yixin Wang, Ambuj Tewari

Evaluating and optimizing policies in the presence of unobserved confounders is a problem of growing interest in offline reinforcement learning.

Offline RL Off-policy evaluation

RL Boltzmann Generators for Conformer Generation in Data-Sparse Environments

1 code implementation 19 Nov 2022 Yash Patel, Ambuj Tewari

The generation of conformers has been a long-standing interest to structural chemists and biologists alike.

Learning Mixtures of Markov Chains and MDPs

1 code implementation 17 Nov 2022 Chinmaya Kausik, Kevin Tan, Ambuj Tewari

We present an algorithm for learning mixtures of Markov chains and Markov decision processes (MDPs) from short unlabeled trajectories.

Online Agnostic Multiclass Boosting

1 code implementation 30 May 2022 Vinod Raman, Ambuj Tewari

In this way, boosting algorithms convert weak learners into strong ones.

Binary Classification

Adaptive Sampling for Discovery

no code implementations 30 May 2022 Ziping Xu, Eunjae Shim, Ambuj Tewari, Paul Zimmerman

Starting with a large unlabeled dataset, algorithms for ASD adaptively label the points with the goal to maximize the sum of responses.

Decision Making Drug Discovery

An Optimization-based Algorithm for Non-stationary Kernel Bandits without Prior Knowledge

no code implementations 29 May 2022 Kihyuk Hong, Yuhang Li, Ambuj Tewari

Moreover, when applied to the non-stationary linear bandit setting by using a linear kernel, our algorithm is nearly minimax optimal, solving an open problem in the non-stationary linear bandit literature.

Achieving Representative Data via Convex Hull Feasibility Sampling Algorithms

no code implementations 13 Apr 2022 Laura Niss, Yuekai Sun, Ambuj Tewari

Sampling biases in training data are a major source of algorithmic biases in machine learning systems.

Joint Learning of Linear Time-Invariant Dynamical Systems

no code implementations 21 Dec 2021 Aditya Modi, Mohamad Kazem Shirani Faradonbeh, Ambuj Tewari, George Michailidis

Linear time-invariant systems are very popular models in system theory and applications.

Balancing Adaptability and Non-exploitability in Repeated Games

1 code implementation 20 Dec 2021 Anthony DiGiovanni, Ambuj Tewari

We study the problem of guaranteeing low regret in repeated games against an opponent with unknown membership in one of several classes.

On the Statistical Benefits of Curriculum Learning

no code implementations 13 Nov 2021 Ziping Xu, Ambuj Tewari

For both settings, we derive the minimax rates for CL with the oracle that provides the optimal curriculum and without the oracle, where the agent has to adaptively learn a good curriculum.

Learning in Online MDPs: Is there a Price for Handling the Communicating Case?

no code implementations 3 Nov 2021 Gautam Chandrasekaran, Ambuj Tewari

In contrast, it has been shown that handling online MDPs with communicating structure and bandit information incurs $\Omega(T^{2/3})$ regret even in the case of deterministic transitions.

Bandit Algorithms for Precision Medicine

no code implementations 10 Aug 2021 Yangyi Lu, Ziping Xu, Ambuj Tewari

However, the modern precision medicine movement has been enabled by a confluence of events: scientific advances in fields such as genetics and pharmacology, technological advances in mobile devices and wearable sensors, and methodological advances in computing and data sciences.

Weighted Gaussian Process Bandits for Non-stationary Environments

no code implementations 6 Jul 2021 Yuntian Deng, Xingyu Zhou, Baekjin Kim, Ambuj Tewari, Abhishek Gupta, Ness Shroff

To this end, we develop WGP-UCB, a novel UCB-type algorithm based on weighted Gaussian process regression.

regression

Causal Bandits with Unknown Graph Structure

no code implementations NeurIPS 2021 Yangyi Lu, Amirhossein Meisami, Ambuj Tewari

In causal bandit problems, the action set consists of interventions on variables of a causal graph.

Representation Learning Beyond Linear Prediction Functions

no code implementations NeurIPS 2021 Ziping Xu, Ambuj Tewari

This motivates us to ask whether diversity can be achieved when source tasks and the target task use different prediction function spaces beyond linear functions.

Representation Learning

Causal Markov Decision Processes: Learning Good Interventions Efficiently

no code implementations 15 Feb 2021 Yangyi Lu, Amirhossein Meisami, Ambuj Tewari

We introduce causal Markov Decision Processes (C-MDPs), a new formalism for sequential decision making which combines the standard MDP formulation with causal structures over state transition and reward functions.

Decision Making Marketing

Federated Learning via Synthetic Data

no code implementations 11 Aug 2020 Jack Goetz, Ambuj Tewari

Federated learning allows for the training of a model using data on multiple clients without the clients transmitting that raw data.

Federated Learning

Low-Rank Generalized Linear Bandit Problems

no code implementations 4 Jun 2020 Yangyi Lu, Amirhossein Meisami, Ambuj Tewari

To get around the computational intractability of covering based approaches, we propose an efficient algorithm by extending the "Explore-Subspace-Then-Refine" algorithm of Jun et al. (2019).

On Learnability under General Stochastic Processes

no code implementations 15 May 2020 A. Philip Dawid, Ambuj Tewari

Statistical learning theory under independent and identically distributed (iid) sampling and online learning theory for worst case individual sequences are two of the best developed branches of learning theory.

Binary Classification Learning Theory +1

Randomized Exploration for Non-Stationary Stochastic Linear Bandits

2 code implementations 11 Dec 2019 Baekjin Kim, Ambuj Tewari

We investigate two perturbation approaches to overcome conservatism that optimism based algorithms chronically suffer from in practice.

Computational Efficiency Thompson Sampling

Online Boosting for Multilabel Ranking with Top-k Feedback

no code implementations 24 Oct 2019 Vinod Raman, Daniel T. Zhang, Young Hun Jung, Ambuj Tewari

We present online boosting algorithms for multilabel ranking with top-k feedback, where the learner only receives information about the top k items from the ranking it provides.

Sample Complexity of Reinforcement Learning using Linearly Combined Model Ensembles

no code implementations 23 Oct 2019 Aditya Modi, Nan Jiang, Ambuj Tewari, Satinder Singh

As an extension, we also consider the more challenging problem of model selection, where the state features are unknown and can be chosen from a large candidate set.

Model Selection reinforcement-learning +1

Thompson Sampling in Non-Episodic Restless Bandits

no code implementations 12 Oct 2019 Young Hun Jung, Marc Abeille, Ambuj Tewari

Restless bandit problems assume time-varying reward distributions of the arms, which adds flexibility to the model but makes the analysis more challenging.

Open-Ended Question Answering Thompson Sampling

What You See May Not Be What You Get: UCB Bandit Algorithms Robust to ε-Contamination

no code implementations 12 Oct 2019 Laura Niss, Ambuj Tewari

We define the $\varepsilon$-contaminated stochastic bandit problem and use our robust mean estimators to give two variants of a robust Upper Confidence Bound (UCB) algorithm, crUCB.

Regret Analysis of Bandit Problems with Causal Background Knowledge

no code implementations 11 Oct 2019 Yangyi Lu, Amirhossein Meisami, Ambuj Tewari, Zhenyu Yan

For example, we observe that even with a few hundred iterations, the regret of causal algorithms is less than that of standard algorithms by a factor of three.

Thompson Sampling

Not All are Made Equal: Consistency of Weighted Averaging Estimators Under Active Learning

no code implementations 11 Oct 2019 Jack Goetz, Ambuj Tewari

We generalize Stone's Theorem in the noise free setting, proving consistency for well known classifiers such as $k$-NN, histogram and kernel estimators under conditions which mirror classical results.

Active Learning

Regret Bounds for Thompson Sampling in Episodic Restless Bandit Problems

1 code implementation NeurIPS 2019 Young Hun Jung, Ambuj Tewari

These problems have been studied well from the optimization perspective, where the goal is to efficiently find a near-optimal policy when system parameters are known.

Multi-Armed Bandits Thompson Sampling

Generalization Bounds in the Predict-then-Optimize Framework

no code implementations NeurIPS 2019 Othman El Balghiti, Adam N. Elmachtoub, Paul Grigas, Ambuj Tewari

A natural loss function in this environment is to consider the cost of the decisions induced by the predicted parameters, in contrast to the prediction error of the parameters.

Generalization Bounds

Randomized Algorithms for Data-Driven Stabilization of Stochastic Linear Systems

no code implementations 16 May 2019 Mohamad Kazem Shirani Faradonbeh, Ambuj Tewari, George Michailidis

We provide numerical analyses for the performance of two methods: stochastic feedback, and stochastic parameter.

No-regret Exploration in Contextual Reinforcement Learning

no code implementations 14 Mar 2019 Aditya Modi, Ambuj Tewari

We consider the recently proposed reinforcement learning (RL) framework of Contextual Markov Decision Processes (CMDP), where the agent interacts with a (potentially adversarial) sequence of episodic tabular MDPs.

reinforcement-learning Reinforcement Learning (RL)

On the Optimality of Perturbations in Stochastic and Adversarial Multi-armed Bandit Problems

2 code implementations NeurIPS 2019 Baekjin Kim, Ambuj Tewari

We investigate the optimality of perturbation based algorithms in the stochastic and adversarial multi-armed bandit problems.

Active Learning for Non-Parametric Regression Using Purely Random Trees

1 code implementation NeurIPS 2018 Jack Goetz, Ambuj Tewari, Paul Zimmerman

Active learning is the task of using labelled data to select additional points to label, with the goal of fitting the most accurate model with a fixed budget of labelled points.

Active Learning Binary Classification +2

Input Perturbations for Adaptive Control and Learning

no code implementations 10 Nov 2018 Mohamad Kazem Shirani Faradonbeh, Ambuj Tewari, George Michailidis

This paper studies adaptive algorithms for simultaneous regulation (i.e., control) and estimation (i.e., learning) of Multiple Input Multiple Output (MIMO) linear dynamical systems.

Online Multiclass Boosting with Bandit Feedback

1 code implementation 11 Oct 2018 Daniel T. Zhang, Young Hun Jung, Ambuj Tewari

We propose an unbiased estimate of the loss using a randomized prediction, allowing the model to update its weak learners with limited information.

General Classification

Fighting Contextual Bandits with Stochastic Smoothing

no code implementations 11 Oct 2018 Young Hun Jung, Ambuj Tewari

We propose a general algorithm template that represents random perturbation based algorithms and identify several perturbation distributions that lead to strong regret bounds.

Multi-Armed Bandits

On the Approximation Properties of Random ReLU Features

1 code implementation 10 Oct 2018 Yitong Sun, Anna Gilbert, Ambuj Tewari

We study the approximation properties of random ReLU features through their reproducing kernel Hilbert space (RKHS).

But How Does It Work in Theory? Linear SVM with Random Features

1 code implementation NeurIPS 2018 Yitong Sun, Anna Gilbert, Ambuj Tewari

We prove that, under low noise assumptions, the support vector machine with $N\ll m$ random features (RFSVM) can achieve a learning rate faster than $O(1/\sqrt{m})$ on a training set with $m$ samples when an optimized feature map is used.

feature selection

Finite Time Adaptive Stabilization of LQ Systems

no code implementations 22 Jul 2018 Mohamad Kazem Shirani Faradonbeh, Ambuj Tewari, George Michailidis

There are only a few existing non-asymptotic results and a full treatment of the problem is not currently available.

Online Learning via the Differential Privacy Lens

no code implementations NeurIPS 2019 Jacob Abernethy, Young Hun Jung, Chansoo Lee, Audra McMillan, Ambuj Tewari

In this paper, we use differential privacy as a lens to examine online learning in both full and partial information settings.

Multi-Armed Bandits

Optimism-Based Adaptive Regulation of Linear-Quadratic Systems

no code implementations 20 Nov 2017 Mohamad Kazem Shirani Faradonbeh, Ambuj Tewari, George Michailidis

The main challenge for adaptive regulation of linear-quadratic systems is the trade-off between identification and control.

Markov Decision Processes with Continuous Side Information

no code implementations 15 Nov 2017 Aditya Modi, Nan Jiang, Satinder Singh, Ambuj Tewari

Because our lower bound has an exponential dependence on the dimension, we consider a tractable linear setting where the context is used to create linear combinations of a finite set of MDPs.

PAC learning Reinforcement Learning (RL)

Online Boosting Algorithms for Multi-label Ranking

no code implementations 23 Oct 2017 Young Hun Jung, Ambuj Tewari

We consider the multi-label ranking approach to multi-label learning.

Multi-Label Learning

An Actor-Critic Contextual Bandit Algorithm for Personalized Mobile Health Interventions

no code implementations 28 Jun 2017 Huitian Lei, Yangyi Lu, Ambuj Tewari, Susan A. Murphy

Increasing technological sophistication and widespread use of smartphones and wearable devices provide opportunities for innovative and highly personalized health interventions.

Online Multiclass Boosting

1 code implementation NeurIPS 2017 Young Hun Jung, Jack Goetz, Ambuj Tewari

Recent work has extended the theoretical analysis of boosting algorithms to multiclass problems and to online settings.

Binary Classification General Classification

Beyond the Hazard Rate: More Perturbation Algorithms for Adversarial Multi-armed Bandits

no code implementations 17 Feb 2017 Zifan Li, Ambuj Tewari

Assuming that the hazard rate is bounded, it is possible to provide regret analyses for a variety of FTPL algorithms for the multi-armed bandit problem.

Multi-Armed Bandits

Sampled Fictitious Play is Hannan Consistent

no code implementations 5 Oct 2016 Zifan Li, Ambuj Tewari

Fictitious play is a simple and widely studied adaptive heuristic for playing repeated games.

Phased Exploration with Greedy Exploitation in Stochastic Combinatorial Partial Monitoring Games

no code implementations NeurIPS 2016 Sougata Chaudhuri, Ambuj Tewari

The implementation of their algorithm depends on two separate offline oracles and the distribution dependent regret additionally requires existence of a unique optimal action for the learner.

Online Learning to Rank with Top-k Feedback

no code implementations 23 Aug 2016 Sougata Chaudhuri, Ambuj Tewari

We consider two settings of online learning to rank where feedback is restricted to top ranked items.

Learning-To-Rank

Mixture Proportion Estimation via Kernel Embedding of Distributions

no code implementations 8 Mar 2016 Harish G. Ramaswamy, Clayton Scott, Ambuj Tewari

Mixture proportion estimation (MPE) is the problem of estimating the weight of a component distribution in a mixture, given samples from the mixture and component.

Anomaly Detection Weakly-supervised Learning

Online Learning to Rank with Feedback at the Top

no code implementations 6 Mar 2016 Sougata Chaudhuri, Ambuj Tewari

We consider an online learning to rank setting in which, at each round, an oblivious adversary generates a list of $m$ documents, pertaining to a query, and the learner produces scores to rank the documents.

Learning-To-Rank

Generalization error bounds for learning to rank: Does the length of document lists matter?

no code implementations 6 Mar 2016 Ambuj Tewari, Sougata Chaudhuri

We consider the generalization ability of algorithms for learning to rank at a query level, a problem also called subset ranking.

Learning-To-Rank

Lasso Guarantees for Time Series Estimation Under Subgaussian Tails and $\beta$-Mixing

no code implementations 12 Feb 2016 Kam Chung Wong, Zifan Li, Ambuj Tewari

Many theoretical results on estimation of high dimensional time series require specifying an underlying data generating model (DGM).

Time Series Time Series Analysis

Fighting Bandits with a New Kind of Smoothness

no code implementations NeurIPS 2015 Jacob Abernethy, Chansoo Lee, Ambuj Tewari

We define a novel family of algorithms for the adversarial multi-armed bandit problem, and provide a simple analysis technique based on convex smoothing.

Alternating Minimization for Regression Problems with Vector-valued Outputs

no code implementations NeurIPS 2015 Prateek Jain, Ambuj Tewari

In regression problems involving vector-valued outputs (or equivalently, multiple responses), it is well known that the maximum likelihood estimator (MLE), which takes noise covariance structure into account, can be significantly more accurate than the ordinary least squares (OLS) estimator.

regression

Handling Class Imbalance in Link Prediction using Learning to Rank Techniques

no code implementations 13 Nov 2015 Bopeng Li, Sougata Chaudhuri, Ambuj Tewari

We consider the link prediction problem in a partially observed network, where the objective is to make predictions in the unobserved portion of the network.

Binary Classification Learning-To-Rank +1

Perceptron like Algorithms for Online Learning to Rank

no code implementations 4 Aug 2015 Sougata Chaudhuri, Ambuj Tewari

We show that, if there exists a perfect oracle ranker which can correctly rank each instance in an online sequence of ranking data, with some margin, the cumulative loss of the perceptron algorithm on that sequence is bounded by a constant, irrespective of the length of the sequence.

General Classification Information Retrieval +2

Spectral Smoothing via Random Matrix Perturbations

no code implementations 10 Jul 2015 Jacob Abernethy, Chansoo Lee, Ambuj Tewari

Smoothing the maximum eigenvalue function is important for applications in semidefinite optimization and online learning.

Consistent Algorithms for Multiclass Classification with a Reject Option

no code implementations 15 May 2015 Harish G. Ramaswamy, Ambuj Tewari, Shivani Agarwal

We consider the problem of $n$-class classification ($n\geq 2$), where the classifier can choose to abstain from making predictions at a given cost, say, a factor $\alpha$ of the cost of misclassification.

Classification General Classification

On Iterative Hard Thresholding Methods for High-dimensional M-Estimation

no code implementations NeurIPS 2014 Prateek Jain, Ambuj Tewari, Purushottam Kar

Our results rely on a general analysis framework that enables us to analyze several popular hard thresholding style algorithms (such as HTP, CoSaMP, SP) in the high dimensional regression setting.

regression Vocal Bursts Intensity Prediction

Online Ranking with Top-1 Feedback

no code implementations 5 Oct 2014 Sougata Chaudhuri, Ambuj Tewari

We consider a setting where a system learns to rank a fixed set of $m$ items.

Online Linear Optimization via Smoothing

no code implementations 23 May 2014 Jacob Abernethy, Chansoo Lee, Abhinav Sinha, Ambuj Tewari

We present a new optimization-theoretic approach to analyzing Follow-the-Leader style algorithms, particularly in the setting where perturbations are used as a tool for regularization.

Perceptron-like Algorithms and Generalization Bounds for Learning to Rank

no code implementations 3 May 2014 Sougata Chaudhuri, Ambuj Tewari

En route to developing the online algorithm and generalization bound, we propose a novel family of listwise large margin ranking surrogates.

Generalization Bounds Learning-To-Rank +1

On Lipschitz Continuity and Smoothness of Loss Functions in Learning to Rank

no code implementations 3 May 2014 Ambuj Tewari, Sougata Chaudhuri

In binary classification and regression problems, it is well understood that Lipschitz continuity and smoothness of the loss function play key roles in governing generalization error bounds for empirical risk minimization algorithms.

Binary Classification Learning-To-Rank

Convex Calibrated Surrogates for Low-Rank Loss Matrices with Applications to Subset Ranking Losses

no code implementations NeurIPS 2013 Harish G. Ramaswamy, Shivani Agarwal, Ambuj Tewari

The design of convex, calibrated surrogate losses, whose minimization entails consistency with respect to a desired target loss, is an important concept to have emerged in the theory of machine learning in recent years.

Learning with Noisy Labels

no code implementations NeurIPS 2013 Nagarajan Natarajan, Inderjit S. Dhillon, Pradeep K. Ravikumar, Ambuj Tewari

In this paper, we theoretically study the problem of binary classification in the presence of random classification noise: the learner, instead of seeing the true labels, sees labels that have independently been flipped with some small probability.

Binary Classification General Classification +1

Online Learning: Stochastic, Constrained, and Smoothed Adversaries

no code implementations NeurIPS 2011 Alexander Rakhlin, Karthik Sridharan, Ambuj Tewari

We define the minimax value of a game where the adversary is restricted in his moves, capturing stochastic and non-stochastic assumptions on data.

Learning Theory

Greedy Algorithms for Structurally Constrained High Dimensional Problems

no code implementations NeurIPS 2011 Ambuj Tewari, Pradeep K. Ravikumar, Inderjit S. Dhillon

A hallmark of modern machine learning is its ability to deal with high dimensional problems by exploiting structural assumptions that limit the degrees of freedom in the underlying model.

Vocal Bursts Intensity Prediction

Nearest Neighbor based Greedy Coordinate Descent

no code implementations NeurIPS 2011 Inderjit S. Dhillon, Pradeep K. Ravikumar, Ambuj Tewari

In particular, we investigate the greedy coordinate descent algorithm, and note that performing the greedy step efficiently weakens the costly dependence on the problem size provided the solution is sparse.

On the Universality of Online Mirror Descent

no code implementations NeurIPS 2011 Nati Srebro, Karthik Sridharan, Ambuj Tewari

We show that for a general class of convex online learning problems, Mirror Descent can always achieve a (nearly) optimal regret guarantee.

Orthogonal Matching Pursuit with Replacement

no code implementations NeurIPS 2011 Prateek Jain, Ambuj Tewari, Inderjit S. Dhillon

Our proof techniques are novel and flexible enough to also permit the tightest known analysis of popular iterative algorithms such as CoSaMP and Subspace Pursuit.

Smoothness, Low Noise and Fast Rates

no code implementations NeurIPS 2010 Nathan Srebro, Karthik Sridharan, Ambuj Tewari

We establish an excess risk bound of $O(H R_n^2 + \sqrt{H L^*} R_n)$ for ERM with an $H$-smooth loss function and a hypothesis class with Rademacher complexity $R_n$, where $L^*$ is the best risk achievable by the hypothesis class.

Online Learning via Sequential Complexities

no code implementations 6 Jun 2010 Alexander Rakhlin, Karthik Sridharan, Ambuj Tewari

We consider the problem of sequential prediction and provide tools to study the minimax value of the associated game.

Learning Theory

Learning Exponential Families in High-Dimensions: Strong Convexity and Sparsity

no code implementations 31 Oct 2009 Sham M. Kakade, Ohad Shamir, Karthik Sridharan, Ambuj Tewari

The versatility of exponential families, along with their attendant convexity properties, makes them a popular and effective statistical model.

Vocal Bursts Intensity Prediction

On the Generalization Ability of Online Strongly Convex Programming Algorithms

no code implementations NeurIPS 2008 Sham M. Kakade, Ambuj Tewari

This paper examines the generalization properties of online convex programming algorithms when the loss function is Lipschitz and strongly convex.

Optimistic Linear Programming gives Logarithmic Regret for Irreducible MDPs

no code implementations NeurIPS 2007 Ambuj Tewari, Peter L. Bartlett

OLP is closely related to an algorithm proposed by Burnetas and Katehakis with four key differences: OLP is simpler, it does not require knowledge of the supports of transition probabilities and the proof of the regret bound is simpler, but our regret bound is a constant factor larger than the regret of their algorithm.
