Search Results for author: Ambuj Tewari

Found 75 papers, 9 papers with code

On the Statistical Benefits of Curriculum Learning

no code implementations 13 Nov 2021 Ziping Xu, Ambuj Tewari

For both settings, we derive the minimax rates for CL with the oracle that provides the optimal curriculum and without the oracle, where the agent has to adaptively learn a good curriculum.

Curriculum Learning

Online Learning in Adversarial MDPs: Is the Communicating Case Harder than Ergodic?

no code implementations 3 Nov 2021 Gautam Chandrasekaran, Ambuj Tewari

We study online learning in adversarial communicating Markov Decision Processes with full information.

Bandit Algorithms for Precision Medicine

no code implementations 10 Aug 2021 Yangyi Lu, Ziping Xu, Ambuj Tewari

However, the modern precision medicine movement has been enabled by a confluence of events: scientific advances in fields such as genetics and pharmacology, technological advances in mobile devices and wearable sensors, and methodological advances in computing and data sciences.

Weighted Gaussian Process Bandits for Non-stationary Environments

no code implementations 6 Jul 2021 Yuntian Deng, Xingyu Zhou, Baekjin Kim, Ambuj Tewari, Abhishek Gupta, Ness Shroff

To this end, we develop WGP-UCB, a novel UCB-type algorithm based on weighted Gaussian process regression.

Causal Bandits with Unknown Graph Structure

no code implementations NeurIPS 2021 Yangyi Lu, Amirhossein Meisami, Ambuj Tewari

In causal bandit problems, the action set consists of interventions on variables of a causal graph.

Representation Learning Beyond Linear Prediction Functions

no code implementations NeurIPS 2021 Ziping Xu, Ambuj Tewari

This motivates us to ask whether diversity can be achieved when source tasks and the target task use different prediction function spaces beyond linear functions.

Representation Learning

Causal Markov Decision Processes: Learning Good Interventions Efficiently

no code implementations 15 Feb 2021 Yangyi Lu, Amirhossein Meisami, Ambuj Tewari

We introduce causal Markov Decision Processes (C-MDPs), a new formalism for sequential decision making which combines the standard MDP formulation with causal structures over state transition and reward functions.

Decision Making

Federated Learning via Synthetic Data

no code implementations 11 Aug 2020 Jack Goetz, Ambuj Tewari

Federated learning allows for the training of a model using data on multiple clients without the clients transmitting that raw data.

Federated Learning

Low-Rank Generalized Linear Bandit Problems

no code implementations 4 Jun 2020 Yangyi Lu, Amirhossein Meisami, Ambuj Tewari

To get around the computational intractability of covering-based approaches, we propose an efficient algorithm by extending the "Explore-Subspace-Then-Refine" algorithm of Jun et al. (2019).

On Learnability under General Stochastic Processes

no code implementations 15 May 2020 A. Philip Dawid, Ambuj Tewari

Statistical learning theory under independent and identically distributed (iid) sampling and online learning theory for worst case individual sequences are two of the best developed branches of learning theory.

Learning Theory

Randomized Exploration for Non-Stationary Stochastic Linear Bandits

2 code implementations 11 Dec 2019 Baekjin Kim, Ambuj Tewari

We investigate two perturbation approaches to overcome the conservatism that optimism-based algorithms chronically suffer from in practice.

Online Boosting for Multilabel Ranking with Top-k Feedback

no code implementations 24 Oct 2019 Vinod Raman, Daniel T. Zhang, Young Hun Jung, Ambuj Tewari

We present online boosting algorithms for multilabel ranking with top-k feedback, where the learner only receives information about the top k items from the ranking it provides.

Sample Complexity of Reinforcement Learning using Linearly Combined Model Ensembles

no code implementations 23 Oct 2019 Aditya Modi, Nan Jiang, Ambuj Tewari, Satinder Singh

As an extension, we also consider the more challenging problem of model selection, where the state features are unknown and can be chosen from a large candidate set.

Model Selection

Thompson Sampling in Non-Episodic Restless Bandits

no code implementations 12 Oct 2019 Young Hun Jung, Marc Abeille, Ambuj Tewari

Restless bandit problems assume time-varying reward distributions of the arms, which adds flexibility to the model but makes the analysis more challenging.

What You See May Not Be What You Get: UCB Bandit Algorithms Robust to ε-Contamination

no code implementations 12 Oct 2019 Laura Niss, Ambuj Tewari

We define the $\varepsilon$-contaminated stochastic bandit problem and use our robust mean estimators to give two variants of a robust Upper Confidence Bound (UCB) algorithm, crUCB.

Not All are Made Equal: Consistency of Weighted Averaging Estimators Under Active Learning

no code implementations 11 Oct 2019 Jack Goetz, Ambuj Tewari

We generalize Stone's Theorem in the noise free setting, proving consistency for well known classifiers such as $k$-NN, histogram and kernel estimators under conditions which mirror classical results.

Active Learning

Regret Analysis of Bandit Problems with Causal Background Knowledge

no code implementations 11 Oct 2019 Yangyi Lu, Amirhossein Meisami, Ambuj Tewari, Zhenyu Yan

For example, we observe that even with a few hundred iterations, the regret of causal algorithms is less than that of standard algorithms by a factor of three.

Regret Bounds for Thompson Sampling in Episodic Restless Bandit Problems

1 code implementation NeurIPS 2019 Young Hun Jung, Ambuj Tewari

These problems have been studied well from the optimization perspective, where the goal is to efficiently find a near-optimal policy when system parameters are known.

Multi-Armed Bandits

Generalization Bounds in the Predict-then-Optimize Framework

no code implementations NeurIPS 2019 Othman El Balghiti, Adam N. Elmachtoub, Paul Grigas, Ambuj Tewari

A natural loss function in this environment is to consider the cost of the decisions induced by the predicted parameters, in contrast to the prediction error of the parameters.

Generalization Bounds

Randomized Algorithms for Data-Driven Stabilization of Stochastic Linear Systems

no code implementations 16 May 2019 Mohamad Kazem Shirani Faradonbeh, Ambuj Tewari, George Michailidis

We provide numerical analyses for the performance of two methods: stochastic feedback, and stochastic parameter.

On Applications of Bootstrap in Continuous Space Reinforcement Learning

no code implementations 14 Mar 2019 Mohamad Kazem Shirani Faradonbeh, Ambuj Tewari, George Michailidis

In decision making problems for continuous state and action spaces, linear dynamical models are widely employed.

Decision Making

No-regret Exploration in Contextual Reinforcement Learning

no code implementations 14 Mar 2019 Aditya Modi, Ambuj Tewari

We consider the recently proposed reinforcement learning (RL) framework of Contextual Markov Decision Processes (CMDP), where the agent interacts with a (potentially adversarial) sequence of episodic tabular MDPs.

On the Optimality of Perturbations in Stochastic and Adversarial Multi-armed Bandit Problems

2 code implementations NeurIPS 2019 Baekjin Kim, Ambuj Tewari

We investigate the optimality of perturbation based algorithms in the stochastic and adversarial multi-armed bandit problems.

Active Learning for Non-Parametric Regression Using Purely Random Trees

1 code implementation NeurIPS 2018 Jack Goetz, Ambuj Tewari, Paul Zimmerman

Active learning is the task of using labelled data to select additional points to label, with the goal of fitting the most accurate model with a fixed budget of labelled points.

Active Learning, General Classification

Input Perturbations for Adaptive Control and Learning

no code implementations 10 Nov 2018 Mohamad Kazem Shirani Faradonbeh, Ambuj Tewari, George Michailidis

This paper studies adaptive algorithms for simultaneous regulation (i.e., control) and estimation (i.e., learning) of Multiple Input Multiple Output (MIMO) linear dynamical systems.

Online Multiclass Boosting with Bandit Feedback

1 code implementation 11 Oct 2018 Daniel T. Zhang, Young Hun Jung, Ambuj Tewari

We propose an unbiased estimate of the loss using a randomized prediction, allowing the model to update its weak learners with limited information.

General Classification
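The idea of an unbiased loss estimate under bandit feedback can be illustrated with a generic importance-weighting sketch. This is not taken from the paper: the function names `estimate_loss_vector` and `expected_estimate` are hypothetical, and the sketch assumes 0-1 loss, a sampling distribution with strictly positive probabilities, and feedback that only reveals whether the sampled label was correct.

```python
import numpy as np

def estimate_loss_vector(probs, predicted, was_correct):
    """Importance-weighted estimate of the 0-1 loss vector from bandit feedback.

    probs: sampling distribution over the k labels (all entries > 0).
    predicted: index of the label that was sampled and shown.
    was_correct: the only feedback received, True if predicted == true label.
    """
    estimate = np.zeros(len(probs))
    # Only the played label's loss is observed; dividing by its probability
    # makes the estimate unbiased over the learner's own randomness.
    estimate[predicted] = (0.0 if was_correct else 1.0) / probs[predicted]
    return estimate

def expected_estimate(probs, true_label):
    """Exact expectation of the estimate over the random prediction."""
    total = np.zeros(len(probs))
    for label, p in enumerate(probs):
        total += p * estimate_loss_vector(probs, label, label == true_label)
    return total
```

Averaged over the random prediction, the estimate equals the full 0-1 loss vector, even though each round reveals a single entry.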

Fighting Contextual Bandits with Stochastic Smoothing

no code implementations 11 Oct 2018 Young Hun Jung, Ambuj Tewari

We propose a general algorithm template that represents random perturbation based algorithms and identify several perturbation distributions that lead to strong regret bounds.

Multi-Armed Bandits

On the Approximation Properties of Random ReLU Features

1 code implementation 10 Oct 2018 Yitong Sun, Anna Gilbert, Ambuj Tewari

We study the approximation properties of random ReLU features through their reproducing kernel Hilbert space (RKHS).

But How Does It Work in Theory? Linear SVM with Random Features

1 code implementation NeurIPS 2018 Yitong Sun, Anna Gilbert, Ambuj Tewari

We prove that, under low noise assumptions, the support vector machine with $N\ll m$ random features (RFSVM) can achieve a learning rate faster than $O(1/\sqrt{m})$ on a training set with $m$ samples when an optimized feature map is used.

Feature Selection

Finite Time Adaptive Stabilization of LQ Systems

no code implementations 22 Jul 2018 Mohamad Kazem Shirani Faradonbeh, Ambuj Tewari, George Michailidis

There are only a few existing non-asymptotic results and a full treatment of the problem is not currently available.

Online Learning via the Differential Privacy Lens

no code implementations NeurIPS 2019 Jacob Abernethy, Young Hun Jung, Chansoo Lee, Audra McMillan, Ambuj Tewari

In this paper, we use differential privacy as a lens to examine online learning in both full and partial information settings.

Multi-Armed Bandits

Optimism-Based Adaptive Regulation of Linear-Quadratic Systems

no code implementations 20 Nov 2017 Mohamad Kazem Shirani Faradonbeh, Ambuj Tewari, George Michailidis

The main challenge for adaptive regulation of linear-quadratic systems is the trade-off between identification and control.

Markov Decision Processes with Continuous Side Information

no code implementations 15 Nov 2017 Aditya Modi, Nan Jiang, Satinder Singh, Ambuj Tewari

Because our lower bound has an exponential dependence on the dimension, we consider a tractable linear setting where the context is used to create linear combinations of a finite set of MDPs.

Online Boosting Algorithms for Multi-label Ranking

no code implementations 23 Oct 2017 Young Hun Jung, Ambuj Tewari

We consider the multi-label ranking approach to multi-label learning.

Multi-Label Learning

An Actor-Critic Contextual Bandit Algorithm for Personalized Mobile Health Interventions

no code implementations 28 Jun 2017 Huitian Lei, Ambuj Tewari, Susan A. Murphy

Increasing technological sophistication and widespread use of smartphones and wearable devices provide opportunities for innovative and highly personalized health interventions.

Online Multiclass Boosting

1 code implementation NeurIPS 2017 Young Hun Jung, Jack Goetz, Ambuj Tewari

Recent work has extended the theoretical analysis of boosting algorithms to multiclass problems and to online settings.

General Classification

Beyond the Hazard Rate: More Perturbation Algorithms for Adversarial Multi-armed Bandits

no code implementations 17 Feb 2017 Zifan Li, Ambuj Tewari

Assuming that the hazard rate is bounded, it is possible to provide regret analyses for a variety of FTPL algorithms for the multi-armed bandit problem.

Multi-Armed Bandits

Sampled Fictitious Play is Hannan Consistent

no code implementations 5 Oct 2016 Zifan Li, Ambuj Tewari

Fictitious play is a simple and widely studied adaptive heuristic for playing repeated games.
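For readers unfamiliar with the heuristic, a minimal sketch of classical (deterministic) fictitious play in a two-player zero-sum matrix game follows. It is an illustration only, not the sampled variant the paper analyzes; the function name `fictitious_play`, the opening moves, and the tie-breaking rule (lowest index) are assumptions made for the example.

```python
import numpy as np

def fictitious_play(payoff, rounds):
    """Classical fictitious play in a two-player zero-sum matrix game.

    payoff[i, j] is the row player's payoff; the column player receives
    the negative. Returns the empirical action frequencies of both players.
    """
    n_rows, n_cols = payoff.shape
    row_counts = np.zeros(n_rows)
    col_counts = np.zeros(n_cols)
    row_action, col_action = 0, 0  # arbitrary opening moves (assumption)
    for _ in range(rounds):
        row_counts[row_action] += 1
        col_counts[col_action] += 1
        # Each player best-responds to the opponent's empirical counts so far.
        row_action = int(np.argmax(payoff @ col_counts))
        col_action = int(np.argmin(row_counts @ payoff))
    return row_counts / rounds, col_counts / rounds
```

On matching pennies, `payoff = np.array([[1.0, -1.0], [-1.0, 1.0]])`, the empirical frequencies of both players drift toward the uniform mixed strategy as the number of rounds grows.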

Online Learning to Rank with Top-k Feedback

no code implementations 23 Aug 2016 Sougata Chaudhuri, Ambuj Tewari

We consider two settings of online learning to rank where feedback is restricted to top ranked items.

Learning-To-Rank

Phased Exploration with Greedy Exploitation in Stochastic Combinatorial Partial Monitoring Games

no code implementations NeurIPS 2016 Sougata Chaudhuri, Ambuj Tewari

The implementation of their algorithm depends on two separate offline oracles and the distribution dependent regret additionally requires existence of a unique optimal action for the learner.

Mixture Proportion Estimation via Kernel Embedding of Distributions

no code implementations 8 Mar 2016 Harish G. Ramaswamy, Clayton Scott, Ambuj Tewari

Mixture proportion estimation (MPE) is the problem of estimating the weight of a component distribution in a mixture, given samples from the mixture and component.

Anomaly Detection

Online Learning to Rank with Feedback at the Top

no code implementations 6 Mar 2016 Sougata Chaudhuri, Ambuj Tewari

We consider an online learning to rank setting in which, at each round, an oblivious adversary generates a list of $m$ documents, pertaining to a query, and the learner produces scores to rank the documents.

Learning-To-Rank

Generalization error bounds for learning to rank: Does the length of document lists matter?

no code implementations 6 Mar 2016 Ambuj Tewari, Sougata Chaudhuri

We consider the generalization ability of algorithms for learning to rank at a query level, a problem also called subset ranking.

Learning-To-Rank

Lasso Guarantees for Time Series Estimation Under Subgaussian Tails and $\beta$-Mixing

no code implementations 12 Feb 2016 Kam Chung Wong, Zifan Li, Ambuj Tewari

Many theoretical results on estimation of high dimensional time series require specifying an underlying data generating model (DGM).

Time Series

Fighting Bandits with a New Kind of Smoothness

no code implementations NeurIPS 2015 Jacob Abernethy, Chansoo Lee, Ambuj Tewari

We define a novel family of algorithms for the adversarial multi-armed bandit problem, and provide a simple analysis technique based on convex smoothing.

Alternating Minimization for Regression Problems with Vector-valued Outputs

no code implementations NeurIPS 2015 Prateek Jain, Ambuj Tewari

In regression problems involving vector-valued outputs (or equivalently, multiple responses), it is well known that the maximum likelihood estimator (MLE), which takes noise covariance structure into account, can be significantly more accurate than the ordinary least squares (OLS) estimator.

Handling Class Imbalance in Link Prediction using Learning to Rank Techniques

no code implementations 13 Nov 2015 Bopeng Li, Sougata Chaudhuri, Ambuj Tewari

We consider the link prediction problem in a partially observed network, where the objective is to make predictions in the unobserved portion of the network.

Learning-To-Rank, Link Prediction

Perceptron like Algorithms for Online Learning to Rank

no code implementations 4 Aug 2015 Sougata Chaudhuri, Ambuj Tewari

We show that, if there exists a perfect oracle ranker which can correctly rank each instance in an online sequence of ranking data, with some margin, the cumulative loss of perceptron algorithm on that sequence is bounded by a constant, irrespective of the length of the sequence.

General Classification, Information Retrieval +1

Spectral Smoothing via Random Matrix Perturbations

no code implementations 10 Jul 2015 Jacob Abernethy, Chansoo Lee, Ambuj Tewari

Smoothing the maximum eigenvalue function is important for applications in semidefinite optimization and online learning.

Consistent Algorithms for Multiclass Classification with a Reject Option

no code implementations 15 May 2015 Harish G. Ramaswamy, Ambuj Tewari, Shivani Agarwal

We consider the problem of $n$-class classification ($n\geq 2$), where the classifier can choose to abstain from making predictions at a given cost, say, a factor $\alpha$ of the cost of misclassification.

Classification, General Classification
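The cost trade-off in the reject-option setting can be made concrete with a small sketch of the plug-in decision rule (often called Chow's rule): abstain whenever the expected misclassification cost of the best guess exceeds the abstention cost $\alpha$. This is a textbook baseline that assumes access to class-probability estimates, not the surrogate-based algorithms of the paper; the function name `predict_with_reject` and the use of -1 to signal abstention are choices made for the example.

```python
import numpy as np

def predict_with_reject(class_probs, alpha):
    """Return the predicted class index, or -1 to abstain.

    class_probs: estimated conditional class probabilities for one input.
    alpha: cost of abstaining, as a fraction of the misclassification cost.
    """
    best = int(np.argmax(class_probs))
    # Predicting `best` has expected cost 1 - p_best; abstaining costs alpha.
    expected_error = 1.0 - class_probs[best]
    return -1 if expected_error > alpha else best
```

For instance, with `alpha = 0.3` the rule abstains on probabilities `[0.4, 0.35, 0.25]` (expected error 0.6) but predicts class 0 on `[0.9, 0.05, 0.05]` (expected error 0.1).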

On Iterative Hard Thresholding Methods for High-dimensional M-Estimation

no code implementations NeurIPS 2014 Prateek Jain, Ambuj Tewari, Purushottam Kar

Our results rely on a general analysis framework that enables us to analyze several popular hard thresholding style algorithms (such as HTP, CoSaMP, SP) in the high dimensional regression setting.

Online Ranking with Top-1 Feedback

no code implementations 5 Oct 2014 Sougata Chaudhuri, Ambuj Tewari

We consider a setting where a system learns to rank a fixed set of $m$ items.

Online Linear Optimization via Smoothing

no code implementations 23 May 2014 Jacob Abernethy, Chansoo Lee, Abhinav Sinha, Ambuj Tewari

We present a new optimization-theoretic approach to analyzing Follow-the-Leader style algorithms, particularly in the setting where perturbations are used as a tool for regularization.

On Lipschitz Continuity and Smoothness of Loss Functions in Learning to Rank

no code implementations 3 May 2014 Ambuj Tewari, Sougata Chaudhuri

In binary classification and regression problems, it is well understood that Lipschitz continuity and smoothness of the loss function play key roles in governing generalization error bounds for empirical risk minimization algorithms.

Learning-To-Rank

Perceptron-like Algorithms and Generalization Bounds for Learning to Rank

no code implementations 3 May 2014 Sougata Chaudhuri, Ambuj Tewari

En route to developing the online algorithm and generalization bound, we propose a novel family of listwise large margin ranking surrogates.

Generalization Bounds, Learning-To-Rank +1

Convex Calibrated Surrogates for Low-Rank Loss Matrices with Applications to Subset Ranking Losses

no code implementations NeurIPS 2013 Harish G. Ramaswamy, Shivani Agarwal, Ambuj Tewari

The design of convex, calibrated surrogate losses, whose minimization entails consistency with respect to a desired target loss, is an important concept to have emerged in the theory of machine learning in recent years.

Learning with Noisy Labels

no code implementations NeurIPS 2013 Nagarajan Natarajan, Inderjit S. Dhillon, Pradeep K. Ravikumar, Ambuj Tewari

In this paper, we theoretically study the problem of binary classification in the presence of random classification noise --- the learner, instead of seeing the true labels, sees labels that have independently been flipped with some small probability.

Classification, General Classification +1
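The random classification noise model described in the abstract can be simulated in a few lines. The sketch below assumes labels in {-1, +1} and a single flip probability shared by both classes; the function name `flip_labels` is hypothetical and the code only generates noisy labels, it does not implement any of the paper's learning methods.

```python
import numpy as np

def flip_labels(labels, flip_prob, rng):
    """Flip each +1/-1 label independently with probability flip_prob."""
    flips = rng.random(labels.shape) < flip_prob
    return np.where(flips, -labels, labels)

# Usage: corrupt clean labels at a 10% flip rate with a seeded generator.
rng = np.random.default_rng(0)
clean = np.ones(100000, dtype=int)
noisy = flip_labels(clean, 0.1, rng)
```

The learner in this setting only ever sees `noisy`, never `clean`.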

Orthogonal Matching Pursuit with Replacement

no code implementations NeurIPS 2011 Prateek Jain, Ambuj Tewari, Inderjit S. Dhillon

Our proof techniques are novel and flexible enough to also permit the tightest known analysis of popular iterative algorithms such as CoSaMP and Subspace Pursuit.

Greedy Algorithms for Structurally Constrained High Dimensional Problems

no code implementations NeurIPS 2011 Ambuj Tewari, Pradeep K. Ravikumar, Inderjit S. Dhillon

A hallmark of modern machine learning is its ability to deal with high dimensional problems by exploiting structural assumptions that limit the degrees of freedom in the underlying model.

Online Learning: Stochastic, Constrained, and Smoothed Adversaries

no code implementations NeurIPS 2011 Alexander Rakhlin, Karthik Sridharan, Ambuj Tewari

We define the minimax value of a game where the adversary is restricted in his moves, capturing stochastic and non-stochastic assumptions on data.

Learning Theory

Nearest Neighbor based Greedy Coordinate Descent

no code implementations NeurIPS 2011 Inderjit S. Dhillon, Pradeep K. Ravikumar, Ambuj Tewari

In particular, we investigate the greedy coordinate descent algorithm, and note that performing the greedy step efficiently weakens the costly dependence on the problem size provided the solution is sparse.

On the Universality of Online Mirror Descent

no code implementations NeurIPS 2011 Nati Srebro, Karthik Sridharan, Ambuj Tewari

We show that for a general class of convex online learning problems, Mirror Descent can always achieve a (nearly) optimal regret guarantee.
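As a concrete instance of Mirror Descent, the sketch below runs the entropic variant (also known as exponentiated gradient or multiplicative weights) on the probability simplex with linear losses. It is a standard special case chosen for illustration, not the general construction studied in the paper; the function name `entropic_mirror_descent` and the fixed step size are assumptions of the example.

```python
import numpy as np

def entropic_mirror_descent(loss_vectors, step_size):
    """Online mirror descent on the simplex with the entropy mirror map.

    loss_vectors: array of shape (T, d), one linear loss vector per round.
    Returns the (T, d) array of weight vectors played in each round.
    """
    num_rounds, dim = loss_vectors.shape
    weights = np.full(dim, 1.0 / dim)  # start at the uniform distribution
    played = np.empty((num_rounds, dim))
    for t in range(num_rounds):
        played[t] = weights
        # Multiplicative update, then renormalize back onto the simplex.
        weights = weights * np.exp(-step_size * loss_vectors[t])
        weights /= weights.sum()
    return played
```

With the entropy mirror map the update is multiplicative, so the iterates stay on the simplex without an explicit projection step.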

Smoothness, Low Noise and Fast Rates

no code implementations NeurIPS 2010 Nathan Srebro, Karthik Sridharan, Ambuj Tewari

We establish an excess risk bound of $O(H R_n^2 + \sqrt{H L^*} R_n)$ for ERM with an $H$-smooth loss function and a hypothesis class with Rademacher complexity $R_n$, where $L^*$ is the best risk achievable by the hypothesis class.

Online Learning via Sequential Complexities

no code implementations 6 Jun 2010 Alexander Rakhlin, Karthik Sridharan, Ambuj Tewari

We consider the problem of sequential prediction and provide tools to study the minimax value of the associated game.

Learning Theory

Learning Exponential Families in High-Dimensions: Strong Convexity and Sparsity

no code implementations 31 Oct 2009 Sham M. Kakade, Ohad Shamir, Karthik Sridharan, Ambuj Tewari

The versatility of exponential families, along with their attendant convexity properties, make them a popular and effective statistical model.

On the Generalization Ability of Online Strongly Convex Programming Algorithms

no code implementations NeurIPS 2008 Sham M. Kakade, Ambuj Tewari

This paper examines the generalization properties of online convex programming algorithms when the loss function is Lipschitz and strongly convex.

Optimistic Linear Programming gives Logarithmic Regret for Irreducible MDPs

no code implementations NeurIPS 2007 Ambuj Tewari, Peter L. Bartlett

OLP is closely related to an algorithm proposed by Burnetas and Katehakis with four key differences: OLP is simpler, it does not require knowledge of the supports of transition probabilities and the proof of the regret bound is simpler, but our regret bound is a constant factor larger than the regret of their algorithm.
