Search Results for author: Ambuj Tewari

Found 106 papers, 16 papers with code

Optimistic Linear Programming gives Logarithmic Regret for Irreducible MDPs

no code implementations • NeurIPS 2007 • Ambuj Tewari, Peter L. Bartlett

OLP is closely related to an algorithm proposed by Burnetas and Katehakis with four key differences: OLP is simpler, it does not require knowledge of the supports of transition probabilities, and the proof of the regret bound is simpler, but our regret bound is a constant factor larger than the regret of their algorithm.

On the Generalization Ability of Online Strongly Convex Programming Algorithms

no code implementations • NeurIPS 2008 • Sham M. Kakade, Ambuj Tewari

This paper examines the generalization properties of online convex programming algorithms when the loss function is Lipschitz and strongly convex.

On the Complexity of Linear Prediction: Risk Bounds, Margin Bounds, and Regularization

no code implementations • NeurIPS 2008 • Sham M. Kakade, Karthik Sridharan, Ambuj Tewari

We provide sharp bounds for Rademacher and Gaussian complexities of (constrained) linear classes.

Learning Exponential Families in High-Dimensions: Strong Convexity and Sparsity

no code implementations • 31 Oct 2009 • Sham M. Kakade, Ohad Shamir, Karthik Sridharan, Ambuj Tewari

The versatility of exponential families, along with their attendant convexity properties, makes them a popular and effective statistical model.

Online Learning via Sequential Complexities

no code implementations • 6 Jun 2010 • Alexander Rakhlin, Karthik Sridharan, Ambuj Tewari

We consider the problem of sequential prediction and provide tools to study the minimax value of the associated game.

Learning Theory

Online Learning: Random Averages, Combinatorial Parameters, and Learnability

no code implementations • NeurIPS 2010 • Alexander Rakhlin, Karthik Sridharan, Ambuj Tewari

We develop a theory of online learning by defining several complexity measures.

Learning Theory

Smoothness, Low Noise and Fast Rates

no code implementations • NeurIPS 2010 • Nathan Srebro, Karthik Sridharan, Ambuj Tewari

We establish an excess risk bound of $O(H R_n^2 + \sqrt{H L^*} R_n)$ for ERM with an $H$-smooth loss function and a hypothesis class with Rademacher complexity $R_n$, where $L^*$ is the best risk achievable by the hypothesis class.
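
As a worked reading of the bound above, the following sketch substitutes an assumed scaling for the Rademacher complexity. The assumption $R_n \le \sqrt{R/n}$, and the constant $R$ itself, are illustrative and do not come from this listing.

```latex
% Assumption (illustrative only): R_n \le \sqrt{R/n} for some constant R > 0.
\[
  H R_n^2 + \sqrt{H L^*}\, R_n
  \;\le\; \frac{H R}{n} + \sqrt{\frac{H R\, L^*}{n}} .
\]
% If L^* = 0, only the first term remains, giving a 1/n rate.
% If L^* > 0, the second term dominates for large n, giving a 1/\sqrt{n} rate.
```

Under that assumption, the two terms would correspond to the "fast rate" and "low noise" regimes named in the title.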

Greedy Algorithms for Structurally Constrained High Dimensional Problems

no code implementations • NeurIPS 2011 • Ambuj Tewari, Pradeep K. Ravikumar, Inderjit S. Dhillon

A hallmark of modern machine learning is its ability to deal with high dimensional problems by exploiting structural assumptions that limit the degrees of freedom in the underlying model.

On the Universality of Online Mirror Descent

no code implementations • NeurIPS 2011 • Nati Srebro, Karthik Sridharan, Ambuj Tewari

We show that for a general class of convex online learning problems, Mirror Descent can always achieve a (nearly) optimal regret guarantee.

Nearest Neighbor based Greedy Coordinate Descent

no code implementations • NeurIPS 2011 • Inderjit S. Dhillon, Pradeep K. Ravikumar, Ambuj Tewari

In particular, we investigate the greedy coordinate descent algorithm, and note that performing the greedy step efficiently weakens the costly dependence on the problem size provided the solution is sparse.

Online Learning: Stochastic, Constrained, and Smoothed Adversaries

no code implementations • NeurIPS 2011 • Alexander Rakhlin, Karthik Sridharan, Ambuj Tewari

We define the minimax value of a game where the adversary is restricted in his moves, capturing stochastic and non-stochastic assumptions on data.

Learning Theory

Orthogonal Matching Pursuit with Replacement

no code implementations • NeurIPS 2011 • Prateek Jain, Ambuj Tewari, Inderjit S. Dhillon

Our proof techniques are novel and flexible enough to also permit the tightest known analysis of popular iterative algorithms such as CoSaMP and Subspace Pursuit.

Feature Clustering for Accelerating Parallel Coordinate Descent

no code implementations • NeurIPS 2012 • Chad Scherrer, Ambuj Tewari, Mahantesh Halappanavar, David Haglin

We give a unified convergence analysis for the family of block-greedy algorithms.

Clustering

Convex Calibrated Surrogates for Low-Rank Loss Matrices with Applications to Subset Ranking Losses

no code implementations • NeurIPS 2013 • Harish G. Ramaswamy, Shivani Agarwal, Ambuj Tewari

The design of convex, calibrated surrogate losses, whose minimization entails consistency with respect to a desired target loss, is an important concept to have emerged in the theory of machine learning in recent years.

Learning with Noisy Labels

no code implementations • NeurIPS 2013 • Nagarajan Natarajan, Inderjit S. Dhillon, Pradeep K. Ravikumar, Ambuj Tewari

In this paper, we theoretically study the problem of binary classification in the presence of random classification noise --- the learner, instead of seeing the true labels, sees labels that have independently been flipped with some small probability.

Binary Classification General Classification +1
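
The noise model described in this abstract can be simulated in a few lines. This is a minimal sketch, not the paper's method: the function name `flip_labels`, the single flip probability `rho`, and the `seed` parameter are assumptions made for illustration.

```python
import random


def flip_labels(labels, rho, seed=0):
    """Return a copy of `labels` (values in {+1, -1}) where each label is
    flipped independently with probability `rho` (hypothetical helper)."""
    rng = random.Random(seed)  # local generator so results are reproducible
    noisy = []
    for y in labels:
        # rng.random() is uniform on [0, 1), so this event has probability rho
        noisy.append(-y if rng.random() < rho else y)
    return noisy


if __name__ == "__main__":
    clean = [1, -1] * 5000
    noisy = flip_labels(clean, rho=0.1)
    flipped = sum(1 for c, n in zip(clean, noisy) if c != n)
    print(f"fraction of labels flipped: {flipped / len(clean):.3f}")
```

A learner in this setting would only ever see `noisy`, never `clean`.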

On Lipschitz Continuity and Smoothness of Loss Functions in Learning to Rank

no code implementations • 3 May 2014 • Ambuj Tewari, Sougata Chaudhuri

In binary classification and regression problems, it is well understood that Lipschitz continuity and smoothness of the loss function play key roles in governing generalization error bounds for empirical risk minimization algorithms.

Binary Classification Learning-To-Rank

Perceptron-like Algorithms and Generalization Bounds for Learning to Rank

no code implementations • 3 May 2014 • Sougata Chaudhuri, Ambuj Tewari

En route to developing the online algorithm and generalization bound, we propose a novel family of listwise large margin ranking surrogates.

Generalization Bounds Learning-To-Rank +1

Online Linear Optimization via Smoothing

no code implementations • 23 May 2014 • Jacob Abernethy, Chansoo Lee, Abhinav Sinha, Ambuj Tewari

We present a new optimization-theoretic approach to analyzing Follow-the-Leader style algorithms, particularly in the setting where perturbations are used as a tool for regularization.

Online Ranking with Top-1 Feedback

no code implementations • 5 Oct 2014 • Sougata Chaudhuri, Ambuj Tewari

We consider a setting where a system learns to rank a fixed set of $m$ items.

On Iterative Hard Thresholding Methods for High-dimensional M-Estimation

no code implementations • NeurIPS 2014 • Prateek Jain, Ambuj Tewari, Purushottam Kar

Our results rely on a general analysis framework that enables us to analyze several popular hard thresholding style algorithms (such as HTP, CoSaMP, SP) in the high dimensional regression setting.

regression

Consistent Algorithms for Multiclass Classification with a Reject Option

no code implementations • 15 May 2015 • Harish G. Ramaswamy, Ambuj Tewari, Shivani Agarwal

We consider the problem of $n$-class classification ($n\geq 2$), where the classifier can choose to abstain from making predictions at a given cost, say, a factor $\alpha$ of the cost of misclassification.

Classification General Classification

Spectral Smoothing via Random Matrix Perturbations

no code implementations • 10 Jul 2015 • Jacob Abernethy, Chansoo Lee, Ambuj Tewari

Smoothing the maximum eigenvalue function is important for applications in semidefinite optimization and online learning.

Perceptron like Algorithms for Online Learning to Rank

no code implementations • 4 Aug 2015 • Sougata Chaudhuri, Ambuj Tewari

We show that, if there exists a perfect oracle ranker which can correctly rank each instance in an online sequence of ranking data, with some margin, the cumulative loss of the perceptron algorithm on that sequence is bounded by a constant, irrespective of the length of the sequence.

General Classification Information Retrieval +2
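
The length-independent bound in this abstract has a familiar analogue in binary classification. The sketch below runs the classical binary perceptron (not the ranking algorithm of this paper) on a margin-separable sequence and counts mistakes. The function name and the toy data are assumptions chosen for illustration.

```python
def perceptron_mistakes(examples):
    """Run the classical perceptron once over `examples`, a list of
    (feature_tuple, label) pairs with labels in {+1, -1}.
    Returns (weights, number_of_mistakes)."""
    dim = len(examples[0][0])
    w = [0.0] * dim
    mistakes = 0
    for x, y in examples:
        score = sum(wi * xi for wi, xi in zip(w, x))
        if y * score <= 0:  # wrong (or undecided) prediction: update
            mistakes += 1
            w = [wi + y * xi for wi, xi in zip(w, x)]
    return w, mistakes


# Toy data separable by u = (1, 0) with margin 0.5; squared norms are at most 1.25.
base = [((1.0, 0.5), 1), ((-1.0, 0.5), -1), ((0.5, -1.0), 1), ((-0.5, -1.0), -1)]

if __name__ == "__main__":
    for repeats in (50, 500):
        _, m = perceptron_mistakes(base * repeats)
        # Classical mistake bound: (R / margin)^2 = 1.25 / 0.25 = 5
        print(f"sequence length {4 * repeats}: {m} mistakes (bound 5)")
```

Making the sequence ten times longer does not increase the mistake count, which is the binary version of the "irrespective of the length of the sequence" claim.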

Handling Class Imbalance in Link Prediction using Learning to Rank Techniques

no code implementations • 13 Nov 2015 • Bopeng Li, Sougata Chaudhuri, Ambuj Tewari

We consider the link prediction problem in a partially observed network, where the objective is to make predictions in the unobserved portion of the network.

Binary Classification Learning-To-Rank +1

Predtron: A Family of Online Algorithms for General Prediction Problems

no code implementations • NeurIPS 2015 • Prateek Jain, Nagarajan Natarajan, Ambuj Tewari

We offer a general framework to derive mistake driven online algorithms and associated loss bounds.

Binary Classification Classification +2

Alternating Minimization for Regression Problems with Vector-valued Outputs

no code implementations • NeurIPS 2015 • Prateek Jain, Ambuj Tewari

In regression problems involving vector-valued outputs (or equivalently, multiple responses), it is well known that the maximum likelihood estimator (MLE), which takes noise covariance structure into account, can be significantly more accurate than the ordinary least squares (OLS) estimator.

regression

Fighting Bandits with a New Kind of Smoothness

no code implementations • NeurIPS 2015 • Jacob Abernethy, Chansoo Lee, Ambuj Tewari

We define a novel family of algorithms for the adversarial multi-armed bandit problem, and provide a simple analysis technique based on convex smoothing.

Lasso Guarantees for Time Series Estimation Under Subgaussian Tails and $\beta$-Mixing

no code implementations • 12 Feb 2016 • Kam Chung Wong, Zifan Li, Ambuj Tewari

Many theoretical results on estimation of high dimensional time series require specifying an underlying data generating model (DGM).

Time Series Time Series Analysis

Generalization error bounds for learning to rank: Does the length of document lists matter?

no code implementations • 6 Mar 2016 • Ambuj Tewari, Sougata Chaudhuri

We consider the generalization ability of algorithms for learning to rank at a query level, a problem also called subset ranking.

Learning-To-Rank

Online Learning to Rank with Feedback at the Top

no code implementations • 6 Mar 2016 • Sougata Chaudhuri, Ambuj Tewari

We consider an online learning to rank setting in which, at each round, an oblivious adversary generates a list of $m$ documents, pertaining to a query, and the learner produces scores to rank the documents.

Learning-To-Rank

Mixture Proportion Estimation via Kernel Embedding of Distributions

no code implementations • 8 Mar 2016 • Harish G. Ramaswamy, Clayton Scott, Ambuj Tewari

Mixture proportion estimation (MPE) is the problem of estimating the weight of a component distribution in a mixture, given samples from the mixture and component.

Anomaly Detection Weakly-supervised Learning

Online Learning to Rank with Top-k Feedback

no code implementations • 23 Aug 2016 • Sougata Chaudhuri, Ambuj Tewari

We consider two settings of online learning to rank where feedback is restricted to top ranked items.

Learning-To-Rank

Phased Exploration with Greedy Exploitation in Stochastic Combinatorial Partial Monitoring Games

no code implementations • NeurIPS 2016 • Sougata Chaudhuri, Ambuj Tewari

The implementation of their algorithm depends on two separate offline oracles and the distribution dependent regret additionally requires existence of a unique optimal action for the learner.

Sampled Fictitious Play is Hannan Consistent

no code implementations • 5 Oct 2016 • Zifan Li, Ambuj Tewari

Fictitious play is a simple and widely studied adaptive heuristic for playing repeated games.

Beyond the Hazard Rate: More Perturbation Algorithms for Adversarial Multi-armed Bandits

no code implementations • 17 Feb 2017 • Zifan Li, Ambuj Tewari

Assuming that the hazard rate is bounded, it is possible to provide regret analyses for a variety of FTPL algorithms for the multi-armed bandit problem.

Multi-Armed Bandits

Online Multiclass Boosting

1 code implementation • NeurIPS 2017 • Young Hun Jung, Jack Goetz, Ambuj Tewari

Recent work has extended the theoretical analysis of boosting algorithms to multiclass problems and to online settings.

Binary Classification General Classification

An Actor-Critic Contextual Bandit Algorithm for Personalized Mobile Health Interventions

no code implementations • 28 Jun 2017 • Huitian Lei, Yangyi Lu, Ambuj Tewari, Susan A. Murphy

Increasing technological sophistication and widespread use of smartphones and wearable devices provide opportunities for innovative and highly personalized health interventions.

Online Boosting Algorithms for Multi-label Ranking

no code implementations • 23 Oct 2017 • Young Hun Jung, Ambuj Tewari

We consider the multi-label ranking approach to multi-label learning.

Multi-Label Learning

Markov Decision Processes with Continuous Side Information

no code implementations • 15 Nov 2017 • Aditya Modi, Nan Jiang, Satinder Singh, Ambuj Tewari

Because our lower bound has an exponential dependence on the dimension, we consider a tractable linear setting where the context is used to create linear combinations of a finite set of MDPs.

PAC learning Reinforcement Learning (RL)

Optimism-Based Adaptive Regulation of Linear-Quadratic Systems

no code implementations • 20 Nov 2017 • Mohamad Kazem Shirani Faradonbeh, Ambuj Tewari, George Michailidis

The main challenge for adaptive regulation of linear-quadratic systems is the trade-off between identification and control.

Online Learning via the Differential Privacy Lens

no code implementations • NeurIPS 2019 • Jacob Abernethy, Young Hun Jung, Chansoo Lee, Audra McMillan, Ambuj Tewari

In this paper, we use differential privacy as a lens to examine online learning in both full and partial information settings.

Multi-Armed Bandits

Finite Time Adaptive Stabilization of LQ Systems

no code implementations • 22 Jul 2018 • Mohamad Kazem Shirani Faradonbeh, Ambuj Tewari, George Michailidis

There are only a few existing non-asymptotic results and a full treatment of the problem is not currently available.

But How Does It Work in Theory? Linear SVM with Random Features

1 code implementation • NeurIPS 2018 • Yitong Sun, Anna Gilbert, Ambuj Tewari

We prove that, under low noise assumptions, the support vector machine with $N\ll m$ random features (RFSVM) can achieve a learning rate faster than $O(1/\sqrt{m})$ on a training set with $m$ samples when an optimized feature map is used.

feature selection

On the Approximation Properties of Random ReLU Features

1 code implementation • 10 Oct 2018 • Yitong Sun, Anna Gilbert, Ambuj Tewari

We study the approximation properties of random ReLU features through their reproducing kernel Hilbert space (RKHS).

Fighting Contextual Bandits with Stochastic Smoothing

no code implementations • 11 Oct 2018 • Young Hun Jung, Ambuj Tewari

We propose a general algorithm template that represents random perturbation based algorithms and identify several perturbation distributions that lead to strong regret bounds.

Multi-Armed Bandits

Online Multiclass Boosting with Bandit Feedback

1 code implementation • 11 Oct 2018 • Daniel T. Zhang, Young Hun Jung, Ambuj Tewari

We propose an unbiased estimate of the loss using a randomized prediction, allowing the model to update its weak learners with limited information.

General Classification

Input Perturbations for Adaptive Control and Learning

no code implementations • 10 Nov 2018 • Mohamad Kazem Shirani Faradonbeh, Ambuj Tewari, George Michailidis

This paper studies adaptive algorithms for simultaneous regulation (i.e., control) and estimation (i.e., learning) of Multiple Input Multiple Output (MIMO) linear dynamical systems.

Active Learning for Non-Parametric Regression Using Purely Random Trees

1 code implementation • NeurIPS 2018 • Jack Goetz, Ambuj Tewari, Paul Zimmerman

Active learning is the task of using labelled data to select additional points to label, with the goal of fitting the most accurate model with a fixed budget of labelled points.

Active Learning Binary Classification +2

On the Optimality of Perturbations in Stochastic and Adversarial Multi-armed Bandit Problems

2 code implementations • NeurIPS 2019 • Baekjin Kim, Ambuj Tewari

We investigate the optimality of perturbation based algorithms in the stochastic and adversarial multi-armed bandit problems.

No-regret Exploration in Contextual Reinforcement Learning

no code implementations • 14 Mar 2019 • Aditya Modi, Ambuj Tewari

We consider the recently proposed reinforcement learning (RL) framework of Contextual Markov Decision Processes (CMDP), where the agent interacts with a (potentially adversarial) sequence of episodic tabular MDPs.

reinforcement-learning Reinforcement Learning (RL)

On Applications of Bootstrap in Continuous Space Reinforcement Learning

no code implementations • 14 Mar 2019 • Mohamad Kazem Shirani Faradonbeh, Ambuj Tewari, George Michailidis

In decision making problems for continuous state and action spaces, linear dynamical models are widely employed.

Decision Making reinforcement-learning +1

Randomized Algorithms for Data-Driven Stabilization of Stochastic Linear Systems

no code implementations • 16 May 2019 • Mohamad Kazem Shirani Faradonbeh, Ambuj Tewari, George Michailidis

We provide numerical analyses for the performance of two methods: stochastic feedback, and stochastic parameter.

Generalization Bounds in the Predict-then-Optimize Framework

no code implementations • NeurIPS 2019 • Othman El Balghiti, Adam N. Elmachtoub, Paul Grigas, Ambuj Tewari

A natural loss function in this environment is to consider the cost of the decisions induced by the predicted parameters, in contrast to the prediction error of the parameters.

Generalization Bounds

Regret Bounds for Thompson Sampling in Episodic Restless Bandit Problems

1 code implementation • NeurIPS 2019 • Young Hun Jung, Ambuj Tewari

These problems have been studied well from the optimization perspective, where the goal is to efficiently find a near-optimal policy when system parameters are known.

Multi-Armed Bandits Thompson Sampling

Regret Analysis of Bandit Problems with Causal Background Knowledge

no code implementations • 11 Oct 2019 • Yangyi Lu, Amirhossein Meisami, Ambuj Tewari, Zhenyu Yan

For example, we observe that even with a few hundred iterations, the regret of causal algorithms is less than that of standard algorithms by a factor of three.

Thompson Sampling

Not All are Made Equal: Consistency of Weighted Averaging Estimators Under Active Learning

no code implementations • 11 Oct 2019 • Jack Goetz, Ambuj Tewari

We generalize Stone's Theorem in the noise free setting, proving consistency for well known classifiers such as $k$-NN, histogram and kernel estimators under conditions which mirror classical results.

Active Learning

Thompson Sampling in Non-Episodic Restless Bandits

no code implementations • 12 Oct 2019 • Young Hun Jung, Marc Abeille, Ambuj Tewari

Restless bandit problems assume time-varying reward distributions of the arms, which adds flexibility to the model but makes the analysis more challenging.

Thompson Sampling

What You See May Not Be What You Get: UCB Bandit Algorithms Robust to ε-Contamination

no code implementations • 12 Oct 2019 • Laura Niss, Ambuj Tewari

We define the $\varepsilon$-contaminated stochastic bandit problem and use our robust mean estimators to give two variants of a robust Upper Confidence Bound (UCB) algorithm, crUCB.

Sample Complexity of Reinforcement Learning using Linearly Combined Model Ensembles

no code implementations • 23 Oct 2019 • Aditya Modi, Nan Jiang, Ambuj Tewari, Satinder Singh

As an extension, we also consider the more challenging problem of model selection, where the state features are unknown and can be chosen from a large candidate set.

Model Selection reinforcement-learning +1

Online Boosting for Multilabel Ranking with Top-k Feedback

no code implementations • 24 Oct 2019 • Vinod Raman, Daniel T. Zhang, Young Hun Jung, Ambuj Tewari

We present online boosting algorithms for multilabel ranking with top-k feedback, where the learner only receives information about the top k items from the ranking it provides.

Randomized Exploration for Non-Stationary Stochastic Linear Bandits

2 code implementations • 11 Dec 2019 • Baekjin Kim, Ambuj Tewari

We investigate two perturbation approaches to overcome conservatism that optimism based algorithms chronically suffer from in practice.

Computational Efficiency Thompson Sampling

Reinforcement Learning in Factored MDPs: Oracle-Efficient Algorithms and Tighter Regret Bounds for the Non-Episodic Setting

no code implementations • NeurIPS 2020 • Ziping Xu, Ambuj Tewari

We study reinforcement learning in non-episodic factored Markov decision processes (FMDPs).

reinforcement-learning Reinforcement Learning (RL)

On Learnability under General Stochastic Processes

no code implementations • 15 May 2020 • A. Philip Dawid, Ambuj Tewari

Statistical learning theory under independent and identically distributed (iid) sampling and online learning theory for worst case individual sequences are two of the best developed branches of learning theory.

Binary Classification Learning Theory +1

On the Equivalence between Online and Private Learnability beyond Binary Classification

no code implementations • NeurIPS 2020 • Young Hun Jung, Baekjin Kim, Ambuj Tewari

First, we show that private learnability implies online learnability in both settings.

Binary Classification Classification +3

Low-Rank Generalized Linear Bandit Problems

no code implementations • 4 Jun 2020 • Yangyi Lu, Amirhossein Meisami, Ambuj Tewari

To get around the computational intractability of covering based approaches, we propose an efficient algorithm by extending the "Explore-Subspace-Then-Refine" algorithm of Jun et al. (2019).

TorsionNet: A Reinforcement Learning Approach to Sequential Conformer Search

2 code implementations • NeurIPS 2020 • Tarun Gogineni, Ziping Xu, Exequiel Punzalan, Runxuan Jiang, Joshua Kammeraad, Ambuj Tewari, Paul Zimmerman

Molecular geometry prediction of flexible molecules, or conformer search, is a long-standing challenge in computational chemistry.

reinforcement-learning Reinforcement Learning (RL)

Federated Learning via Synthetic Data

no code implementations • 11 Aug 2020 • Jack Goetz, Ambuj Tewari

Federated learning allows for the training of a model using data on multiple clients without the clients transmitting that raw data.

Federated Learning

Decision Making Problems with Funnel Structure: A Multi-Task Learning Approach with Application to Email Marketing Campaigns

no code implementations • 15 Oct 2020 • Ziping Xu, Amirhossein Meisami, Ambuj Tewari

We analyze both the prediction error and the regret of our algorithms.

Decision Making Marketing +1

Causal Markov Decision Processes: Learning Good Interventions Efficiently

no code implementations • 15 Feb 2021 • Yangyi Lu, Amirhossein Meisami, Ambuj Tewari

We introduce causal Markov Decision Processes (C-MDPs), a new formalism for sequential decision making which combines the standard MDP formulation with causal structures over state transition and reward functions.

Decision Making Marketing

Representation Learning Beyond Linear Prediction Functions

no code implementations • NeurIPS 2021 • Ziping Xu, Ambuj Tewari

This motivates us to ask whether diversity can be achieved when source tasks and the target task use different prediction function spaces beyond linear functions.

Representation Learning

Causal Bandits with Unknown Graph Structure

no code implementations • NeurIPS 2021 • Yangyi Lu, Amirhossein Meisami, Ambuj Tewari

In causal bandit problems, the action set consists of interventions on variables of a causal graph.

Weighted Gaussian Process Bandits for Non-stationary Environments

no code implementations • 6 Jul 2021 • Yuntian Deng, Xingyu Zhou, Baekjin Kim, Ambuj Tewari, Abhishek Gupta, Ness Shroff

To this end, we develop WGP-UCB, a novel UCB-type algorithm based on weighted Gaussian process regression.

regression

Bandit Algorithms for Precision Medicine

no code implementations • 10 Aug 2021 • Yangyi Lu, Ziping Xu, Ambuj Tewari

However, the modern precision medicine movement has been enabled by a confluence of events: scientific advances in fields such as genetics and pharmacology, technological advances in mobile devices and wearable sensors, and methodological advances in computing and data sciences.

Learning in Online MDPs: Is there a Price for Handling the Communicating Case?

no code implementations • 3 Nov 2021 • Gautam Chandrasekaran, Ambuj Tewari

In contrast, it has been shown that handling online MDPs with communicating structure and bandit information incurs $\Omega(T^{2/3})$ regret even in the case of deterministic transitions.

On the Statistical Benefits of Curriculum Learning

no code implementations • 13 Nov 2021 • Ziping Xu, Ambuj Tewari

For both settings, we derive the minimax rates for CL with the oracle that provides the optimal curriculum and without the oracle, where the agent has to adaptively learn a good curriculum.

Balancing Adaptability and Non-exploitability in Repeated Games

1 code implementation • 20 Dec 2021 • Anthony DiGiovanni, Ambuj Tewari

We study the problem of guaranteeing low regret in repeated games against an opponent with unknown membership in one of several classes.

Joint Learning of Linear Time-Invariant Dynamical Systems

no code implementations • 21 Dec 2021 • Aditya Modi, Mohamad Kazem Shirani Faradonbeh, Ambuj Tewari, George Michailidis

Linear time-invariant systems are very popular models in system theory and applications.

Achieving Representative Data via Convex Hull Feasibility Sampling Algorithms

no code implementations • 13 Apr 2022 • Laura Niss, Yuekai Sun, Ambuj Tewari

Sampling biases in training data are a major source of algorithmic biases in machine learning systems.

An Optimization-based Algorithm for Non-stationary Kernel Bandits without Prior Knowledge

no code implementations • 29 May 2022 • Kihyuk Hong, Yuhang Li, Ambuj Tewari

Moreover, when applied to the non-stationary linear bandit setting by using a linear kernel, our algorithm is nearly minimax optimal, solving an open problem in the non-stationary linear bandit literature.

Adaptive Sampling for Discovery

no code implementations • 30 May 2022 • Ziping Xu, Eunjae Shim, Ambuj Tewari, Paul Zimmerman

Starting with a large unlabeled dataset, algorithms for ASD adaptively label the points with the goal of maximizing the sum of responses.

Decision Making Drug Discovery

Online Agnostic Multiclass Boosting

1 code implementation • 30 May 2022 • Vinod Raman, Ambuj Tewari

In this way, boosting algorithms convert weak learners into strong ones.

Binary Classification

Thompson Sampling for High-Dimensional Sparse Linear Contextual Bandits

1 code implementation • 11 Nov 2022 • Sunrit Chakraborty, Saptarshi Roy, Ambuj Tewari

We consider the stochastic linear contextual bandit problem with high-dimensional features.

Multi-Armed Bandits Thompson Sampling +2

Learning Mixtures of Markov Chains and MDPs

1 code implementation • 17 Nov 2022 • Chinmaya Kausik, Kevin Tan, Ambuj Tewari

We present an algorithm for learning mixtures of Markov chains and Markov decision processes (MDPs) from short unlabeled trajectories.

RL Boltzmann Generators for Conformer Generation in Data-Sparse Environments

1 code implementation • 19 Nov 2022 • Yash Patel, Ambuj Tewari

The generation of conformers has been a long-standing interest to structural chemists and biologists alike.

Offline Policy Evaluation and Optimization under Confounding

no code implementations • 29 Nov 2022 • Chinmaya Kausik, Yangyi Lu, Kevin Tan, Maggie Makar, Yixin Wang, Ambuj Tewari

Evaluating and optimizing policies in the presence of unobserved confounders is a problem of growing interest in offline reinforcement learning.

Offline RL Off-policy evaluation

A Characterization of Multioutput Learnability

no code implementations • 6 Jan 2023 • Vinod Raman, Unique Subedi, Ambuj Tewari

This provides a complete characterization of the learnability of multilabel classification and multioutput regression in both batch and online settings.

regression

Understanding Best Subset Selection: A Tale of Two C(omplex)ities

no code implementations • 16 Jan 2023 • Saptarshi Roy, Ambuj Tewari, Ziwei Zhu

Furthermore, we show that a margin condition depending on a similar margin quantity and complexity measures is also necessary for model consistency of BSS.

Model Selection Variable Selection +1

An Asymptotically Optimal Algorithm for the Convex Hull Membership Problem

no code implementations • 3 Feb 2023 • Gang Qiao, Ambuj Tewari

This work studies the pure-exploration setting for the convex hull membership (CHM) problem where one aims to efficiently and accurately determine if a given point lies in the convex hull of means of a finite set of distributions.

Quantum Learning Theory Beyond Batch Binary Classification

no code implementations • 15 Feb 2023 • Preetham Mohan, Ambuj Tewari

Arunachalam and de Wolf (2018) showed that the sample complexity of quantum batch learning of boolean functions, in the realizable and agnostic settings, has the same form and order as the corresponding classical sample complexities.

Binary Classification Classification +1

Multiclass Online Learning and Uniform Convergence

no code implementations • 30 Mar 2023 • Steve Hanneke, Shay Moran, Vinod Raman, Unique Subedi, Ambuj Tewari

We argue that the best expert has regret at most the Littlestone dimension relative to the best concept in the class.

Binary Classification

Amortized Variational Inference with Coverage Guarantees

no code implementations • 23 May 2023 • Yash Patel, Declan McNamara, Jackson Loper, Jeffrey Regier, Ambuj Tewari

We prove lower bounds on the predictive efficiency of the regions produced by CANVI and explore how the quality of a posterior approximation relates to the predictive efficiency of prediction regions based on that approximation.

Variational Inference

Online Learning with Set-Valued Feedback

no code implementations • 9 Jun 2023 • Vinod Raman, Unique Subedi, Ambuj Tewari

We study a variant of online multiclass classification where the learner predicts a single label but receives a set of labels as feedback.

A Primal-Dual-Critic Algorithm for Offline Constrained Reinforcement Learning

no code implementations • 13 Jun 2023 • Kihyuk Hong, Yuhang Li, Ambuj Tewari

Offline constrained reinforcement learning (RL) aims to learn a policy that maximizes the expected cumulative reward subject to constraints on expected cumulative cost using an existing dataset.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

A Combinatorial Characterization of Supervised Online Learnability

no code implementations • 7 Jul 2023 • Vinod Raman, Unique Subedi, Ambuj Tewari

We study the online learnability of hypothesis classes with respect to arbitrary, but bounded loss functions.

Learning Theory regression

Paper
Add Code

Multiclass Online Learnability under Bandit Feedback

no code implementations • 8 Aug 2023 • Ananth Raman, Vinod Raman, Unique Subedi, Idan Mehalel, Ambuj Tewari

We study online multiclass classification under bandit feedback.

Paper
Add Code

On the Minimax Regret in Online Ranking with Top-k Feedback

no code implementations • 5 Sep 2023 • Mingyuan Zhang, Ambuj Tewari

In online ranking, a learning algorithm sequentially ranks a set of items and receives feedback on its ranking in the form of relevance scores.

Paper
Add Code

Online Infinite-Dimensional Regression: Learning Linear Operators

no code implementations • 8 Sep 2023 • Vinod Raman, Unique Subedi, Ambuj Tewari

Finally, we prove that the impossibility result and the separation between uniform convergence and learnability also hold in the batch setting.

regression

Paper
Add Code

On the Computational Complexity of Private High-dimensional Model Selection

1 code implementation • 11 Oct 2023 • Saptarshi Roy, Zehua Wang, Ambuj Tewari

We consider the problem of model selection in a high-dimensional sparse linear regression model under privacy constraints.

Model Selection regression

Paper
Code

Conformal Contextual Robust Optimization

no code implementations • 16 Oct 2023 • Yash Patel, Sahana Rayan, Ambuj Tewari

Data-driven approaches to predict-then-optimize decision-making problems seek to mitigate the risk of uncertainty region misspecification in safety-critical settings.

Conformal Prediction Decision Making

Paper
Add Code

Sequence Length Independent Norm-Based Generalization Bounds for Transformers

1 code implementation • 19 Oct 2023 • Jacob Trauger, Ambuj Tewari

This paper provides norm-based generalization bounds for the Transformer architecture that do not depend on the input sequence length.

Generalization Bounds

Paper
Code

Apple Tasting: Combinatorial Dimensions and Minimax Rates

no code implementations • 29 Oct 2023 • Vinod Raman, Unique Subedi, Ananth Raman, Ambuj Tewari

In particular, we show that in the realizable setting, the expected number of mistakes of any learner, under apple tasting feedback, can be $\Theta(1)$, $\Theta(\sqrt{T})$, or $\Theta(T)$.

Binary Classification

Paper
Add Code

A Framework for Partially Observed Reward-States in RLHF

no code implementations • 5 Feb 2024 • Chinmaya Kausik, Mirco Mutti, Aldo Pacchiano, Ambuj Tewari

We show reductions from the two dominant forms of human feedback in RLHF, cardinal and dueling feedback, to PORRL.

reinforcement-learning

Paper
Add Code

A Primal-Dual Algorithm for Offline Constrained Reinforcement Learning with Low-Rank MDPs

no code implementations • 7 Feb 2024 • Kihyuk Hong, Ambuj Tewari

Our algorithm is the first computationally efficient algorithm in this setting that achieves a sample complexity of $O(\epsilon^{-2})$ under a partial data coverage assumption.

Offline RL Reinforcement Learning (RL)

Paper
Add Code

The Complexity of Sequential Prediction in Dynamical Systems

no code implementations • 9 Feb 2024 • Vinod Raman, Unique Subedi, Ambuj Tewari

We study the problem of learning to predict the next state of a dynamical system when the underlying evolution function is unknown.

Learning Theory

Paper
Add Code

Optimal Thresholding Linear Bandit

no code implementations • 11 Feb 2024 • Eduardo Ochoa Rivera, Ambuj Tewari

We study a novel pure exploration problem: the $\epsilon$-Thresholding Bandit Problem (TBP) with fixed confidence in stochastic linear bandits.

Paper
Add Code

Sample Efficient Myopic Exploration Through Multitask Reinforcement Learning with Diverse Tasks

no code implementations • 3 Mar 2024 • Ziping Xu, Zifan Xu, Runxuan Jiang, Peter Stone, Ambuj Tewari

Multitask Reinforcement Learning (MTRL) approaches have gained increasing attention for their wide applications in many important Reinforcement Learning (RL) tasks.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code
