Search Results for author: Alekh Agarwal

Found 82 papers, 22 papers with code

Stochastic Gradient Succeeds for Bandits

no code implementations 27 Feb 2024 Jincheng Mei, Zixin Zhong, Bo Dai, Alekh Agarwal, Csaba Szepesvari, Dale Schuurmans

We show that the stochastic gradient bandit algorithm converges to a globally optimal policy at an $O(1/t)$ rate, even with a constant step size.
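
For intuition, here is a minimal sketch of a softmax (stochastic-gradient) bandit run with a constant step size on a toy Bernoulli problem; the arm means, step size, and horizon are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Minimal sketch of a softmax (stochastic-gradient) bandit with a constant step size.
# Arm means, step size, and horizon are illustrative assumptions.
rng = np.random.default_rng(0)
means = np.array([0.3, 0.5, 0.8])       # hypothetical Bernoulli arm means
theta = np.zeros_like(means)            # softmax logits
eta = 0.5                               # constant step size

for t in range(10_000):
    pi = np.exp(theta - theta.max())
    pi /= pi.sum()                      # softmax policy over arms
    a = rng.choice(len(means), p=pi)    # sample an arm
    r = rng.binomial(1, means[a])       # bandit feedback for the chosen arm only
    grad = r * ((np.arange(len(means)) == a) - pi)  # unbiased REINFORCE-style gradient
    theta += eta * grad                 # plain stochastic gradient ascent

pi = np.exp(theta - theta.max()); pi /= pi.sum()
print("learned policy:", np.round(pi, 3))  # mass should concentrate on the best arm
```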

More Benefits of Being Distributional: Second-Order Bounds for Reinforcement Learning

no code implementations 11 Feb 2024 Kaiwen Wang, Owen Oertell, Alekh Agarwal, Nathan Kallus, Wen Sun

Second-order bounds are instance-dependent bounds that scale with the variance of the return, which we prove are tighter than the previously known small-loss bounds of distributional RL.

Distributional Reinforcement Learning Multi-Armed Bandits +1

A Minimaximalist Approach to Reinforcement Learning from Human Feedback

no code implementations 8 Jan 2024 Gokul Swamy, Christoph Dann, Rahul Kidambi, Zhiwei Steven Wu, Alekh Agarwal

Our approach is maximalist in that it provably handles non-Markovian, intransitive, and stochastic preferences while being robust to the compounding errors that plague offline approaches to sequential prediction.

Continuous Control reinforcement-learning

Theoretical guarantees on the best-of-n alignment policy

no code implementations 3 Jan 2024 Ahmad Beirami, Alekh Agarwal, Jonathan Berant, Alexander D'Amour, Jacob Eisenstein, Chirag Nagpal, Ananda Theertha Suresh

A commonly used analytical expression in the literature claims that the KL divergence between the best-of-$n$ policy and the base policy is equal to $\log(n) - (n-1)/n$. We disprove this claim and show that the expression is instead an upper bound on the actual KL divergence.
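
Written out, the abstract's conclusion is that the commonly used expression upper-bounds, rather than equals, the divergence; the symbols $\pi_{\mathrm{BoN}}$ (best-of-$n$ policy) and $\pi_{\mathrm{ref}}$ (base policy) are chosen here for illustration.

```latex
\mathrm{KL}\!\left(\pi_{\mathrm{BoN}} \,\middle\|\, \pi_{\mathrm{ref}}\right) \;\le\; \log(n) - \frac{n-1}{n}
```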

Helping or Herding? Reward Model Ensembles Mitigate but do not Eliminate Reward Hacking

no code implementations 14 Dec 2023 Jacob Eisenstein, Chirag Nagpal, Alekh Agarwal, Ahmad Beirami, Alex D'Amour, DJ Dvijotham, Adam Fisch, Katherine Heller, Stephen Pfohl, Deepak Ramachandran, Peter Shaw, Jonathan Berant

However, even pretrain reward ensembles do not eliminate reward hacking: we show several qualitative reward hacking phenomena that are not mitigated by ensembling because all reward models in the ensemble exhibit similar error patterns.

Language Modelling

A Mechanism for Sample-Efficient In-Context Learning for Sparse Retrieval Tasks

no code implementations 26 May 2023 Jacob Abernethy, Alekh Agarwal, Teodor V. Marinov, Manfred K. Warmuth

We study the phenomenon of in-context learning (ICL) exhibited by large language models, where they can adapt to a new learning task, given a handful of labeled examples, without any explicit parameter optimization.

In-Context Learning Retrieval

An Empirical Evaluation of Federated Contextual Bandit Algorithms

1 code implementation 17 Mar 2023 Alekh Agarwal, H. Brendan McMahan, Zheng Xu

As the adoption of federated learning increases for learning from sensitive data local to user devices, it is natural to ask if the learning can be done using implicit signals generated as users interact with the applications of interest, rather than requiring access to explicit labels which can be difficult to acquire in many tasks.

Federated Learning Multi-Armed Bandits

Leveraging User-Triggered Supervision in Contextual Bandits

no code implementations 7 Feb 2023 Alekh Agarwal, Claudio Gentile, Teodor V. Marinov

We study contextual bandit (CB) problems, where the user can sometimes respond with the best action in a given context.

Multi-Armed Bandits

Learning in POMDPs is Sample-Efficient with Hindsight Observability

no code implementations 31 Jan 2023 Jonathan N. Lee, Alekh Agarwal, Christoph Dann, Tong Zhang

POMDPs capture a broad class of decision making problems, but hardness results suggest that learning is intractable even in simple settings due to the inherent partial observability.

Decision Making Scheduling

VO$Q$L: Towards Optimal Regret in Model-free RL with Nonlinear Function Approximation

no code implementations 12 Dec 2022 Alekh Agarwal, Yujia Jin, Tong Zhang

We study time-inhomogeneous episodic reinforcement learning (RL) under general function approximation and sparse rewards.

Q-Learning regression +1

On the Statistical Efficiency of Reward-Free Exploration in Non-Linear RL

no code implementations 21 Jun 2022 Jinglin Chen, Aditya Modi, Akshay Krishnamurthy, Nan Jiang, Alekh Agarwal

We study reward-free reinforcement learning (RL) under general non-linear function approximation, and establish sample efficiency and hardness results under various standard structural assumptions.

Reinforcement Learning (RL)

Model-based RL with Optimistic Posterior Sampling: Structural Conditions and Sample Complexity

no code implementations 15 Jun 2022 Alekh Agarwal, Tong Zhang

We propose a general framework to design posterior sampling methods for model-based RL.

Provable Benefits of Representational Transfer in Reinforcement Learning

1 code implementation 29 May 2022 Alekh Agarwal, Yuda Song, Wen Sun, Kaiwen Wang, Mengdi Wang, Xuezhou Zhang

We study the problem of representational transfer in RL, where an agent first pretrains in a number of source tasks to discover a shared representation, which is subsequently used to learn a good policy in a target task.

reinforcement-learning Reinforcement Learning (RL) +1

Non-Linear Reinforcement Learning in Large Action Spaces: Structural Conditions and Sample-efficiency of Posterior Sampling

no code implementations 15 Mar 2022 Alekh Agarwal, Tong Zhang

Provably sample-efficient Reinforcement Learning (RL) with rich observations and function approximation has witnessed tremendous recent progress, particularly when the underlying function approximators are linear.

Reinforcement Learning (RL)

Minimax Regret Optimization for Robust Machine Learning under Distribution Shift

no code implementations 11 Feb 2022 Alekh Agarwal, Tong Zhang

We instead propose an alternative method called Minimax Regret Optimization (MRO), and show that under suitable conditions this method achieves uniformly low regret across all test distributions.
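
A schematic form of a minimax regret objective consistent with this description is shown below; the notation (test-distribution family $\mathcal{P}$, loss $\ell$, model parameters $\theta$) is assumed rather than taken from the paper.

```latex
\min_{\theta}\; \max_{P \in \mathcal{P}} \;\Big( \mathbb{E}_{z \sim P}\big[\ell(\theta; z)\big] \;-\; \min_{\theta'} \mathbb{E}_{z \sim P}\big[\ell(\theta'; z)\big] \Big)
```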

BIG-bench Machine Learning Learning Theory

Adversarially Trained Actor Critic for Offline Reinforcement Learning

3 code implementations 5 Feb 2022 Ching-An Cheng, Tengyang Xie, Nan Jiang, Alekh Agarwal

We propose Adversarially Trained Actor Critic (ATAC), a new model-free algorithm for offline reinforcement learning (RL) under insufficient data coverage, based on the concept of relative pessimism.
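
One schematic way to write a relative-pessimism objective of this kind is sketched below, with $\mu$ the offline data distribution and $\mathcal{F}_{\mathrm{cons}}$ a set of (approximately) Bellman-consistent critics; the notation is assumed, not quoted from the paper.

```latex
\hat{\pi} \;\in\; \arg\max_{\pi}\; \min_{f \in \mathcal{F}_{\mathrm{cons}}}\; \mathbb{E}_{(s,a) \sim \mu}\big[ f(s, \pi(s)) - f(s, a) \big]
```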

Continuous Control D4RL +3

Efficient Reinforcement Learning in Block MDPs: A Model-free Representation Learning Approach

1 code implementation 31 Jan 2022 Xuezhou Zhang, Yuda Song, Masatoshi Uehara, Mengdi Wang, Alekh Agarwal, Wen Sun

We present BRIEE (Block-structured Representation learning with Interleaved Explore Exploit), an algorithm for efficient reinforcement learning in Markov Decision Processes with block-structured dynamics (i.e., Block MDPs), where rich observations are generated from a set of unknown latent states.

reinforcement-learning Reinforcement Learning (RL) +1

Provable RL with Exogenous Distractors via Multistep Inverse Dynamics

no code implementations 17 Oct 2021 Yonathan Efroni, Dipendra Misra, Akshay Krishnamurthy, Alekh Agarwal, John Langford

We initiate the formal study of latent state discovery in the presence of such exogenous noise sources by proposing a new model, the Exogenous Block MDP (EX-BMDP), for rich observation RL.

Reinforcement Learning (RL) Representation Learning

Provably Filtering Exogenous Distractors using Multistep Inverse Dynamics

no code implementations ICLR 2022 Yonathan Efroni, Dipendra Misra, Akshay Krishnamurthy, Alekh Agarwal, John Langford

We initiate the formal study of latent state discovery in the presence of such exogenous noise sources by proposing a new model, the Exogenous Block MDP (EX-BMDP), for rich observation RL.

Reinforcement Learning (RL) Representation Learning

Bellman-consistent Pessimism for Offline Reinforcement Learning

no code implementations NeurIPS 2021 Tengyang Xie, Ching-An Cheng, Nan Jiang, Paul Mineiro, Alekh Agarwal

The use of pessimism, when reasoning about datasets lacking exhaustive exploration, has recently gained prominence in offline reinforcement learning.

reinforcement-learning Reinforcement Learning (RL)

Cautiously Optimistic Policy Optimization and Exploration with Linear Function Approximation

no code implementations 24 Mar 2021 Andrea Zanette, Ching-An Cheng, Alekh Agarwal

Policy optimization methods are popular reinforcement learning algorithms because their incremental and on-policy nature makes them more stable than their value-based counterparts.

reinforcement-learning Reinforcement Learning (RL)

Provably Correct Optimization and Exploration with Non-linear Policies

1 code implementation 22 Mar 2021 Fei Feng, Wotao Yin, Alekh Agarwal, Lin F. Yang

Policy optimization methods remain a powerful workhorse in empirical Reinforcement Learning (RL), with a focus on neural policies that can easily reason over complex and continuous state and/or action spaces.

Reinforcement Learning (RL)

Towards a Dimension-Free Understanding of Adaptive Linear Control

no code implementations 19 Mar 2021 Juan C. Perdomo, Max Simchowitz, Alekh Agarwal, Peter Bartlett

We study the problem of adaptive control of the linear quadratic regulator for systems in very high, or even infinite dimension.

Provably Good Batch Off-Policy Reinforcement Learning Without Great Exploration

no code implementations NeurIPS 2020 Yao Liu, Adith Swaminathan, Alekh Agarwal, Emma Brunskill

Doing batch RL in a way that yields a reliable new policy in large domains is challenging: a new decision policy may visit states and actions outside the support of the batch data, and function approximation and optimization with limited samples can further increase the potential of learning policies with overly optimistic estimates of their future performance.

reinforcement-learning Reinforcement Learning (RL)

Provably Good Batch Reinforcement Learning Without Great Exploration

1 code implementation 16 Jul 2020 Yao Liu, Adith Swaminathan, Alekh Agarwal, Emma Brunskill

Doing batch RL in a way that yields a reliable new policy in large domains is challenging: a new decision policy may visit states and actions outside the support of the batch data, and function approximation and optimization with limited samples can further increase the potential of learning policies with overly optimistic estimates of their future performance.

reinforcement-learning Reinforcement Learning (RL)

PC-PG: Policy Cover Directed Exploration for Provable Policy Gradient Learning

1 code implementation NeurIPS 2020 Alekh Agarwal, Mikael Henaff, Sham Kakade, Wen Sun

Direct policy gradient methods for reinforcement learning are a successful approach for a variety of reasons: they are model free, they directly optimize the performance metric of interest, and they allow for richly parameterized policies.

Policy Gradient Methods Q-Learning

Policy Improvement via Imitation of Multiple Oracles

no code implementations NeurIPS 2020 Ching-An Cheng, Andrey Kolobov, Alekh Agarwal

In this paper, we propose the state-wise maximum of the oracle policies' values as a natural baseline to resolve conflicting advice from multiple oracles.
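
In symbols, with $V^{\pi_k}$ denoting the value function of the $k$-th oracle policy (notation assumed), the proposed baseline is the state-wise maximum:

```latex
f^{\max}(s) \;=\; \max_{k \in \{1, \dots, K\}} V^{\pi_k}(s)
```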

Imitation Learning

Optimizing Interactive Systems via Data-Driven Objectives

no code implementations 19 Jun 2020 Ziming Li, Julia Kiseleva, Alekh Agarwal, Maarten de Rijke, Ryen W. White

Effective optimization is essential for real-world interactive systems to provide a satisfactory user experience in response to changing user behavior.

FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs

no code implementations NeurIPS 2020 Alekh Agarwal, Sham Kakade, Akshay Krishnamurthy, Wen Sun

In order to deal with the curse of dimensionality in reinforcement learning (RL), it is common practice to make parametric assumptions where values or policies are functions of some low dimensional feature space.

reinforcement-learning Reinforcement Learning (RL) +1

Reparameterized Variational Divergence Minimization for Stable Imitation

no code implementations 18 Jun 2020 Dilip Arumugam, Debadeepta Dey, Alekh Agarwal, Asli Celikyilmaz, Elnaz Nouri, Bill Dolan

While recent state-of-the-art results for adversarial imitation-learning algorithms are encouraging, works exploring the imitation learning from observation (ILO) setting, where trajectories contain only expert observations, have not met with the same success.

Continuous Control Imitation Learning

Federated Residual Learning

no code implementations 28 Mar 2020 Alekh Agarwal, John Langford, Chen-Yu Wei

We study a new form of federated learning where the clients train personalized local models and make predictions jointly with the server-side shared model.

Federated Learning

Taking a hint: How to leverage loss predictors in contextual bandits?

no code implementations 4 Mar 2020 Chen-Yu Wei, Haipeng Luo, Alekh Agarwal

We initiate the study of learning in contextual bandits with the help of loss predictors.

Multi-Armed Bandits

On the Theory of Policy Gradient Methods: Optimality, Approximation, and Distribution Shift

no code implementations 1 Aug 2019 Alekh Agarwal, Sham M. Kakade, Jason D. Lee, Gaurav Mahajan

Policy gradient methods are among the most effective methods in challenging reinforcement learning problems with large state and/or action spaces.

Policy Gradient Methods

Bias Correction of Learned Generative Models using Likelihood-Free Importance Weighting

2 code implementations NeurIPS 2019 Aditya Grover, Jiaming Song, Alekh Agarwal, Kenneth Tran, Ashish Kapoor, Eric Horvitz, Stefano Ermon

A standard technique to correct this bias is importance sampling, where samples from the model are weighted by the likelihood ratio under the model and true distributions.
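
As a rough illustration of likelihood-free importance weighting, the sketch below uses the standard density-ratio trick of training a probabilistic classifier to separate real from generated samples; the data, the logistic-regression ratio estimator, and the target statistic are synthetic stand-ins, not the paper's setup.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Sketch: estimate the density ratio p_true(x)/p_model(x) with a binary classifier,
# then importance-weight model samples. Everything below is synthetic and illustrative.
rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(5000, 1))     # stand-in for true data
fake = rng.normal(0.5, 1.2, size=(5000, 1))     # stand-in for generator samples

X = np.vstack([real, fake])
y = np.concatenate([np.ones(len(real)), np.zeros(len(fake))])  # 1 = real, 0 = model

clf = LogisticRegression().fit(X, y)
p_real = clf.predict_proba(fake)[:, 1]
weights = p_real / (1.0 - p_real)               # estimated likelihood ratio at model samples

g = fake[:, 0] ** 2                             # some statistic of interest
print("unweighted:", g.mean(), "weighted:", np.average(g, weights=weights))
```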

Data Augmentation

Model-Based Reinforcement Learning with a Generative Model is Minimax Optimal

no code implementations 10 Jun 2019 Alekh Agarwal, Sham Kakade, Lin F. Yang

In this work, we study the effectiveness of the most natural plug-in approach to model-based planning: we build the maximum likelihood estimate of the transition model in the MDP from observations and then find an optimal policy in this empirical MDP.
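
The plug-in recipe described here (a maximum-likelihood transition estimate followed by planning in the empirical MDP) can be sketched on a toy tabular problem; the sizes, reward table, and per-state sample budget below are illustrative assumptions.

```python
import numpy as np

# Sketch: build the MLE transition model from generative-model samples, then run
# value iteration in the empirical MDP. All sizes and tables are illustrative.
rng = np.random.default_rng(0)
S, A, gamma, n_samples = 5, 2, 0.9, 200

P_true = rng.dirichlet(np.ones(S), size=(S, A))   # hypothetical true dynamics
R = rng.uniform(size=(S, A))                      # known reward table, for simplicity

counts = np.zeros((S, A, S))
for s in range(S):
    for a in range(A):
        nxt = rng.choice(S, size=n_samples, p=P_true[s, a])
        counts[s, a] = np.bincount(nxt, minlength=S)
P_hat = counts / counts.sum(axis=-1, keepdims=True)  # maximum-likelihood estimate

V = np.zeros(S)
for _ in range(500):                              # value iteration in the empirical MDP
    Q = R + gamma * P_hat @ V
    V = Q.max(axis=1)
print("greedy policy in the empirical MDP:", Q.argmax(axis=1))
```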

Model-based Reinforcement Learning reinforcement-learning +1

Fair Regression: Quantitative Definitions and Reduction-based Algorithms

4 code implementations 30 May 2019 Alekh Agarwal, Miroslav Dudík, Zhiwei Steven Wu

Our schemes only require access to standard risk minimization algorithms (such as standard classification or least-squares regression) while providing theoretical guarantees on the optimality and fairness of the obtained solutions.

Attribute Fairness +1

Metareasoning in Modular Software Systems: On-the-Fly Configuration using Reinforcement Learning with Rich Contextual Representations

no code implementations 12 May 2019 Aditya Modi, Debadeepta Dey, Alekh Agarwal, Adith Swaminathan, Besmira Nushi, Sean Andrist, Eric Horvitz

We address the opportunity to maximize the utility of an overall computing system by employing reinforcement learning to guide the configuration of the set of interacting modules that comprise the system.

Decision Making reinforcement-learning +1

Off-Policy Policy Gradient with State Distribution Correction

no code implementations 17 Apr 2019 Yao Liu, Adith Swaminathan, Alekh Agarwal, Emma Brunskill

We study the problem of off-policy policy optimization in Markov decision processes, and develop a novel off-policy policy gradient method.

Bias Correction of Learned Generative Models via Likelihood-free Importance Weighting

no code implementations ICLR Workshop DeepGenStruct 2019 Aditya Grover, Jiaming Song, Ashish Kapoor, Kenneth Tran, Alekh Agarwal, Eric Horvitz, Stefano Ermon

A standard technique to correct this bias is to importance-weight samples from the model by the likelihood ratio under the model and true distributions.

Data Augmentation

Provably efficient RL with Rich Observations via Latent State Decoding

1 code implementation 25 Jan 2019 Simon S. Du, Akshay Krishnamurthy, Nan Jiang, Alekh Agarwal, Miroslav Dudík, John Langford

We study the exploration problem in episodic MDPs with rich observations generated from a small number of latent states.

Clustering Q-Learning +1

Warm-starting Contextual Bandits: Robustly Combining Supervised and Bandit Feedback

1 code implementation 2 Jan 2019 Chicheng Zhang, Alekh Agarwal, Hal Daumé III, John Langford, Sahand N. Negahban

We investigate the feasibility of learning from a mix of both fully-labeled supervised data and contextual bandit data.

Multi-Armed Bandits

Model-based RL in Contextual Decision Processes: PAC bounds and Exponential Improvements over Model-free Approaches

no code implementations 21 Nov 2018 Wen Sun, Nan Jiang, Akshay Krishnamurthy, Alekh Agarwal, John Langford

We study the sample complexity of model-based reinforcement learning (henceforth RL) in general contextual decision processes that require strategic exploration to find a near-optimal policy.

Model-based Reinforcement Learning

Practical Contextual Bandits with Regression Oracles

no code implementations ICML 2018 Dylan J. Foster, Alekh Agarwal, Miroslav Dudík, Haipeng Luo, Robert E. Schapire

A major challenge in contextual bandits is to design general-purpose algorithms that are both practically useful and theoretically well-founded.

General Classification Multi-Armed Bandits +1

Learning Data-Driven Objectives to Optimize Interactive Systems

no code implementations 17 Feb 2018 Ziming Li, Julia Kiseleva, Alekh Agarwal, Maarten de Rijke

Effective optimization is essential for interactive systems to provide a satisfactory user experience.

A Contextual Bandit Bake-off

1 code implementation 12 Feb 2018 Alberto Bietti, Alekh Agarwal, John Langford

Contextual bandit algorithms are essential for solving many real-world interactive machine learning problems.

Efficient Contextual Bandits in Non-stationary Worlds

no code implementations 5 Aug 2017 Haipeng Luo, Chen-Yu Wei, Alekh Agarwal, John Langford

In this work, we develop several efficient contextual bandit algorithms for non-stationary environments by equipping existing methods for i.i.d.

Multi-Armed Bandits

Active Learning for Cost-Sensitive Classification

no code implementations ICML 2017 Akshay Krishnamurthy, Alekh Agarwal, Tzu-Kuo Huang, Hal Daume III, John Langford

We design an active learning algorithm for cost-sensitive multiclass classification: problems where different errors have different costs.

Active Learning Classification +2

Corralling a Band of Bandit Algorithms

1 code implementation 19 Dec 2016 Alekh Agarwal, Haipeng Luo, Behnam Neyshabur, Robert E. Schapire

We study the problem of combining multiple bandit algorithms (that is, online learning algorithms with partial feedback) with the goal of creating a master algorithm that performs almost as well as the best base algorithm if it were to be run on its own.

Multi-Armed Bandits

Optimal and Adaptive Off-policy Evaluation in Contextual Bandits

2 code implementations ICML 2017 Yu-Xiang Wang, Alekh Agarwal, Miroslav Dudik

We study the off-policy evaluation problem (estimating the value of a target policy using data collected by another policy) under the contextual bandit model.
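
For context, the simplest member of this estimator family is inverse propensity scoring (IPS); the toy sketch below evaluates a deterministic target policy from uniformly logged data (all quantities are synthetic, and the paper's contribution is an adaptive estimator rather than this baseline).

```python
import numpy as np

# Sketch of the basic IPS off-policy value estimate for contextual bandits.
# Logging policy, target policy, and rewards below are synthetic stand-ins.
rng = np.random.default_rng(0)
n, K = 10_000, 4

logged_actions = rng.integers(K, size=n)              # uniform logging policy
propensities = np.full(n, 1.0 / K)
rewards = rng.binomial(1, 0.2 + 0.1 * logged_actions) # toy reward model
target_actions = np.full(n, K - 1)                    # deterministic target policy

ips = np.mean((logged_actions == target_actions) * rewards / propensities)
print("IPS estimate of the target policy's value:", ips)
```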

Multi-Armed Bandits Off-policy evaluation

Contextual Decision Processes with Low Bellman Rank are PAC-Learnable

no code implementations ICML 2017 Nan Jiang, Akshay Krishnamurthy, Alekh Agarwal, John Langford, Robert E. Schapire

Our first contribution is a complexity measure, the Bellman rank, that we show enables tractable learning of near-optimal behavior in these processes and is naturally small for many well-studied reinforcement learning settings.

Efficient Exploration reinforcement-learning +1

Making Contextual Decisions with Low Technical Debt

no code implementations 13 Jun 2016 Alekh Agarwal, Sarah Bird, Markus Cozowicz, Luong Hoang, John Langford, Stephen Lee, Jiaji Li, Dan Melamed, Gal Oshri, Oswaldo Ribas, Siddhartha Sen, Alex Slivkins

The Decision Service enables all aspects of contextual bandit learning using four system abstractions which connect together in a loop: explore (the decision space), log, learn, and deploy.
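
The explore / log / learn / deploy loop can be illustrated with a toy epsilon-greedy policy; the class and method names below are hypothetical and do not reflect the Decision Service API.

```python
import random

# Toy illustration of an explore -> log -> learn -> deploy loop.
# Names are hypothetical; this is not the Decision Service API.
class EpsilonGreedyPolicy:
    def __init__(self, n_actions, epsilon=0.1):
        self.n_actions, self.epsilon = n_actions, epsilon
        self.scores = [0.0] * n_actions           # stand-in for a learned scorer

    def explore(self, context):
        """Pick an action and return it with its logging probability."""
        greedy = max(range(self.n_actions), key=lambda i: self.scores[i])
        a = random.randrange(self.n_actions) if random.random() < self.epsilon else greedy
        p = (1 - self.epsilon) * (a == greedy) + self.epsilon / self.n_actions
        return a, p

policy = EpsilonGreedyPolicy(n_actions=3)
log = []                                          # the "log" abstraction: (context, a, r, p)

for t in range(100):
    context = {"user": t % 5}                     # hypothetical context features
    a, p = policy.explore(context)                # explore the decision space
    r = float(random.random() < 0.3 + 0.1 * a)    # simulated reward feedback
    log.append((context, a, r, p))                # log the interaction

for context, a, r, p in log:                      # learn: crude IPS-weighted score update
    policy.scores[a] += r / p
# deploy: the updated policy now serves the next round of decisions.
```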

Multi-Armed Bandits

Off-policy evaluation for slate recommendation

1 code implementation NeurIPS 2017 Adith Swaminathan, Akshay Krishnamurthy, Alekh Agarwal, Miroslav Dudík, John Langford, Damien Jose, Imed Zitouni

This paper studies the evaluation of policies that recommend an ordered set of items (e.g., a ranking) based on some context, a common scenario in web search, ads, and recommendation.

Learning-To-Rank Off-policy evaluation

Exploratory Gradient Boosting for Reinforcement Learning in Complex Domains

1 code implementation 14 Mar 2016 David Abel, Alekh Agarwal, Fernando Diaz, Akshay Krishnamurthy, Robert E. Schapire

We address both of these challenges with two complementary techniques: first, we develop a gradient-boosting-style, non-parametric function approximator for learning on $Q$-function residuals.

reinforcement-learning Reinforcement Learning (RL)

PAC Reinforcement Learning with Rich Observations

no code implementations NeurIPS 2016 Akshay Krishnamurthy, Alekh Agarwal, John Langford

We prove that the algorithm learns near optimal behavior after a number of episodes that is polynomial in all relevant parameters, logarithmic in the number of policies, and independent of the size of the observation space.

Decision Making Multi-Armed Bandits +2

Efficient Second Order Online Learning by Sketching

no code implementations NeurIPS 2016 Haipeng Luo, Alekh Agarwal, Nicolo Cesa-Bianchi, John Langford

We propose Sketched Online Newton (SON), an online second order learning algorithm that enjoys substantially improved regret guarantees for ill-conditioned data.

Fast Convergence of Regularized Learning in Games

no code implementations NeurIPS 2015 Vasilis Syrgkanis, Alekh Agarwal, Haipeng Luo, Robert E. Schapire

We show that natural classes of regularized learning algorithms with a form of recency bias achieve faster convergence rates to approximate efficiency and to coarse correlated equilibria in multiplayer normal form games.

Efficient and Parsimonious Agnostic Active Learning

no code implementations NeurIPS 2015 Tzu-Kuo Huang, Alekh Agarwal, Daniel J. Hsu, John Langford, Robert E. Schapire

We develop a new active learning algorithm for the streaming setting satisfying three important properties: 1) It provably works for any classifier representation and classification problem, including those with severe noise.

Active Learning General Classification

Contextual Semibandits via Supervised Learning Oracles

1 code implementation NeurIPS 2016 Akshay Krishnamurthy, Alekh Agarwal, Miroslav Dudik

We study an online decision making problem where on each round a learner chooses a list of items based on some side information, receives a scalar feedback value for each individual item, and a reward that is linearly related to this feedback.

Decision Making Learning-To-Rank

Learning to Search Better Than Your Teacher

no code implementations 8 Feb 2015 Kai-Wei Chang, Akshay Krishnamurthy, Alekh Agarwal, Hal Daumé III, John Langford

Methods for learning to search for structured prediction typically imitate a reference policy, with existing theoretical guarantees demonstrating low regret compared to that reference.

Multi-Armed Bandits Structured Prediction

A Lower Bound for the Optimization of Finite Sums

no code implementations 2 Oct 2014 Alekh Agarwal, Leon Bottou

This paper presents a lower bound for optimizing a finite sum of $n$ functions, where each function is $L$-smooth and the sum is $\mu$-strongly convex.
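
Up to normalization conventions, the problem class in question is the familiar finite-sum objective (notation assumed):

```latex
\min_{x \in \mathbb{R}^d} \; F(x) = \frac{1}{n} \sum_{i=1}^{n} f_i(x), \qquad \text{each } f_i \text{ is $L$-smooth and } F \text{ is $\mu$-strongly convex.}
```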

Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits

1 code implementation 4 Feb 2014 Alekh Agarwal, Daniel Hsu, Satyen Kale, John Langford, Lihong Li, Robert E. Schapire

We present a new algorithm for the contextual bandit learning problem, where the learner repeatedly takes one of $K$ actions in response to the observed context, and observes the reward only for that chosen action.

General Classification Multi-Armed Bandits

Learning Sparsely Used Overcomplete Dictionaries via Alternating Minimization

no code implementations 30 Oct 2013 Alekh Agarwal, Animashree Anandkumar, Prateek Jain, Praneeth Netrapalli

Alternating minimization is a popular heuristic for sparse coding, where the dictionary and the coefficients are estimated in alternate steps, keeping the other fixed.
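
A minimal numpy sketch of this alternating-minimization heuristic is given below, using a simple hard-thresholding sparse-coding step and a least-squares dictionary update; the problem sizes and the particular coder are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

# Sketch of alternating minimization for dictionary learning: alternate a sparse-coding
# step (dictionary fixed) with a least-squares dictionary update (coefficients fixed).
rng = np.random.default_rng(0)
d, K, n, s = 20, 30, 500, 3                       # dims, atoms, samples, sparsity

D_true = rng.normal(size=(d, K)); D_true /= np.linalg.norm(D_true, axis=0)
X_true = np.zeros((K, n))
for j in range(n):
    idx = rng.choice(K, size=s, replace=False)
    X_true[idx, j] = rng.normal(size=s)
Y = D_true @ X_true                               # observed samples

D = rng.normal(size=(d, K)); D /= np.linalg.norm(D, axis=0)
for _ in range(50):
    X = np.zeros((K, n))
    for j in range(n):                            # sparse coding: keep top correlations
        idx = np.argsort(-np.abs(D.T @ Y[:, j]))[:s]
        X[idx, j] = np.linalg.lstsq(D[:, idx], Y[:, j], rcond=None)[0]
    D = Y @ np.linalg.pinv(X)                     # dictionary update by least squares
    D /= np.linalg.norm(D, axis=0) + 1e-12        # renormalize atoms

print("relative reconstruction error:", np.linalg.norm(Y - D @ X) / np.linalg.norm(Y))
```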

Para-active learning

no code implementations 30 Oct 2013 Alekh Agarwal, Leon Bottou, Miroslav Dudik, John Langford

We leverage the same observation to build a generic strategy for parallelizing learning algorithms.

Active Learning

Least Squares Revisited: Scalable Approaches for Multi-class Prediction

no code implementations 7 Oct 2013 Alekh Agarwal, Sham M. Kakade, Nikos Karampatziakis, Le Song, Gregory Valiant

This work provides simple algorithms for multi-class (and multi-label) prediction in settings where both the number of examples n and the data dimension d are relatively large.

A Clustering Approach to Learn Sparsely-Used Overcomplete Dictionaries

no code implementations 8 Sep 2013 Alekh Agarwal, Animashree Anandkumar, Praneeth Netrapalli

We consider the problem of learning overcomplete dictionaries in the context of sparse coding, where each sample selects a sparse subset of dictionary elements.

Clustering regression

Stochastic optimization and sparse statistical recovery: Optimal algorithms for high dimensions

no code implementations NeurIPS 2012 Alekh Agarwal, Sahand Negahban, Martin J. Wainwright

We develop and analyze stochastic optimization algorithms for problems in which the expected loss is strongly convex, and the optimum is (approximately) sparse.

Stochastic Optimization Vocal Bursts Intensity Prediction

Stochastic convex optimization with bandit feedback

no code implementations NeurIPS 2011 Alekh Agarwal, Dean P. Foster, Daniel J. Hsu, Sham M. Kakade, Alexander Rakhlin

This paper addresses the problem of minimizing a convex, Lipschitz function $f$ over a convex, compact set $X$ under a stochastic bandit feedback model.

Distributed Delayed Stochastic Optimization

no code implementations NeurIPS 2011 Alekh Agarwal, John C. Duchi

We analyze the convergence of gradient-based optimization algorithms whose updates depend on delayed stochastic gradient information.
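
Schematically, the updates analyzed are of the delayed-gradient form below, where $\tau(t)$ is the delay, $\eta_t$ the step size, and $\Pi_{\mathcal{X}}$ projection onto the feasible set (notation assumed):

```latex
x_{t+1} \;=\; \Pi_{\mathcal{X}}\!\big( x_t - \eta_t \, g\big(x_{t - \tau(t)}\big) \big)
```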

Distributed Optimization

A Reliable Effective Terascale Linear Learning System

2 code implementations 19 Oct 2011 Alekh Agarwal, Olivier Chapelle, Miroslav Dudik, John Langford

We present a system and a set of techniques for learning linear predictors with convex losses on terascale datasets with trillions of features (here, the number of features refers to the number of non-zero entries in the data matrix).

Fast global convergence rates of gradient methods for high-dimensional statistical recovery

no code implementations NeurIPS 2010 Alekh Agarwal, Sahand Negahban, Martin J. Wainwright

Many statistical $M$-estimators are based on convex optimization problems formed by the weighted sum of a loss function with a norm-based regularizer.

Computational Efficiency regression

Distributed Dual Averaging In Networks

no code implementations NeurIPS 2010 Alekh Agarwal, Martin J. Wainwright, John C. Duchi

The goal of decentralized optimization over a network is to optimize a global objective formed by a sum of local (possibly nonsmooth) convex functions using only local computation and communication.
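
In symbols, the decentralized problem is to minimize the average of the nodes' local convex functions, each $f_i$ held privately by node $i$ (notation assumed):

```latex
\min_{x \in \mathcal{X}} \; \frac{1}{n} \sum_{i=1}^{n} f_i(x)
```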

Dual Averaging for Distributed Optimization: Convergence Analysis and Network Scaling

no code implementations 12 May 2010 John Duchi, Alekh Agarwal, Martin Wainwright

The goal of decentralized optimization over a network is to optimize a global objective formed by a sum of local (possibly nonsmooth) convex functions using only local computation and communication.

Distributed Optimization

Information-theoretic lower bounds on the oracle complexity of convex optimization

no code implementations NeurIPS 2009 Alekh Agarwal, Martin J. Wainwright, Peter L. Bartlett, Pradeep K. Ravikumar

The extensive use of convex optimization in machine learning and statistics makes such an understanding critical for characterizing the fundamental computational limits of learning and estimation.

BIG-bench Machine Learning
