Search Results for author: Akshay Krishnamurthy

Found 84 papers, 20 papers with code

Reinforcement Learning with Differential Privacy

no code implementations • ICML 2020 • Giuseppe Vietri, Borja de Balle Pigem, Steven Wu, Akshay Krishnamurthy

Motivated by high-stakes decision-making domains like personalized medicine where user information is inherently sensitive, we design privacy preserving exploration policies for episodic reinforcement learning (RL).

Decision Making Privacy Preserving +2


Can large language models explore in-context?

no code implementations • 22 Mar 2024 • Akshay Krishnamurthy, Keegan Harris, Dylan J. Foster, Cyril Zhang, Aleksandrs Slivkins

We investigate the extent to which contemporary Large Language Models (LLMs) can engage in exploration, a core capability in reinforcement learning and decision making.

Decision Making


Scalable Online Exploration via Coverability

1 code implementation • 11 Mar 2024 • Philip Amortila, Dylan J. Foster, Akshay Krishnamurthy

We propose exploration objectives -- policy optimization objectives that enable downstream maximization of any reward function -- as a conceptual framework to systematize the study of exploration.

Efficient Exploration Q-Learning +1


Mitigating Covariate Shift in Misspecified Regression with Applications to Reinforcement Learning

no code implementations • 22 Jan 2024 • Philip Amortila, Tongyi Cao, Akshay Krishnamurthy

A pervasive phenomenon in machine learning applications is distribution shift, where training and deployment conditions for a machine learning model differ.

regression reinforcement-learning


Butterfly Effects of SGD Noise: Error Amplification in Behavior Cloning and Autoregression

no code implementations • 17 Oct 2023 • Adam Block, Dylan J. Foster, Akshay Krishnamurthy, Max Simchowitz, Cyril Zhang

This work studies training instabilities of behavior cloning with deep neural networks.

Continuous Control Text Generation


Oracle-Efficient Pessimism: Offline Policy Optimization in Contextual Bandits

no code implementations • 13 Jun 2023 • Lequn Wang, Akshay Krishnamurthy, Aleksandrs Slivkins

We consider offline policy optimization (OPO) in contextual bandits, where one is given a fixed dataset of logged interactions.

Multi-Armed Bandits


Streaming Active Learning with Deep Neural Networks

2 code implementations • 5 Mar 2023 • Akanksha Saran, Safoora Yousefi, Akshay Krishnamurthy, John Langford, Jordan T. Ash

Active learning is perhaps most naturally posed as an online learning problem.

Active Learning


Learning Hidden Markov Models Using Conditional Samples

no code implementations • 28 Feb 2023 • Sham M. Kakade, Akshay Krishnamurthy, Gaurav Mahajan, Cyril Zhang

In this paper, we depart from this setup and consider an interactive access model, in which the algorithm can query for samples from the conditional distributions of the HMMs.

Time Series Time Series Analysis


Statistical Learning under Heterogeneous Distribution Shift

no code implementations • 27 Feb 2023 • Max Simchowitz, Anurag Ajay, Pulkit Agrawal, Akshay Krishnamurthy

We show that, when the class $F$ is "simpler" than $G$ (measured, e.g., in terms of its metric entropy), our predictor is more resilient to heterogeneous covariate shifts in which the shift in $\mathbf{x}$ is much greater than that in $\mathbf{y}$.


Transformers Learn Shortcuts to Automata

no code implementations • 19 Oct 2022 • Bingbin Liu, Jordan T. Ash, Surbhi Goel, Akshay Krishnamurthy, Cyril Zhang

Algorithmic reasoning requires capabilities which are most naturally understood through recurrent models of computation, like the Turing machine.


Hybrid RL: Using Both Offline and Online Data Can Make RL Efficient

1 code implementation • 13 Oct 2022 • Yuda Song, Yifei Zhou, Ayush Sekhari, J. Andrew Bagnell, Akshay Krishnamurthy, Wen Sun

We consider a hybrid reinforcement learning setting (Hybrid RL), in which an agent has access to an offline dataset and the ability to collect experience via real-world online interaction.

Montezuma's Revenge Q-Learning


Guaranteed Discovery of Control-Endogenous Latent States with Multi-Step Inverse Models

no code implementations • 17 Jul 2022 • Alex Lamb, Riashat Islam, Yonathan Efroni, Aniket Didolkar, Dipendra Misra, Dylan Foster, Lekan Molu, Rajan Chari, Akshay Krishnamurthy, John Langford

In many sequential decision-making tasks, the agent is not able to model the full complexity of the world, which consists of multitudes of relevant and irrelevant information.

Decision Making


On the Statistical Efficiency of Reward-Free Exploration in Non-Linear RL

no code implementations • 21 Jun 2022 • Jinglin Chen, Aditya Modi, Akshay Krishnamurthy, Nan Jiang, Alekh Agarwal

We study reward-free reinforcement learning (RL) under general non-linear function approximation, and establish sample efficiency and hardness results under various standard structural assumptions.

Reinforcement Learning (RL)


Sample-Efficient Reinforcement Learning in the Presence of Exogenous Information

no code implementations • 9 Jun 2022 • Yonathan Efroni, Dylan J. Foster, Dipendra Misra, Akshay Krishnamurthy, John Langford

In real-world reinforcement learning applications the learner's observation space is ubiquitously high-dimensional with both relevant and irrelevant information about the task at hand.

reinforcement-learning Reinforcement Learning (RL)


A Complete Characterization of Linear Estimators for Offline Policy Evaluation

no code implementations • 8 Mar 2022 • Juan C. Perdomo, Akshay Krishnamurthy, Peter Bartlett, Sham Kakade

Offline policy evaluation is a fundamental statistical problem in reinforcement learning that involves estimating the value function of some decision-making policy given data collected by a potentially different policy.

Decision Making reinforcement-learning +1


Understanding Contrastive Learning Requires Incorporating Inductive Biases

no code implementations • 28 Feb 2022 • Nikunj Saunshi, Jordan Ash, Surbhi Goel, Dipendra Misra, Cyril Zhang, Sanjeev Arora, Sham Kakade, Akshay Krishnamurthy

Contrastive learning is a popular form of self-supervised learning that encourages augmentations (views) of the same input to have more similar representations compared to augmentations of different inputs.

Contrastive Learning Self-Supervised Learning


Provable Reinforcement Learning with a Short-Term Memory

no code implementations • 8 Feb 2022 • Yonathan Efroni, Chi Jin, Akshay Krishnamurthy, Sobhan Miryoosefi

Real-world sequential decision making problems commonly involve partial observability, which requires the agent to maintain a memory of history in order to infer the latent states, plan and make good decisions.

Decision Making reinforcement-learning +1


Efficient and Optimal Algorithms for Contextual Dueling Bandits under Realizability

no code implementations • 24 Nov 2021 • Aadirupa Saha, Akshay Krishnamurthy

We study the $K$-armed contextual dueling bandit problem, a sequential decision making setting in which the learner uses contextual information to make two decisions, but only observes preference-based feedback suggesting that one decision was better than the other.

Decision Making


Offline Reinforcement Learning: Fundamental Barriers for Value Function Approximation

no code implementations • 21 Nov 2021 • Dylan J. Foster, Akshay Krishnamurthy, David Simchi-Levi, Yunzong Xu

This led Chen and Jiang (2019) to conjecture that concentrability (the most standard notion of coverage) and realizability (the weakest representation condition) alone are not sufficient for sample-efficient offline RL.

Decision Making Offline RL +2


Universal and data-adaptive algorithms for model selection in linear contextual bandits

no code implementations • 8 Nov 2021 • Vidya Muthukumar, Akshay Krishnamurthy

In this paper, we introduce new algorithms that a) explore in a data-adaptive manner, and b) provide model selection guarantees of the form $\mathcal{O}(d^{\alpha} T^{1- \alpha})$ with no feature diversity conditions whatsoever, where $d$ denotes the dimension of the linear model and $T$ denotes the total number of rounds.

Model Selection Multi-Armed Bandits


Anti-Concentrated Confidence Bonuses for Scalable Exploration

no code implementations • ICLR 2022 • Jordan T. Ash, Cyril Zhang, Surbhi Goel, Akshay Krishnamurthy, Sham Kakade

Intrinsic rewards play a central role in handling the exploration-exploitation trade-off when designing sequential decision-making algorithms, in both foundational theory and state-of-the-art deep reinforcement learning.

Decision Making reinforcement-learning +1


Provable RL with Exogenous Distractors via Multistep Inverse Dynamics

no code implementations • 17 Oct 2021 • Yonathan Efroni, Dipendra Misra, Akshay Krishnamurthy, Alekh Agarwal, John Langford

We initiate the formal study of latent state discovery in the presence of such exogenous noise sources by proposing a new model, the Exogenous Block MDP (EX-BMDP), for rich observation RL.

Reinforcement Learning (RL) Representation Learning


Sparsity in Partially Controllable Linear Systems

no code implementations • 12 Oct 2021 • Yonathan Efroni, Sham Kakade, Akshay Krishnamurthy, Cyril Zhang

However, in practice, we often encounter systems in which a large set of state variables evolve exogenously and independently of the control inputs; such systems are only partially controllable.


Provably Filtering Exogenous Distractors using Multistep Inverse Dynamics

no code implementations • ICLR 2022 • Yonathan Efroni, Dipendra Misra, Akshay Krishnamurthy, Alekh Agarwal, John Langford

We initiate the formal study of latent state discovery in the presence of such exogenous noise sources by proposing a new model, the Exogenous Block MDP (EX-BMDP), for rich observation RL.

Reinforcement Learning (RL) Representation Learning


Efficient First-Order Contextual Bandits: Prediction, Allocation, and Triangular Discrimination

no code implementations • NeurIPS 2021 • Dylan J. Foster, Akshay Krishnamurthy

A recurring theme in statistical learning, online learning, and beyond is that faster convergence rates are possible for problems with low noise, often quantified by the performance of the best hypothesis; such results are known as first-order or small-loss guarantees.

Decision Making Multi-Armed Bandits +1


Bayesian decision-making under misspecified priors with applications to meta-learning

no code implementations • NeurIPS 2021 • Max Simchowitz, Christopher Tosh, Akshay Krishnamurthy, Daniel Hsu, Thodoris Lykouris, Miroslav Dudík, Robert E. Schapire

We prove that the expected reward accrued by Thompson sampling (TS) with a misspecified prior differs by at most $\tilde{\mathcal{O}}(H^2 \epsilon)$ from TS with a well specified prior, where $\epsilon$ is the total-variation distance between priors and $H$ is the learning horizon.

Decision Making Meta-Learning +2


Investigating the Role of Negatives in Contrastive Representation Learning

no code implementations • 18 Jun 2021 • Jordan T. Ash, Surbhi Goel, Akshay Krishnamurthy, Dipendra Misra

We focus on disambiguating the role of one of these parameters: the number of negative examples.

Contrastive Learning Data Augmentation +3


Gone Fishing: Neural Active Learning with Fisher Embeddings

1 code implementation • NeurIPS 2021 • Jordan T. Ash, Surbhi Goel, Akshay Krishnamurthy, Sham Kakade

There is an increasing need for effective active learning algorithms that are compatible with deep neural networks.

Active Learning


Model-free Representation Learning and Exploration in Low-rank MDPs

no code implementations • 14 Feb 2021 • Aditya Modi, Jinglin Chen, Akshay Krishnamurthy, Nan Jiang, Alekh Agarwal

In this work, we present the first model-free representation learning algorithms for low rank MDPs.

Reinforcement Learning (RL) Representation Learning


Learning the Linear Quadratic Regulator from Nonlinear Observations

no code implementations • NeurIPS 2020 • Zakaria Mhammedi, Dylan J. Foster, Max Simchowitz, Dipendra Misra, Wen Sun, Akshay Krishnamurthy, Alexander Rakhlin, John Langford

We introduce a new algorithm, RichID, which learns a near-optimal policy for the RichLQR with sample complexity scaling only with the dimension of the latent state space and the capacity of the decoder function class.

Continuous Control


Private Reinforcement Learning with PAC and Regret Guarantees

no code implementations • 18 Sep 2020 • Giuseppe Vietri, Borja Balle, Akshay Krishnamurthy, Zhiwei Steven Wu

Decision Making Privacy Preserving +2


Contrastive learning, multi-view redundancy, and linear models

no code implementations • 24 Aug 2020 • Christopher Tosh, Akshay Krishnamurthy, Daniel Hsu

Self-supervised learning is an empirically successful approach to unsupervised learning based on creating artificial supervised learning problems.

Contrastive Learning Representation Learning +1


Sample-Efficient Reinforcement Learning of Undercomplete POMDPs

no code implementations • NeurIPS 2020 • Chi Jin, Sham M. Kakade, Akshay Krishnamurthy, Qinghua Liu

Partial observability is a common challenge in many reinforcement learning applications, which requires an agent to maintain memory, infer latent states, and integrate this past information into exploration.

reinforcement-learning Reinforcement Learning (RL)


Information Theoretic Regret Bounds for Online Nonlinear Control

1 code implementation • NeurIPS 2020 • Sham Kakade, Akshay Krishnamurthy, Kendall Lowrey, Motoya Ohnishi, Wen Sun

This work studies the problem of sequential control in an unknown, nonlinear dynamical system, where we model the underlying system dynamics as an unknown function in a known Reproducing Kernel Hilbert Space.

Continuous Control


Open Problem: Model Selection for Contextual Bandits

no code implementations • 19 Jun 2020 • Dylan J. Foster, Akshay Krishnamurthy, Haipeng Luo

In statistical learning, algorithms for model selection allow the learner to adapt to the complexity of the best hypothesis class in a sequence.

Model Selection Multi-Armed Bandits


Provably adaptive reinforcement learning in metric spaces

no code implementations • NeurIPS 2020 • Tongyi Cao, Akshay Krishnamurthy

We study reinforcement learning in continuous state and action spaces endowed with a metric.

reinforcement-learning Reinforcement Learning (RL)


FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs

no code implementations • NeurIPS 2020 • Alekh Agarwal, Sham Kakade, Akshay Krishnamurthy, Wen Sun

In order to deal with the curse of dimensionality in reinforcement learning (RL), it is common practice to make parametric assumptions where values or policies are functions of some low dimensional feature space.

reinforcement-learning Reinforcement Learning (RL) +1


Efficient Contextual Bandits with Continuous Actions

1 code implementation • NeurIPS 2020 • Maryam Majzoubi, Chicheng Zhang, Rajan Chari, Akshay Krishnamurthy, John Langford, Aleksandrs Slivkins

We create a computationally tractable algorithm for contextual bandits with continuous actions having unknown structure.

Multi-Armed Bandits


Contrastive estimation reveals topic posterior information to linear models

no code implementations • 4 Mar 2020 • Christopher Tosh, Akshay Krishnamurthy, Daniel Hsu

Contrastive learning is an approach to representation learning that utilizes naturally occurring similar and dissimilar pairs of data points to find useful embeddings of data.

Classification Contrastive Learning +3


Contextual Search in the Presence of Adversarial Corruptions

no code implementations • 26 Feb 2020 • Akshay Krishnamurthy, Thodoris Lykouris, Chara Podimata, Robert Schapire

We initiate the study of contextual search when some of the agents can behave in ways inconsistent with the underlying response model.

Learning Theory


Adaptive Estimator Selection for Off-Policy Evaluation

1 code implementation • ICML 2020 • Yi Su, Pavithra Srinath, Akshay Krishnamurthy

We develop a generic data-driven method for estimator selection in off-policy policy evaluation settings.

Multi-Armed Bandits Off-policy evaluation +1


Reward-Free Exploration for Reinforcement Learning

no code implementations • ICML 2020 • Chi Jin, Akshay Krishnamurthy, Max Simchowitz, Tiancheng Yu

We give an efficient algorithm that conducts $\tilde{\mathcal{O}}(S^2A\mathrm{poly}(H)/\epsilon^2)$ episodes of exploration and returns $\epsilon$-suboptimal policies for an arbitrary number of reward functions.

reinforcement-learning Reinforcement Learning (RL)


Algebraic and Analytic Approaches for Parameter Learning in Mixture Models

no code implementations • 19 Jan 2020 • Akshay Krishnamurthy, Arya Mazumdar, Andrew McGregor, Soumyabrata Pal

Our second approach uses algebraic and combinatorial tools and applies to binomial mixtures with shared trial parameter $N$ and differing success parameters, as well as to mixtures of geometric distributions.


Scalable Hierarchical Clustering with Tree Grafting

1 code implementation • 31 Dec 2019 • Nicholas Monath, Ari Kobren, Akshay Krishnamurthy, Michael Glass, Andrew McCallum

We introduce Grinch, a new algorithm for large-scale, non-greedy hierarchical clustering with general linkage functions that compute arbitrary similarity between two point sets.

Clustering


Optimism in Reinforcement Learning with Generalized Linear Function Approximation

no code implementations • ICLR 2021 • Yining Wang, Ruosong Wang, Simon S. Du, Akshay Krishnamurthy

We design a new provably efficient algorithm for episodic reinforcement learning with generalized linear function approximation.

reinforcement-learning Reinforcement Learning (RL)


Sample Complexity of Learning Mixture of Sparse Linear Regressions

no code implementations • NeurIPS 2019 • Akshay Krishnamurthy, Arya Mazumdar, Andrew McGregor, Soumyabrata Pal

Our techniques are quite different from those in the previous work: for the noiseless case, we rely on a property of sparse polynomials and for the noisy case, we provide new connections to learning Gaussian mixtures and use ideas from the theory of

Open-Ended Question Answering


Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning

no code implementations • ICML 2020 • Dipendra Misra, Mikael Henaff, Akshay Krishnamurthy, John Langford

We present an algorithm, HOMER, for exploration and reinforcement learning in rich observation environments that are summarizable by an unknown latent state space.

reinforcement-learning Reinforcement Learning (RL) +1


Sample Complexity of Learning Mixtures of Sparse Linear Regressions

no code implementations • 30 Oct 2019 • Akshay Krishnamurthy, Arya Mazumdar, Andrew McGregor, Soumyabrata Pal

In the problem of learning mixtures of linear regressions, the goal is to learn a collection of signal vectors from a sequence of (possibly noisy) linear measurements, where each measurement is evaluated on an unknown signal drawn uniformly from this collection.

Open-Ended Question Answering


Robust Dynamic Assortment Optimization in the Presence of Outlier Customers

no code implementations • 9 Oct 2019 • Xi Chen, Akshay Krishnamurthy, Yining Wang

We establish both upper and lower bounds on the regret, and show that our policy is optimal up to a logarithmic factor in $T$ when the assortment capacity is constant.

Thompson Sampling


Doubly robust off-policy evaluation with shrinkage

no code implementations • ICML 2020 • Yi Su, Maria Dimakopoulou, Akshay Krishnamurthy, Miroslav Dudík

We propose a new framework for designing estimators for off-policy evaluation in contextual bandits.

Model Selection Multi-Armed Bandits +1


Deep Batch Active Learning by Diverse, Uncertain Gradient Lower Bounds

4 code implementations • ICLR 2020 • Jordan T. Ash, Chicheng Zhang, Akshay Krishnamurthy, John Langford, Alekh Agarwal

We design a new algorithm for batch active learning with deep neural network models.

Active Learning


Model selection for contextual bandits

1 code implementation • NeurIPS 2019 • Dylan J. Foster, Akshay Krishnamurthy, Haipeng Luo

We work in the stochastic realizable setting with a sequence of nested linear policy classes of dimension $d_1 < d_2 < \ldots$, where the $m^\star$-th class contains the optimal policy, and we design an algorithm that achieves $\tilde{O}(T^{2/3}d^{1/3}_{m^\star})$ regret with no prior knowledge of the optimal dimension $d_{m^\star}$.

Model Selection Multi-Armed Bandits


Contextual Bandits with Continuous Actions: Smoothing, Zooming, and Adapting

no code implementations • 5 Feb 2019 • Akshay Krishnamurthy, John Langford, Aleksandrs Slivkins, Chicheng Zhang

We study contextual bandit learning with an abstract policy class and continuous action space.

Multi-Armed Bandits


Provably efficient RL with Rich Observations via Latent State Decoding

1 code implementation • 25 Jan 2019 • Simon S. Du, Akshay Krishnamurthy, Nan Jiang, Alekh Agarwal, Miroslav Dudík, John Langford

We study the exploration problem in episodic MDPs with rich observations generated from a small number of latent states.

Clustering Q-Learning +1


Model-based RL in Contextual Decision Processes: PAC bounds and Exponential Improvements over Model-free Approaches

no code implementations • 21 Nov 2018 • Wen Sun, Nan Jiang, Akshay Krishnamurthy, Alekh Agarwal, John Langford

We study the sample complexity of model-based reinforcement learning (henceforth RL) in general contextual decision processes that require strategic exploration to find a near-optimal policy.

Model-based Reinforcement Learning


Contextual bandits with surrogate losses: Margin bounds and efficient algorithms

no code implementations • NeurIPS 2018 • Dylan J. Foster, Akshay Krishnamurthy

We use surrogate losses to obtain several new regret bounds and new algorithms for contextual bandit learning.

Multi-Armed Bandits regression


Myopic Bayesian Design of Experiments via Posterior Sampling and Probabilistic Programming

1 code implementation • 25 May 2018 • Kirthevasan Kandasamy, Willie Neiswanger, Reed Zhang, Akshay Krishnamurthy, Jeff Schneider, Barnabas Poczos

We design a new myopic strategy for a wide class of sequential design of experiment (DOE) problems, where the goal is to collect data in order to fulfil a certain problem-specific goal.

Multi-Armed Bandits Probabilistic Programming +2


Semiparametric Contextual Bandits

2 code implementations • ICML 2018 • Akshay Krishnamurthy, Zhiwei Steven Wu, Vasilis Syrgkanis

This paper studies semiparametric contextual bandits, a generalization of the linear stochastic bandit problem where the reward for an action is modeled as a linear function of known action features confounded by a non-linear action-independent term.

Multi-Armed Bandits


On Oracle-Efficient PAC RL with Rich Observations

no code implementations • NeurIPS 2018 • Christoph Dann, Nan Jiang, Akshay Krishnamurthy, Alekh Agarwal, John Langford, Robert E. Schapire

We study the computational tractability of PAC reinforcement learning with rich observations.

reinforcement-learning Reinforcement Learning (RL)


Disagreement-Based Combinatorial Pure Exploration: Sample Complexity Bounds and an Efficient Algorithm

no code implementations • 21 Nov 2017 • Tongyi Cao, Akshay Krishnamurthy

We design new algorithms for the combinatorial pure exploration problem in the multi-arm bandit framework.


Go for a Walk and Arrive at the Answer: Reasoning Over Paths in Knowledge Bases using Reinforcement Learning

7 code implementations • ICLR 2018 • Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Luke Vilnis, Ishan Durugkar, Akshay Krishnamurthy, Alex Smola, Andrew McCallum

Knowledge bases (KB), both automatically and manually constructed, are often incomplete --- many valid facts can be inferred from the KB by synthesizing existing information.

Navigate Relation +1


Asynchronous Parallel Bayesian Optimisation via Thompson Sampling

1 code implementation • 25 May 2017 • Kirthevasan Kandasamy, Akshay Krishnamurthy, Jeff Schneider, Barnabas Poczos

We design and analyse variations of the classical Thompson sampling (TS) procedure for Bayesian optimisation (BO) in settings where function evaluations are expensive, but can be performed in parallel.

Bayesian Optimisation Thompson Sampling


An Online Hierarchical Algorithm for Extreme Clustering

2 code implementations • 6 Apr 2017 • Ari Kobren, Nicholas Monath, Akshay Krishnamurthy, Andrew McCallum

Many modern clustering methods scale well to a large number of data items, N, but not to a large number of clusters, K. This paper introduces PERCH, a new non-greedy algorithm for online hierarchical clustering that scales to both massive N and K--a problem setting we term extreme clustering.

Clustering


Active Learning for Cost-Sensitive Classification

no code implementations • ICML 2017 • Akshay Krishnamurthy, Alekh Agarwal, Tzu-Kuo Huang, Hal Daumé III, John Langford

We design an active learning algorithm for cost-sensitive multiclass classification: problems where different errors have different costs.

Active Learning Classification +2


Contextual Decision Processes with Low Bellman Rank are PAC-Learnable

no code implementations • ICML 2017 • Nan Jiang, Akshay Krishnamurthy, Alekh Agarwal, John Langford, Robert E. Schapire

Our first contribution is a complexity measure, the Bellman rank, that we show enables tractable learning of near-optimal behavior in these processes and is naturally small for many well-studied reinforcement learning settings.

Efficient Exploration reinforcement-learning +1


Improved Regret Bounds for Oracle-Based Adversarial Contextual Bandits

no code implementations • NeurIPS 2016 • Vasilis Syrgkanis, Haipeng Luo, Akshay Krishnamurthy, Robert E. Schapire

We give an oracle-based algorithm for the adversarial contextual bandit problem, where either contexts are drawn i.i.d.

Multi-Armed Bandits


Off-policy evaluation for slate recommendation

1 code implementation • NeurIPS 2017 • Adith Swaminathan, Akshay Krishnamurthy, Alekh Agarwal, Miroslav Dudík, John Langford, Damien Jose, Imed Zitouni

This paper studies the evaluation of policies that recommend an ordered set of items (e.g., a ranking) based on some context---a common scenario in web search, ads, and recommendation.

Learning-To-Rank Off-policy evaluation


Exploratory Gradient Boosting for Reinforcement Learning in Complex Domains

1 code implementation • 14 Mar 2016 • David Abel, Alekh Agarwal, Fernando Diaz, Akshay Krishnamurthy, Robert E. Schapire

We address both of these challenges with two complementary techniques: First, we develop a gradient-boosting style, non-parametric function approximator for learning on $Q$-function residuals.

reinforcement-learning Reinforcement Learning (RL)


PAC Reinforcement Learning with Rich Observations

no code implementations • NeurIPS 2016 • Akshay Krishnamurthy, Alekh Agarwal, John Langford

We prove that the algorithm learns near optimal behavior after a number of episodes that is polynomial in all relevant parameters, logarithmic in the number of policies, and independent of the size of the observation space.

Decision Making Multi-Armed Bandits +2


Efficient Algorithms for Adversarial Contextual Learning

no code implementations • 8 Feb 2016 • Vasilis Syrgkanis, Akshay Krishnamurthy, Robert E. Schapire

We provide the first oracle efficient sublinear regret algorithms for adversarial versions of the contextual bandit problem.

Combinatorial Optimization


Nonparametric von Mises Estimators for Entropies, Divergences and Mutual Informations

no code implementations • NeurIPS 2015 • Kirthevasan Kandasamy, Akshay Krishnamurthy, Barnabas Poczos, Larry Wasserman, James M. Robins

We propose and analyse estimators for statistical functionals of one or more distributions under nonparametric assumptions. Our estimators are derived from the von Mises expansion and are based on the theory of influence functions, which appear in the semiparametric statistics literature. We show that estimators based either on data-splitting or a leave-one-out technique enjoy fast rates of convergence and other favorable theoretical properties. We apply this framework to derive estimators for several popular information theoretic quantities, and via empirical evaluation, show the advantage of this approach over existing estimators.


Minimax Structured Normal Means Inference

no code implementations • 25 Jun 2015 • Akshay Krishnamurthy

We establish nearly matching upper and lower bounds on the minimax probability of error for any structured normal means problem, and we derive an optimality certificate for the maximum likelihood estimator, which can be applied to many instantiations.

Experimental Design


Extreme Compressive Sampling for Covariance Estimation

no code implementations • 2 Jun 2015 • Martin Azizyan, Akshay Krishnamurthy, Aarti Singh

This paper studies the problem of estimating the covariance of a collection of vectors using only highly compressed measurements of each vector.


Contextual Semibandits via Supervised Learning Oracles

1 code implementation • NeurIPS 2016 • Akshay Krishnamurthy, Alekh Agarwal, Miroslav Dudík

We study an online decision making problem where on each round a learner chooses a list of items based on some side information, receives a scalar feedback value for each individual item, and a reward that is linearly related to this feedback.

Decision Making Learning-To-Rank


Learning to Search Better Than Your Teacher

no code implementations • 8 Feb 2015 • Kai-Wei Chang, Akshay Krishnamurthy, Alekh Agarwal, Hal Daumé III, John Langford

Methods for learning to search for structured prediction typically imitate a reference policy, with existing theoretical guarantees demonstrating low regret compared to that reference.

Multi-Armed Bandits Structured Prediction


Influence Functions for Machine Learning: Nonparametric Estimators for Entropies, Divergences and Mutual Informations

2 code implementations • 17 Nov 2014 • Kirthevasan Kandasamy, Akshay Krishnamurthy, Barnabas Poczos, Larry Wasserman, James M. Robins

We propose and analyze estimators for statistical functionals of one or more distributions under nonparametric assumptions.

BIG-bench Machine Learning


On Estimating $L_2^2$ Divergence

no code implementations • 30 Oct 2014 • Akshay Krishnamurthy, Kirthevasan Kandasamy, Barnabas Poczos, Larry Wasserman

We give a comprehensive theoretical characterization of a nonparametric estimator for the $L_2^2$ divergence between two continuous distributions.


On the Power of Adaptivity in Matrix Completion and Approximation

no code implementations • 14 Jul 2014 • Akshay Krishnamurthy, Aarti Singh

We show that adaptive sampling allows one to eliminate standard incoherence assumptions on the matrix row space that are necessary for passive sampling procedures.

Matrix Completion


Subspace Learning from Extremely Compressed Measurements

no code implementations • 3 Apr 2014 • Akshay Krishnamurthy, Martin Azizyan, Aarti Singh

Our theoretical results show that even a constant number of measurements per column suffices to approximate the principal subspace to arbitrary precision, provided that the number of vectors is large.


Nonparametric Estimation of Renyi Divergence and Friends

no code implementations • 12 Feb 2014 • Akshay Krishnamurthy, Kirthevasan Kandasamy, Barnabas Poczos, Larry Wasserman

We consider nonparametric estimation of $L_2$, Renyi-$\alpha$ and Tsallis-$\alpha$ divergences between continuous distributions.


Near-optimal Anomaly Detection in Graphs using Lovasz Extended Scan Statistic

no code implementations • NeurIPS 2013 • James Sharpnack, Akshay Krishnamurthy, Aarti Singh

The detection of anomalous activity in graphs is a statistical problem that arises in many applications, such as network surveillance, disease outbreak detection, and activity monitoring in social networks.

Anomaly Detection


Recovering Graph-Structured Activations using Adaptive Compressive Measurements

no code implementations • 1 May 2013 • Akshay Krishnamurthy, James Sharpnack, Aarti Singh

We study the localization of a cluster of activated vertices in a graph, from adaptively designed compressive measurements.


Low-Rank Matrix and Tensor Completion via Adaptive Sampling

no code implementations • NeurIPS 2013 • Akshay Krishnamurthy, Aarti Singh

In the absence of noise, we show that one can exactly recover an $n \times n$ matrix of rank $r$ from merely $\Omega(n r^{3/2}\log(r))$ matrix entries.


Noise Thresholds for Spectral Clustering

no code implementations • NeurIPS 2011 • Sivaraman Balakrishnan, Min Xu, Akshay Krishnamurthy, Aarti Singh

Although spectral clustering has enjoyed considerable empirical success in machine learning, its theoretical properties are not yet fully developed.

Clustering

