Search Results for author: Akshay Krishnamurthy

Found 67 papers, 15 papers with code

Reinforcement Learning with Differential Privacy

no code implementations ICML 2020 Giuseppe Vietri, Borja de Balle Pigem, Steven Wu, Akshay Krishnamurthy

Motivated by high-stakes decision-making domains like personalized medicine where user information is inherently sensitive, we design privacy preserving exploration policies for episodic reinforcement learning (RL).

Decision Making

Efficient and Optimal Algorithms for Contextual Dueling Bandits under Realizability

no code implementations 24 Nov 2021 Aadirupa Saha, Akshay Krishnamurthy

We study the $K$-armed contextual dueling bandit problem, a sequential decision making setting in which the learner uses contextual information to make two decisions, but only observes preference-based feedback suggesting that one decision was better than the other.

Decision Making

Offline Reinforcement Learning: Fundamental Barriers for Value Function Approximation

no code implementations 21 Nov 2021 Dylan J. Foster, Akshay Krishnamurthy, David Simchi-Levi, Yunzong Xu

This led Chen and Jiang (2019) to conjecture that concentrability (the most standard notion of coverage) and realizability (the weakest representation condition) alone are not sufficient for sample-efficient offline RL.

Decision Making Offline RL

Universal and data-adaptive algorithms for model selection in linear contextual bandits

no code implementations 8 Nov 2021 Vidya Muthukumar, Akshay Krishnamurthy

In this paper, we introduce new algorithms that a) explore in a data-adaptive manner, and b) provide model selection guarantees of the form $\mathcal{O}(d^{\alpha} T^{1- \alpha})$ with no feature diversity conditions whatsoever, where $d$ denotes the dimension of the linear model and $T$ denotes the total number of rounds.

Model Selection Multi-Armed Bandits

Anti-Concentrated Confidence Bonuses for Scalable Exploration

no code implementations 21 Oct 2021 Jordan T. Ash, Cyril Zhang, Surbhi Goel, Akshay Krishnamurthy, Sham Kakade

Intrinsic rewards play a central role in handling the exploration-exploitation trade-off when designing sequential decision-making algorithms, in both foundational theory and state-of-the-art deep reinforcement learning.

Decision Making

Provable RL with Exogenous Distractors via Multistep Inverse Dynamics

no code implementations 17 Oct 2021 Yonathan Efroni, Dipendra Misra, Akshay Krishnamurthy, Alekh Agarwal, John Langford

We initiate the formal study of latent state discovery in the presence of such exogenous noise sources by proposing a new model, the Exogenous Block MDP (EX-BMDP), for rich observation RL.

Representation Learning

Sparsity in Partially Controllable Linear Systems

no code implementations 12 Oct 2021 Yonathan Efroni, Sham Kakade, Akshay Krishnamurthy, Cyril Zhang

However, in practice, we often encounter systems in which a large set of state variables evolve exogenously and independently of the control inputs; such systems are only partially controllable.

Efficient First-Order Contextual Bandits: Prediction, Allocation, and Triangular Discrimination

no code implementations NeurIPS 2021 Dylan J. Foster, Akshay Krishnamurthy

A recurring theme in statistical learning, online learning, and beyond is that faster convergence rates are possible for problems with low noise, often quantified by the performance of the best hypothesis; such results are known as first-order or small-loss guarantees.

Decision Making Multi-Armed Bandits

Bayesian decision-making under misspecified priors with applications to meta-learning

no code implementations NeurIPS 2021 Max Simchowitz, Christopher Tosh, Akshay Krishnamurthy, Daniel Hsu, Thodoris Lykouris, Miroslav Dudík, Robert E. Schapire

We prove that the expected reward accrued by Thompson sampling (TS) with a misspecified prior differs by at most $\tilde{\mathcal{O}}(H^2 \epsilon)$ from TS with a well specified prior, where $\epsilon$ is the total-variation distance between priors and $H$ is the learning horizon.

Decision Making Meta-Learning +1

Gone Fishing: Neural Active Learning with Fisher Embeddings

no code implementations NeurIPS 2021 Jordan T. Ash, Surbhi Goel, Akshay Krishnamurthy, Sham Kakade

There is an increasing need for effective active learning algorithms that are compatible with deep neural networks.

Active Learning

Learning the Linear Quadratic Regulator from Nonlinear Observations

no code implementations NeurIPS 2020 Zakaria Mhammedi, Dylan J. Foster, Max Simchowitz, Dipendra Misra, Wen Sun, Akshay Krishnamurthy, Alexander Rakhlin, John Langford

We introduce a new algorithm, RichID, which learns a near-optimal policy for the RichLQR with sample complexity scaling only with the dimension of the latent state space and the capacity of the decoder function class.

Continuous Control

Private Reinforcement Learning with PAC and Regret Guarantees

no code implementations 18 Sep 2020 Giuseppe Vietri, Borja Balle, Akshay Krishnamurthy, Zhiwei Steven Wu

Motivated by high-stakes decision-making domains like personalized medicine where user information is inherently sensitive, we design privacy preserving exploration policies for episodic reinforcement learning (RL).

Decision Making

Contrastive learning, multi-view redundancy, and linear models

no code implementations 24 Aug 2020 Christopher Tosh, Akshay Krishnamurthy, Daniel Hsu

Self-supervised learning is an empirically successful approach to unsupervised learning based on creating artificial supervised learning problems.

Contrastive Learning Representation Learning +1

Sample-Efficient Reinforcement Learning of Undercomplete POMDPs

no code implementations NeurIPS 2020 Chi Jin, Sham M. Kakade, Akshay Krishnamurthy, Qinghua Liu

Partial observability is a common challenge in many reinforcement learning applications, which requires an agent to maintain memory, infer latent states, and integrate this past information into exploration.

Information Theoretic Regret Bounds for Online Nonlinear Control

1 code implementation NeurIPS 2020 Sham Kakade, Akshay Krishnamurthy, Kendall Lowrey, Motoya Ohnishi, Wen Sun

This work studies the problem of sequential control in an unknown, nonlinear dynamical system, where we model the underlying system dynamics as an unknown function in a known Reproducing Kernel Hilbert Space.

Continuous Control

Open Problem: Model Selection for Contextual Bandits

no code implementations 19 Jun 2020 Dylan J. Foster, Akshay Krishnamurthy, Haipeng Luo

In statistical learning, algorithms for model selection allow the learner to adapt to the complexity of the best hypothesis class in a sequence.

Model Selection Multi-Armed Bandits

Provably adaptive reinforcement learning in metric spaces

no code implementations NeurIPS 2020 Tongyi Cao, Akshay Krishnamurthy

We study reinforcement learning in continuous state and action spaces endowed with a metric.

FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs

no code implementations NeurIPS 2020 Alekh Agarwal, Sham Kakade, Akshay Krishnamurthy, Wen Sun

In order to deal with the curse of dimensionality in reinforcement learning (RL), it is common practice to make parametric assumptions where values or policies are functions of some low dimensional feature space.

Latent Variable Models Representation Learning

Contrastive estimation reveals topic posterior information to linear models

no code implementations 4 Mar 2020 Christopher Tosh, Akshay Krishnamurthy, Daniel Hsu

Contrastive learning is an approach to representation learning that utilizes naturally occurring similar and dissimilar pairs of data points to find useful embeddings of data.

Classification Contrastive Learning +3

Contextual Search in the Presence of Irrational Agents

no code implementations 26 Feb 2020 Akshay Krishnamurthy, Thodoris Lykouris, Chara Podimata, Robert Schapire

We initiate the study of contextual search when some of the agents can behave in ways inconsistent with the underlying behavioral model.

Learning Theory

Adaptive Estimator Selection for Off-Policy Evaluation

1 code implementation ICML 2020 Yi Su, Pavithra Srinath, Akshay Krishnamurthy

We develop a generic data-driven method for estimator selection in off-policy policy evaluation settings.

Multi-Armed Bandits

Reward-Free Exploration for Reinforcement Learning

no code implementations ICML 2020 Chi Jin, Akshay Krishnamurthy, Max Simchowitz, Tiancheng Yu

We give an efficient algorithm that conducts $\tilde{\mathcal{O}}(S^2A\mathrm{poly}(H)/\epsilon^2)$ episodes of exploration and returns $\epsilon$-suboptimal policies for an arbitrary number of reward functions.

Algebraic and Analytic Approaches for Parameter Learning in Mixture Models

no code implementations 19 Jan 2020 Akshay Krishnamurthy, Arya Mazumdar, Andrew McGregor, Soumyabrata Pal

Our second approach uses algebraic and combinatorial tools and applies to binomial mixtures with shared trial parameter $N$ and differing success parameters, as well as to mixtures of geometric distributions.

Scalable Hierarchical Clustering with Tree Grafting

1 code implementation 31 Dec 2019 Nicholas Monath, Ari Kobren, Akshay Krishnamurthy, Michael Glass, Andrew McCallum

We introduce Grinch, a new algorithm for large-scale, non-greedy hierarchical clustering with general linkage functions that compute arbitrary similarity between two point sets.

Optimism in Reinforcement Learning with Generalized Linear Function Approximation

no code implementations ICLR 2021 Yining Wang, Ruosong Wang, Simon S. Du, Akshay Krishnamurthy

We design a new provably efficient algorithm for episodic reinforcement learning with generalized linear function approximation.

Sample Complexity of Learning Mixture of Sparse Linear Regressions

no code implementations NeurIPS 2019 Akshay Krishnamurthy, Arya Mazumdar, Andrew McGregor, Soumyabrata Pal

Our techniques are quite different from those in the previous work: for the noiseless case, we rely on a property of sparse polynomials and for the noisy case, we provide new connections to learning Gaussian mixtures and use ideas from the theory of

Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning

no code implementations ICML 2020 Dipendra Misra, Mikael Henaff, Akshay Krishnamurthy, John Langford

We present an algorithm, HOMER, for exploration and reinforcement learning in rich observation environments that are summarizable by an unknown latent state space.

Representation Learning

Sample Complexity of Learning Mixtures of Sparse Linear Regressions

no code implementations 30 Oct 2019 Akshay Krishnamurthy, Arya Mazumdar, Andrew McGregor, Soumyabrata Pal

In the problem of learning mixtures of linear regressions, the goal is to learn a collection of signal vectors from a sequence of (possibly noisy) linear measurements, where each measurement is evaluated on an unknown signal drawn uniformly from this collection.

Robust Dynamic Assortment Optimization in the Presence of Outlier Customers

no code implementations 9 Oct 2019 Xi Chen, Akshay Krishnamurthy, Yining Wang

The main question investigated in this paper is model mis-specification under the $\varepsilon$-contamination model, which is a fundamental model in robust statistics and machine learning.

Model selection for contextual bandits

1 code implementation NeurIPS 2019 Dylan J. Foster, Akshay Krishnamurthy, Haipeng Luo

We work in the stochastic realizable setting with a sequence of nested linear policy classes of dimension $d_1 < d_2 < \ldots$, where the $m^\star$-th class contains the optimal policy, and we design an algorithm that achieves $\tilde{O}(T^{2/3}d^{1/3}_{m^\star})$ regret with no prior knowledge of the optimal dimension $d_{m^\star}$.

Model Selection Multi-Armed Bandits

Provably efficient RL with Rich Observations via Latent State Decoding

1 code implementation 25 Jan 2019 Simon S. Du, Akshay Krishnamurthy, Nan Jiang, Alekh Agarwal, Miroslav Dudík, John Langford

We study the exploration problem in episodic MDPs with rich observations generated from a small number of latent states.


Model-based RL in Contextual Decision Processes: PAC bounds and Exponential Improvements over Model-free Approaches

no code implementations 21 Nov 2018 Wen Sun, Nan Jiang, Akshay Krishnamurthy, Alekh Agarwal, John Langford

We study the sample complexity of model-based reinforcement learning (henceforth RL) in general contextual decision processes that require strategic exploration to find a near-optimal policy.

Model-based Reinforcement Learning

Myopic Bayesian Design of Experiments via Posterior Sampling and Probabilistic Programming

1 code implementation 25 May 2018 Kirthevasan Kandasamy, Willie Neiswanger, Reed Zhang, Akshay Krishnamurthy, Jeff Schneider, Barnabas Poczos

We design a new myopic strategy for a wide class of sequential design of experiment (DOE) problems, where the goal is to collect data in order to fulfil a certain problem-specific goal.

Multi-Armed Bandits Probabilistic Programming

Semiparametric Contextual Bandits

2 code implementations ICML 2018 Akshay Krishnamurthy, Zhiwei Steven Wu, Vasilis Syrgkanis

This paper studies semiparametric contextual bandits, a generalization of the linear stochastic bandit problem where the reward for an action is modeled as a linear function of known action features confounded by a non-linear action-independent term.

Multi-Armed Bandits

Disagreement-Based Combinatorial Pure Exploration: Sample Complexity Bounds and an Efficient Algorithm

no code implementations 21 Nov 2017 Tongyi Cao, Akshay Krishnamurthy

We design new algorithms for the combinatorial pure exploration problem in the multi-arm bandit framework.

Go for a Walk and Arrive at the Answer: Reasoning Over Paths in Knowledge Bases using Reinforcement Learning

6 code implementations ICLR 2018 Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Luke Vilnis, Ishan Durugkar, Akshay Krishnamurthy, Alex Smola, Andrew McCallum

Knowledge bases (KB), both automatically and manually constructed, are often incomplete --- many valid facts can be inferred from the KB by synthesizing existing information.

Asynchronous Parallel Bayesian Optimisation via Thompson Sampling

1 code implementation 25 May 2017 Kirthevasan Kandasamy, Akshay Krishnamurthy, Jeff Schneider, Barnabas Poczos

We design and analyse variations of the classical Thompson sampling (TS) procedure for Bayesian optimisation (BO) in settings where function evaluations are expensive, but can be performed in parallel.

Bayesian Optimisation

An Online Hierarchical Algorithm for Extreme Clustering

2 code implementations 6 Apr 2017 Ari Kobren, Nicholas Monath, Akshay Krishnamurthy, Andrew McCallum

Many modern clustering methods scale well to a large number of data items, N, but not to a large number of clusters, K. This paper introduces PERCH, a new non-greedy algorithm for online hierarchical clustering that scales to both massive N and K--a problem setting we term extreme clustering.

Active Learning for Cost-Sensitive Classification

no code implementations ICML 2017 Akshay Krishnamurthy, Alekh Agarwal, Tzu-Kuo Huang, Hal Daumé III, John Langford

We design an active learning algorithm for cost-sensitive multiclass classification: problems where different errors have different costs.

Active Learning Classification +1

Contextual Decision Processes with Low Bellman Rank are PAC-Learnable

no code implementations ICML 2017 Nan Jiang, Akshay Krishnamurthy, Alekh Agarwal, John Langford, Robert E. Schapire

Our first contribution is a complexity measure, the Bellman rank, that we show enables tractable learning of near-optimal behavior in these processes and is naturally small for many well-studied reinforcement learning settings.

Efficient Exploration

Off-policy evaluation for slate recommendation

1 code implementation NeurIPS 2017 Adith Swaminathan, Akshay Krishnamurthy, Alekh Agarwal, Miroslav Dudík, John Langford, Damien Jose, Imed Zitouni

This paper studies the evaluation of policies that recommend an ordered set of items (e.g., a ranking) based on some context---a common scenario in web search, ads, and recommendation.


Exploratory Gradient Boosting for Reinforcement Learning in Complex Domains

1 code implementation 14 Mar 2016 David Abel, Alekh Agarwal, Fernando Diaz, Akshay Krishnamurthy, Robert E. Schapire

We address both of these challenges with two complementary techniques: First, we develop a gradient-boosting style, non-parametric function approximator for learning on $Q$-function residuals.


PAC Reinforcement Learning with Rich Observations

no code implementations NeurIPS 2016 Akshay Krishnamurthy, Alekh Agarwal, John Langford

We prove that the algorithm learns near optimal behavior after a number of episodes that is polynomial in all relevant parameters, logarithmic in the number of policies, and independent of the size of the observation space.

Decision Making Multi-Armed Bandits

Efficient Algorithms for Adversarial Contextual Learning

no code implementations 8 Feb 2016 Vasilis Syrgkanis, Akshay Krishnamurthy, Robert E. Schapire

We provide the first oracle efficient sublinear regret algorithms for adversarial versions of the contextual bandit problem.

Combinatorial Optimization

Nonparametric von Mises Estimators for Entropies, Divergences and Mutual Informations

no code implementations NeurIPS 2015 Kirthevasan Kandasamy, Akshay Krishnamurthy, Barnabas Poczos, Larry Wasserman, James M. Robins

We propose and analyse estimators for statistical functionals of one or more distributions under nonparametric assumptions. Our estimators are derived from the von Mises expansion and are based on the theory of influence functions, which appear in the semiparametric statistics literature. We show that estimators based either on data-splitting or a leave-one-out technique enjoy fast rates of convergence and other favorable theoretical properties. We apply this framework to derive estimators for several popular information theoretic quantities, and via empirical evaluation, show the advantage of this approach over existing estimators.

Minimax Structured Normal Means Inference

no code implementations 25 Jun 2015 Akshay Krishnamurthy

We establish nearly matching upper and lower bounds on the minimax probability of error for any structured normal means problem, and we derive an optimality certificate for the maximum likelihood estimator, which can be applied to many instantiations.

Experimental Design

Extreme Compressive Sampling for Covariance Estimation

no code implementations 2 Jun 2015 Martin Azizyan, Akshay Krishnamurthy, Aarti Singh

This paper studies the problem of estimating the covariance of a collection of vectors using only highly compressed measurements of each vector.

Contextual Semibandits via Supervised Learning Oracles

1 code implementation NeurIPS 2016 Akshay Krishnamurthy, Alekh Agarwal, Miroslav Dudík

We study an online decision making problem where on each round a learner chooses a list of items based on some side information, receives a scalar feedback value for each individual item, and a reward that is linearly related to this feedback.

Decision Making Learning-To-Rank

Learning to Search Better Than Your Teacher

no code implementations 8 Feb 2015 Kai-Wei Chang, Akshay Krishnamurthy, Alekh Agarwal, Hal Daumé III, John Langford

Methods for learning to search for structured prediction typically imitate a reference policy, with existing theoretical guarantees demonstrating low regret compared to that reference.

Multi-Armed Bandits Structured Prediction

On Estimating $L_2^2$ Divergence

no code implementations 30 Oct 2014 Akshay Krishnamurthy, Kirthevasan Kandasamy, Barnabas Poczos, Larry Wasserman

We give a comprehensive theoretical characterization of a nonparametric estimator for the $L_2^2$ divergence between two continuous distributions.

On the Power of Adaptivity in Matrix Completion and Approximation

no code implementations 14 Jul 2014 Akshay Krishnamurthy, Aarti Singh

We show that adaptive sampling allows one to eliminate standard incoherence assumptions on the matrix row space that are necessary for passive sampling procedures.

Matrix Completion

Subspace Learning from Extremely Compressed Measurements

no code implementations 3 Apr 2014 Akshay Krishnamurthy, Martin Azizyan, Aarti Singh

Our theoretical results show that even a constant number of measurements per column suffices to approximate the principal subspace to arbitrary precision, provided that the number of vectors is large.

Nonparametric Estimation of Renyi Divergence and Friends

no code implementations 12 Feb 2014 Akshay Krishnamurthy, Kirthevasan Kandasamy, Barnabas Poczos, Larry Wasserman

We consider nonparametric estimation of $L_2$, Renyi-$\alpha$ and Tsallis-$\alpha$ divergences between continuous distributions.

Near-optimal Anomaly Detection in Graphs using Lovasz Extended Scan Statistic

no code implementations NeurIPS 2013 James Sharpnack, Akshay Krishnamurthy, Aarti Singh

The detection of anomalous activity in graphs is a statistical problem that arises in many applications, such as network surveillance, disease outbreak detection, and activity monitoring in social networks.

Anomaly Detection

Recovering Graph-Structured Activations using Adaptive Compressive Measurements

no code implementations 1 May 2013 Akshay Krishnamurthy, James Sharpnack, Aarti Singh

We study the localization of a cluster of activated vertices in a graph, from adaptively designed compressive measurements.

Low-Rank Matrix and Tensor Completion via Adaptive Sampling

no code implementations NeurIPS 2013 Akshay Krishnamurthy, Aarti Singh

In the absence of noise, we show that one can exactly recover a $n \times n$ matrix of rank $r$ from merely $\Omega(n r^{3/2}\log(r))$ matrix entries.

Noise Thresholds for Spectral Clustering

no code implementations NeurIPS 2011 Sivaraman Balakrishnan, Min Xu, Akshay Krishnamurthy, Aarti Singh

Although spectral clustering has enjoyed considerable empirical success in machine learning, its theoretical properties are not yet fully developed.
