
no code implementations • ICML 2020 • Giuseppe Vietri, Borja de Balle Pigem, Steven Wu, Akshay Krishnamurthy

Motivated by high-stakes decision-making domains like personalized medicine where user information is inherently sensitive, we design privacy-preserving exploration policies for episodic reinforcement learning (RL).

no code implementations • 1 Jun 2023 • Bingbin Liu, Jordan T. Ash, Surbhi Goel, Akshay Krishnamurthy, Cyril Zhang

We hypothesize that attention glitches account for (some of) the closed-domain hallucinations in natural LLMs.

1 code implementation • 5 Mar 2023 • Akanksha Saran, Safoora Yousefi, Akshay Krishnamurthy, John Langford, Jordan T. Ash

Active learning is perhaps most naturally posed as an online learning problem.

no code implementations • 28 Feb 2023 • Sham M. Kakade, Akshay Krishnamurthy, Gaurav Mahajan, Cyril Zhang

In this paper, we depart from this setup and consider an interactive access model, in which the algorithm can query for samples from the conditional distributions of the HMMs.

no code implementations • 27 Feb 2023 • Max Simchowitz, Anurag Ajay, Pulkit Agrawal, Akshay Krishnamurthy

We show that, when the class $\mathcal{F}$ is "simpler" than $\mathcal{G}$ (measured, e.g., in terms of its metric entropy), our predictor is more resilient to \emph{heterogeneous covariate shifts} in which the shift in $\mathbf{x}$ is much greater than that in $\mathbf{y}$.

no code implementations • 19 Oct 2022 • Bingbin Liu, Jordan T. Ash, Surbhi Goel, Akshay Krishnamurthy, Cyril Zhang

Algorithmic reasoning requires capabilities which are most naturally understood through recurrent models of computation, like the Turing machine.

1 code implementation • 13 Oct 2022 • Yuda Song, Yifei Zhou, Ayush Sekhari, J. Andrew Bagnell, Akshay Krishnamurthy, Wen Sun

We consider a hybrid reinforcement learning setting (Hybrid RL), in which an agent has access to an offline dataset and the ability to collect experience via real-world online interaction.
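The setting can be pictured as value-based learning on a replay buffer that is seeded with offline transitions and grows through online interaction. The sketch below is schematic only (not the paper's Hy-Q algorithm); the chain MDP and all names are illustrative:

```python
import random

def hybrid_q_learning(offline_data, env_step, n_states, n_actions,
                      episodes=200, horizon=10, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning whose replay buffer is seeded with an offline
    dataset and grows with online transitions -- a schematic of hybrid RL."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    buffer = list(offline_data)                 # offline transitions (s, a, r, s2)
    for _ in range(episodes):
        s = 0
        for _ in range(horizon):                # collect online experience
            if rng.random() < 0.1:              # epsilon-greedy exploration
                a = rng.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda a_: q[s][a_])
            r, s2 = env_step(s, a)
            buffer.append((s, a, r, s2))
            s = s2
        # fit on a mix of offline and online experience
        for s_, a_, r_, s2_ in rng.sample(buffer, min(64, len(buffer))):
            q[s_][a_] += alpha * (r_ + gamma * max(q[s2_]) - q[s_][a_])
    return q

# Toy 4-state chain: action 1 moves right, reward for being at the last state.
def env_step(s, a, n=4):
    s2 = min(s + 1, n - 1) if a == 1 else max(s - 1, 0)
    return float(s2 == n - 1), s2

offline = [(s, a, *env_step(s, a)) for s in range(4) for a in range(2)]
q = hybrid_q_learning(offline, env_step, n_states=4, n_actions=2)
```

The offline dataset here covers every state-action pair, which is what lets the learner bootstrap value estimates before its online policy has ever reached the rewarding state.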

no code implementations • 17 Jul 2022 • Alex Lamb, Riashat Islam, Yonathan Efroni, Aniket Didolkar, Dipendra Misra, Dylan Foster, Lekan Molu, Rajan Chari, Akshay Krishnamurthy, John Langford

In many sequential decision-making tasks, the agent is not able to model the full complexity of the world, which consists of multitudes of relevant and irrelevant information.

no code implementations • 21 Jun 2022 • Jinglin Chen, Aditya Modi, Akshay Krishnamurthy, Nan Jiang, Alekh Agarwal

We study reward-free reinforcement learning (RL) under general non-linear function approximation, and establish sample efficiency and hardness results under various standard structural assumptions.

no code implementations • 9 Jun 2022 • Yonathan Efroni, Dylan J. Foster, Dipendra Misra, Akshay Krishnamurthy, John Langford

In real-world reinforcement learning applications the learner's observation space is ubiquitously high-dimensional with both relevant and irrelevant information about the task at hand.

no code implementations • 8 Mar 2022 • Juan C. Perdomo, Akshay Krishnamurthy, Peter Bartlett, Sham Kakade

Offline policy evaluation is a fundamental statistical problem in reinforcement learning that involves estimating the value function of some decision-making policy given data collected by a potentially different policy.
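In the one-step (contextual bandit) special case, the canonical such estimator is inverse propensity scoring (IPS), which re-weights logged rewards by the ratio of target to logging action probabilities. A minimal sketch, assuming logged propensities are recorded:

```python
import numpy as np

def ips_value(rewards, logging_probs, target_probs):
    """Inverse propensity scoring: re-weight each logged reward by how much
    more (or less) likely the target policy was to take the logged action."""
    weights = target_probs / logging_probs
    return float(np.mean(weights * rewards))

# Toy check: uniform logging over 2 actions; the target policy always plays
# action 1, which always pays reward 1.
rng = np.random.default_rng(0)
n = 50_000
actions = rng.integers(0, 2, size=n)
rewards = (actions == 1).astype(float)
est = ips_value(rewards,
                logging_probs=np.full(n, 0.5),
                target_probs=(actions == 1).astype(float))
# est concentrates around 1.0, the true value of always playing action 1
```

The multi-step RL version of the problem compounds these importance weights over a trajectory, which is exactly why variance and coverage conditions become central.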

no code implementations • 28 Feb 2022 • Nikunj Saunshi, Jordan Ash, Surbhi Goel, Dipendra Misra, Cyril Zhang, Sanjeev Arora, Sham Kakade, Akshay Krishnamurthy

Contrastive learning is a popular form of self-supervised learning that encourages augmentations (views) of the same input to have more similar representations compared to augmentations of different inputs.
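A common concrete instantiation of this idea is the InfoNCE objective, sketched below on pre-computed embeddings (illustrative only; the paper analyzes the general framework rather than this exact loss):

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.1):
    """InfoNCE: each anchor embedding should be most similar to its own
    positive (an augmentation of the same input) among all positives in the
    batch. Rows are assumed L2-normalized."""
    logits = anchors @ positives.T / temperature         # (B, B) similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))           # diagonal = matched pairs

rng = np.random.default_rng(0)
z = rng.normal(size=(8, 4))
z /= np.linalg.norm(z, axis=1, keepdims=True)
loss_matched = info_nce_loss(z, z)                # views agree perfectly
loss_shuffled = info_nce_loss(z, z[::-1].copy())  # views mismatched
```

Matched views put the largest similarity on the diagonal and so achieve a lower loss than mismatched ones, which is the "augmentations of the same input should be closer" property in miniature.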

no code implementations • 8 Feb 2022 • Yonathan Efroni, Chi Jin, Akshay Krishnamurthy, Sobhan Miryoosefi

Real-world sequential decision making problems commonly involve partial observability, which requires the agent to maintain a memory of history in order to infer the latent states, plan and make good decisions.

no code implementations • 24 Nov 2021 • Aadirupa Saha, Akshay Krishnamurthy

We study the $K$-armed contextual dueling bandit problem, a sequential decision making setting in which the learner uses contextual information to make two decisions, but only observes \emph{preference-based feedback} suggesting that one decision was better than the other.

no code implementations • 21 Nov 2021 • Dylan J. Foster, Akshay Krishnamurthy, David Simchi-Levi, Yunzong Xu

This led Chen and Jiang (2019) to conjecture that concentrability (the most standard notion of coverage) and realizability (the weakest representation condition) alone are not sufficient for sample-efficient offline RL.

no code implementations • 8 Nov 2021 • Vidya Muthukumar, Akshay Krishnamurthy

In this paper, we introduce new algorithms that a) explore in a data-adaptive manner, and b) provide model selection guarantees of the form $\mathcal{O}(d^{\alpha} T^{1- \alpha})$ with no feature diversity conditions whatsoever, where $d$ denotes the dimension of the linear model and $T$ denotes the total number of rounds.

no code implementations • ICLR 2022 • Jordan T. Ash, Cyril Zhang, Surbhi Goel, Akshay Krishnamurthy, Sham Kakade

Intrinsic rewards play a central role in handling the exploration-exploitation trade-off when designing sequential decision-making algorithms, in both foundational theory and state-of-the-art deep reinforcement learning.

no code implementations • 17 Oct 2021 • Yonathan Efroni, Dipendra Misra, Akshay Krishnamurthy, Alekh Agarwal, John Langford

We initiate the formal study of latent state discovery in the presence of such exogenous noise sources by proposing a new model, the Exogenous Block MDP (EX-BMDP), for rich observation RL.

no code implementations • 12 Oct 2021 • Yonathan Efroni, Sham Kakade, Akshay Krishnamurthy, Cyril Zhang

However, in practice, we often encounter systems in which a large set of state variables evolve exogenously and independently of the control inputs; such systems are only partially controllable.

no code implementations • ICLR 2022 • Yonathan Efroni, Dipendra Misra, Akshay Krishnamurthy, Alekh Agarwal, John Langford

We initiate the formal study of latent state discovery in the presence of such exogenous noise sources by proposing a new model, the Exogenous Block MDP (EX-BMDP), for rich observation RL.

no code implementations • NeurIPS 2021 • Dylan J. Foster, Akshay Krishnamurthy

A recurring theme in statistical learning, online learning, and beyond is that faster convergence rates are possible for problems with low noise, often quantified by the performance of the best hypothesis; such results are known as first-order or small-loss guarantees.

no code implementations • NeurIPS 2021 • Max Simchowitz, Christopher Tosh, Akshay Krishnamurthy, Daniel Hsu, Thodoris Lykouris, Miroslav Dudík, Robert E. Schapire

We prove that the expected reward accrued by Thompson sampling (TS) with a misspecified prior differs by at most $\tilde{\mathcal{O}}(H^2 \epsilon)$ from TS with a well-specified prior, where $\epsilon$ is the total-variation distance between priors and $H$ is the learning horizon.
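For context, here is vanilla Thompson sampling in the Beta-Bernoulli bandit; the `(prior_a, prior_b)` prior is the kind of object whose misspecification the paper's bound controls (a toy sketch in the bandit special case, not the paper's episodic RL setting):

```python
import numpy as np

def thompson_sampling(true_means, horizon, prior_a=1.0, prior_b=1.0, seed=0):
    """Beta-Bernoulli Thompson sampling. The (prior_a, prior_b) prior is the
    quantity whose misspecification would be measured in total variation."""
    rng = np.random.default_rng(seed)
    k = len(true_means)
    a = np.full(k, prior_a)
    b = np.full(k, prior_b)
    total = 0.0
    for _ in range(horizon):
        arm = int(np.argmax(rng.beta(a, b)))       # one posterior draw per arm
        r = float(rng.random() < true_means[arm])  # Bernoulli reward
        a[arm] += r
        b[arm] += 1.0 - r
        total += r
    return total

reward = thompson_sampling([0.2, 0.8], horizon=2000)
# average reward approaches the best arm's mean of 0.8
```

Re-running with a badly skewed prior (e.g. `prior_a=50, prior_b=1` on the worse arm) is a quick way to see the prior-sensitivity phenomenon the paper quantifies.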

no code implementations • 18 Jun 2021 • Jordan T. Ash, Surbhi Goel, Akshay Krishnamurthy, Dipendra Misra

We focus on disambiguating the role of one of these parameters: the number of negative examples.

1 code implementation • NeurIPS 2021 • Jordan T. Ash, Surbhi Goel, Akshay Krishnamurthy, Sham Kakade

There is an increasing need for effective active learning algorithms that are compatible with deep neural networks.

no code implementations • 14 Feb 2021 • Aditya Modi, Jinglin Chen, Akshay Krishnamurthy, Nan Jiang, Alekh Agarwal

In this work, we present the first model-free representation learning algorithms for low rank MDPs.

no code implementations • NeurIPS 2020 • Zakaria Mhammedi, Dylan J. Foster, Max Simchowitz, Dipendra Misra, Wen Sun, Akshay Krishnamurthy, Alexander Rakhlin, John Langford

We introduce a new algorithm, RichID, which learns a near-optimal policy for the RichLQR with sample complexity scaling only with the dimension of the latent state space and the capacity of the decoder function class.

no code implementations • 18 Sep 2020 • Giuseppe Vietri, Borja Balle, Akshay Krishnamurthy, Zhiwei Steven Wu

Motivated by high-stakes decision-making domains like personalized medicine where user information is inherently sensitive, we design privacy-preserving exploration policies for episodic reinforcement learning (RL).

no code implementations • 24 Aug 2020 • Christopher Tosh, Akshay Krishnamurthy, Daniel Hsu

Self-supervised learning is an empirically successful approach to unsupervised learning based on creating artificial supervised learning problems.

1 code implementation • NeurIPS 2020 • Sham Kakade, Akshay Krishnamurthy, Kendall Lowrey, Motoya Ohnishi, Wen Sun

This work studies the problem of sequential control in an unknown, nonlinear dynamical system, where we model the underlying system dynamics as an unknown function in a known Reproducing Kernel Hilbert Space.

no code implementations • NeurIPS 2020 • Chi Jin, Sham M. Kakade, Akshay Krishnamurthy, Qinghua Liu

Partial observability is a common challenge in many reinforcement learning applications, which requires an agent to maintain memory, infer latent states, and integrate this past information into exploration.

no code implementations • 19 Jun 2020 • Dylan J. Foster, Akshay Krishnamurthy, Haipeng Luo

In statistical learning, algorithms for model selection allow the learner to adapt to the complexity of the best hypothesis class in a sequence.

no code implementations • NeurIPS 2020 • Alekh Agarwal, Sham Kakade, Akshay Krishnamurthy, Wen Sun

In order to deal with the curse of dimensionality in reinforcement learning (RL), it is common practice to make parametric assumptions where values or policies are functions of some low dimensional feature space.

no code implementations • NeurIPS 2020 • Tongyi Cao, Akshay Krishnamurthy

We study reinforcement learning in continuous state and action spaces endowed with a metric.

1 code implementation • NeurIPS 2020 • Maryam Majzoubi, Chicheng Zhang, Rajan Chari, Akshay Krishnamurthy, John Langford, Aleksandrs Slivkins

We create a computationally tractable algorithm for contextual bandits with continuous actions having unknown structure.

no code implementations • 4 Mar 2020 • Christopher Tosh, Akshay Krishnamurthy, Daniel Hsu

Contrastive learning is an approach to representation learning that utilizes naturally occurring similar and dissimilar pairs of data points to find useful embeddings of data.

no code implementations • 26 Feb 2020 • Akshay Krishnamurthy, Thodoris Lykouris, Chara Podimata, Robert Schapire

We initiate the study of contextual search when some of the agents can behave in ways inconsistent with the underlying response model.

1 code implementation • ICML 2020 • Yi Su, Pavithra Srinath, Akshay Krishnamurthy

We develop a generic data-driven method for estimator selection in off-policy policy evaluation settings.

no code implementations • ICML 2020 • Chi Jin, Akshay Krishnamurthy, Max Simchowitz, Tiancheng Yu

We give an efficient algorithm that conducts $\tilde{\mathcal{O}}(S^2A\mathrm{poly}(H)/\epsilon^2)$ episodes of exploration and returns $\epsilon$-suboptimal policies for an arbitrary number of reward functions.

no code implementations • 19 Jan 2020 • Akshay Krishnamurthy, Arya Mazumdar, Andrew Mcgregor, Soumyabrata Pal

Our second approach uses algebraic and combinatorial tools and applies to binomial mixtures with shared trial parameter $N$ and differing success parameters, as well as to mixtures of geometric distributions.

1 code implementation • 31 Dec 2019 • Nicholas Monath, Ari Kobren, Akshay Krishnamurthy, Michael Glass, Andrew McCallum

We introduce Grinch, a new algorithm for large-scale, non-greedy hierarchical clustering with general linkage functions that compute arbitrary similarity between two point sets.

no code implementations • ICLR 2021 • Yining Wang, Ruosong Wang, Simon S. Du, Akshay Krishnamurthy

We design a new provably efficient algorithm for episodic reinforcement learning with generalized linear function approximation.

no code implementations • NeurIPS 2019 • Akshay Krishnamurthy, Arya Mazumdar, Andrew Mcgregor, Soumyabrata Pal

Our techniques are quite different from those in the previous work: for the noiseless case, we rely on a property of sparse polynomials, and for the noisy case, we provide new connections to learning Gaussian mixtures and use ideas from the theory of

no code implementations • ICML 2020 • Dipendra Misra, Mikael Henaff, Akshay Krishnamurthy, John Langford

We present an algorithm, HOMER, for exploration and reinforcement learning in rich observation environments that are summarizable by an unknown latent state space.

no code implementations • 30 Oct 2019 • Akshay Krishnamurthy, Arya Mazumdar, Andrew Mcgregor, Soumyabrata Pal

In the problem of learning mixtures of linear regressions, the goal is to learn a collection of signal vectors from a sequence of (possibly noisy) linear measurements, where each measurement is evaluated on an unknown signal drawn uniformly from this collection.

no code implementations • 9 Oct 2019 • Xi Chen, Akshay Krishnamurthy, Yining Wang

We establish both upper and lower bounds on the regret, and show that our policy is optimal up to logarithmic factor in $T$ when the assortment capacity is constant.

no code implementations • ICML 2020 • Yi Su, Maria Dimakopoulou, Akshay Krishnamurthy, Miroslav Dudík

We propose a new framework for designing estimators for off-policy evaluation in contextual bandits.

4 code implementations • ICLR 2020 • Jordan T. Ash, Chicheng Zhang, Akshay Krishnamurthy, John Langford, Alekh Agarwal

We design a new algorithm for batch active learning with deep neural network models.
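The resulting algorithm, BADGE, scores unlabeled points by hallucinated last-layer gradient embeddings and picks a batch that is both high-magnitude (uncertain) and diverse via k-means++-style seeding. A simplified sketch of that selection step (the random embeddings and the deterministic first pick are illustrative):

```python
import numpy as np

def kmeanspp_select(embeddings, batch_size, seed=0):
    """k-means++-style seeding: start from the largest-norm point, then
    repeatedly sample new points with probability proportional to squared
    distance from the current selection."""
    rng = np.random.default_rng(seed)
    chosen = [int(np.argmax(np.linalg.norm(embeddings, axis=1)))]
    d2 = np.sum((embeddings - embeddings[chosen[0]]) ** 2, axis=1)
    while len(chosen) < batch_size:
        idx = int(rng.choice(len(embeddings), p=d2 / d2.sum()))
        chosen.append(idx)
        d2 = np.minimum(d2, np.sum((embeddings - embeddings[idx]) ** 2, axis=1))
    return chosen

rng = np.random.default_rng(1)
grad_embeddings = rng.normal(size=(100, 16))  # stand-ins for gradient embeddings
batch = kmeanspp_select(grad_embeddings, batch_size=10)
```

Already-selected points sit at squared distance zero from the selection, so they are never re-drawn and the returned batch contains distinct indices.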

1 code implementation • NeurIPS 2019 • Dylan J. Foster, Akshay Krishnamurthy, Haipeng Luo

We work in the stochastic realizable setting with a sequence of nested linear policy classes of dimension $d_1 < d_2 < \ldots$, where the $m^\star$-th class contains the optimal policy, and we design an algorithm that achieves $\tilde{O}(T^{2/3}d^{1/3}_{m^\star})$ regret with no prior knowledge of the optimal dimension $d_{m^\star}$.

no code implementations • 5 Feb 2019 • Akshay Krishnamurthy, John Langford, Aleksandrs Slivkins, Chicheng Zhang

We study contextual bandit learning with an abstract policy class and continuous action space.

1 code implementation • 25 Jan 2019 • Simon S. Du, Akshay Krishnamurthy, Nan Jiang, Alekh Agarwal, Miroslav Dudík, John Langford

We study the exploration problem in episodic MDPs with rich observations generated from a small number of latent states.

no code implementations • 21 Nov 2018 • Wen Sun, Nan Jiang, Akshay Krishnamurthy, Alekh Agarwal, John Langford

We study the sample complexity of model-based reinforcement learning (henceforth RL) in general contextual decision processes that require strategic exploration to find a near-optimal policy.

no code implementations • NeurIPS 2018 • Dylan J. Foster, Akshay Krishnamurthy

We use surrogate losses to obtain several new regret bounds and new algorithms for contextual bandit learning.

1 code implementation • 25 May 2018 • Kirthevasan Kandasamy, Willie Neiswanger, Reed Zhang, Akshay Krishnamurthy, Jeff Schneider, Barnabas Poczos

We design a new myopic strategy for a wide class of sequential design of experiment (DOE) problems, where the goal is to collect data in order to fulfil a certain problem-specific goal.

2 code implementations • ICML 2018 • Akshay Krishnamurthy, Zhiwei Steven Wu, Vasilis Syrgkanis

This paper studies semiparametric contextual bandits, a generalization of the linear stochastic bandit problem where the reward for an action is modeled as a linear function of known action features confounded by a non-linear action-independent term.

no code implementations • NeurIPS 2018 • Christoph Dann, Nan Jiang, Akshay Krishnamurthy, Alekh Agarwal, John Langford, Robert E. Schapire

We study the computational tractability of PAC reinforcement learning with rich observations.

no code implementations • 21 Nov 2017 • Tongyi Cao, Akshay Krishnamurthy

We design new algorithms for the combinatorial pure exploration problem in the multi-arm bandit framework.

7 code implementations • ICLR 2018 • Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Luke Vilnis, Ishan Durugkar, Akshay Krishnamurthy, Alex Smola, Andrew McCallum

Knowledge bases (KBs), both automatically and manually constructed, are often incomplete---many valid facts can be inferred from the KB by synthesizing existing information.

1 code implementation • 25 May 2017 • Kirthevasan Kandasamy, Akshay Krishnamurthy, Jeff Schneider, Barnabas Poczos

We design and analyse variations of the classical Thompson sampling (TS) procedure for Bayesian optimisation (BO) in settings where function evaluations are expensive, but can be performed in parallel.
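In the simplest multi-armed analogue (the paper's setting is Gaussian-process Bayesian optimisation), parallel TS gives each of several workers an independent posterior draw, so evaluations proceed concurrently without coordination:

```python
import numpy as np

def parallel_ts_round(successes, failures, n_workers, rng):
    """One round of parallel Thompson sampling for Bernoulli arms: each of
    n_workers independently samples arm means from the Beta posterior and
    plays its own argmax -- no coordination between workers is needed."""
    draws = rng.beta(successes + 1.0, failures + 1.0,
                     size=(n_workers, len(successes)))
    return draws.argmax(axis=1)

rng = np.random.default_rng(0)
true_means = np.array([0.3, 0.5, 0.9])
successes = np.zeros(3)
failures = np.zeros(3)
for _ in range(300):
    for arm in parallel_ts_round(successes, failures, n_workers=4, rng=rng):
        if rng.random() < true_means[arm]:
            successes[arm] += 1.0
        else:
            failures[arm] += 1.0
# the best arm (index 2) receives the vast majority of the 1200 pulls
```

Because each worker's draw is an independent posterior sample, the workers naturally disagree early (spreading exploration) and agree late (concentrating on the best arm).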

2 code implementations • 6 Apr 2017 • Ari Kobren, Nicholas Monath, Akshay Krishnamurthy, Andrew McCallum

Many modern clustering methods scale well to a large number of data items, N, but not to a large number of clusters, K. This paper introduces PERCH, a new non-greedy algorithm for online hierarchical clustering that scales to both massive N and K--a problem setting we term extreme clustering.
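A toy sketch of the online-insertion idea (greedy nearest-leaf descent followed by a split; PERCH additionally performs rotations and balance maintenance that this sketch omits):

```python
import numpy as np

class Node:
    def __init__(self, point=None):
        self.point = point      # data point at a leaf; None for internal nodes
        self.children = []

def nearest_dist(node, x):
    """Distance from x to the closest leaf under `node`."""
    if not node.children:
        return float(np.linalg.norm(node.point - x))
    return min(nearest_dist(c, x) for c in node.children)

def insert(root, x):
    """Descend toward the subtree containing the closest leaf, then split
    that leaf into an internal node over {old point, x}."""
    node = root
    while node.children:
        node = min(node.children, key=lambda c: nearest_dist(c, x))
    old = Node(node.point)
    node.point = None
    node.children = [old, Node(np.asarray(x, dtype=float))]

points = [np.array(p, dtype=float) for p in ([0, 0], [0.1, 0], [5, 5], [5, 5.1])]
root = Node(points[0])
for p in points[1:]:
    insert(root, p)
# the two points near (5, 5) end up as siblings under one internal node
```

Each insertion touches only one root-to-leaf path, which is what makes this family of methods online and (with balanced trees) scalable in both N and K.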

no code implementations • ICML 2017 • Akshay Krishnamurthy, Alekh Agarwal, Tzu-Kuo Huang, Hal Daume III, John Langford

We design an active learning algorithm for cost-sensitive multiclass classification: problems where different errors have different costs.

no code implementations • ICML 2017 • Nan Jiang, Akshay Krishnamurthy, Alekh Agarwal, John Langford, Robert E. Schapire

Our first contribution is a complexity measure, the Bellman rank, that we show enables tractable learning of near-optimal behavior in these processes and is naturally small for many well-studied reinforcement learning settings.

no code implementations • NeurIPS 2016 • Vasilis Syrgkanis, Haipeng Luo, Akshay Krishnamurthy, Robert E. Schapire

We give an oracle-based algorithm for the adversarial contextual bandit problem, where either contexts are drawn i.i.d.

1 code implementation • NeurIPS 2017 • Adith Swaminathan, Akshay Krishnamurthy, Alekh Agarwal, Miroslav Dudík, John Langford, Damien Jose, Imed Zitouni

This paper studies the evaluation of policies that recommend an ordered set of items (e.g., a ranking) based on some context---a common scenario in web search, ads, and recommendation.

1 code implementation • 14 Mar 2016 • David Abel, Alekh Agarwal, Fernando Diaz, Akshay Krishnamurthy, Robert E. Schapire

We address both of these challenges with two complementary techniques: First, we develop a gradient-boosting style, non-parametric function approximator for learning on $Q$-function residuals.

no code implementations • NeurIPS 2016 • Akshay Krishnamurthy, Alekh Agarwal, John Langford

We prove that the algorithm learns near optimal behavior after a number of episodes that is polynomial in all relevant parameters, logarithmic in the number of policies, and independent of the size of the observation space.

no code implementations • 8 Feb 2016 • Vasilis Syrgkanis, Akshay Krishnamurthy, Robert E. Schapire

We provide the first oracle efficient sublinear regret algorithms for adversarial versions of the contextual bandit problem.

no code implementations • NeurIPS 2015 • Kirthevasan Kandasamy, Akshay Krishnamurthy, Barnabas Poczos, Larry Wasserman, James M. Robins

We propose and analyse estimators for statistical functionals of one or more distributions under nonparametric assumptions. Our estimators are derived from the von Mises expansion and are based on the theory of influence functions, which appear in the semiparametric statistics literature. We show that estimators based either on data-splitting or a leave-one-out technique enjoy fast rates of convergence and other favorable theoretical properties. We apply this framework to derive estimators for several popular information-theoretic quantities, and via empirical evaluation, show the advantage of this approach over existing estimators.

no code implementations • 25 Jun 2015 • Akshay Krishnamurthy

We establish nearly matching upper and lower bounds on the minimax probability of error for any structured normal means problem, and we derive an optimality certificate for the maximum likelihood estimator, which can be applied to many instantiations.

no code implementations • 2 Jun 2015 • Martin Azizyan, Akshay Krishnamurthy, Aarti Singh

This paper studies the problem of estimating the covariance of a collection of vectors using only highly compressed measurements of each vector.

1 code implementation • NeurIPS 2016 • Akshay Krishnamurthy, Alekh Agarwal, Miroslav Dudik

We study an online decision making problem where on each round a learner chooses a list of items based on some side information, receives a scalar feedback value for each individual item, and a reward that is linearly related to this feedback.

no code implementations • 8 Feb 2015 • Kai-Wei Chang, Akshay Krishnamurthy, Alekh Agarwal, Hal Daumé III, John Langford

Methods for learning to search for structured prediction typically imitate a reference policy, with existing theoretical guarantees demonstrating low regret compared to that reference.

2 code implementations • 17 Nov 2014 • Kirthevasan Kandasamy, Akshay Krishnamurthy, Barnabas Poczos, Larry Wasserman, James M. Robins

We propose and analyze estimators for statistical functionals of one or more distributions under nonparametric assumptions.

no code implementations • 30 Oct 2014 • Akshay Krishnamurthy, Kirthevasan Kandasamy, Barnabas Poczos, Larry Wasserman

We give a comprehensive theoretical characterization of a nonparametric estimator for the $L_2^2$ divergence between two continuous distributions.
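A crude plug-in version of the idea in one dimension, substituting kernel density estimates into the functional $\int (p - q)^2$ (the paper's estimator adds bias corrections and the theoretical analysis that this sketch omits):

```python
import numpy as np

def kde(samples, grid, bandwidth=0.3):
    """Gaussian kernel density estimate evaluated on a grid of points."""
    diffs = (grid[:, None] - samples[None, :]) / bandwidth
    return np.exp(-0.5 * diffs ** 2).mean(axis=1) / (bandwidth * np.sqrt(2 * np.pi))

def l2_squared_divergence(xs, ys, grid):
    """Plug-in estimate of the integral of (p - q)^2: substitute KDEs for
    the two densities and integrate with a Riemann sum on the grid."""
    p_hat, q_hat = kde(xs, grid), kde(ys, grid)
    return float(np.sum((p_hat - q_hat) ** 2) * (grid[1] - grid[0]))

rng = np.random.default_rng(0)
grid = np.linspace(-6.0, 6.0, 400)
same = l2_squared_divergence(rng.normal(0, 1, 2000), rng.normal(0, 1, 2000), grid)
far = l2_squared_divergence(rng.normal(-2, 1, 2000), rng.normal(2, 1, 2000), grid)
# `same` is near zero, while `far` is bounded well away from zero
```

The plug-in estimator inherits the bias of the density estimates, which is precisely what the corrected estimators in this line of work are designed to remove.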

no code implementations • 14 Jul 2014 • Akshay Krishnamurthy, Aarti Singh

We show that adaptive sampling allows one to eliminate standard incoherence assumptions on the matrix row space that are necessary for passive sampling procedures.

no code implementations • 3 Apr 2014 • Akshay Krishnamurthy, Martin Azizyan, Aarti Singh

Our theoretical results show that even a constant number of measurements per column suffices to approximate the principal subspace to arbitrary precision, provided that the number of vectors is large.

no code implementations • 12 Feb 2014 • Akshay Krishnamurthy, Kirthevasan Kandasamy, Barnabas Poczos, Larry Wasserman

We consider nonparametric estimation of $L_2$, Renyi-$\alpha$ and Tsallis-$\alpha$ divergences between continuous distributions.

no code implementations • NeurIPS 2013 • James Sharpnack, Akshay Krishnamurthy, Aarti Singh

The detection of anomalous activity in graphs is a statistical problem that arises in many applications, such as network surveillance, disease outbreak detection, and activity monitoring in social networks.

no code implementations • 1 May 2013 • Akshay Krishnamurthy, James Sharpnack, Aarti Singh

We study the localization of a cluster of activated vertices in a graph, from adaptively designed compressive measurements.

no code implementations • NeurIPS 2013 • Akshay Krishnamurthy, Aarti Singh

In the absence of noise, we show that one can exactly recover a $n \times n$ matrix of rank $r$ from merely $\Omega(n r^{3/2}\log(r))$ matrix entries.

no code implementations • NeurIPS 2011 • Sivaraman Balakrishnan, Min Xu, Akshay Krishnamurthy, Aarti Singh

Although spectral clustering has enjoyed considerable empirical success in machine learning, its theoretical properties are not yet fully developed.

Papers With Code is a free resource with all data licensed under CC-BY-SA.