no code implementations • 27 Oct 2023 • Artin Tajdini, Lalit Jain, Kevin Jamieson

The objective is to minimize the learner's regret over $T$ rounds with respect to a $(1-e^{-1})$-approximation of the maximum $f(S_*)$ with $|S_*| = k$, obtained through greedy maximization of $f$.
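The greedy benchmark mentioned in this excerpt can be sketched as follows. This is a minimal offline illustration with exact (noise-free) evaluations of $f$, not the paper's bandit algorithm; the function names and the small coverage instance are hypothetical and chosen only for the example.

```python
def greedy_maximize(f, ground_set, k):
    """Pick k items one at a time, each time adding the item with the
    largest marginal gain f(S + item) - f(S). For monotone submodular f
    this is the classical greedy rule behind the (1 - 1/e) guarantee."""
    chosen = set()
    for _ in range(k):
        base = f(frozenset(chosen))
        best_item, best_gain = None, float("-inf")
        # sorted() only makes tie-breaking deterministic; items must be sortable.
        for item in sorted(ground_set - chosen):
            gain = f(frozenset(chosen | {item})) - base
            if gain > best_gain:
                best_item, best_gain = item, gain
        if best_item is None:  # ground set exhausted before k picks
            break
        chosen.add(best_item)
    return chosen


def coverage(sets):
    """Return a set function counting the elements covered by a choice of
    indices (a standard monotone submodular example)."""
    def f(selection):
        covered = set()
        for index in selection:
            covered |= sets[index]
        return len(covered)
    return f


# Hypothetical instance: four candidate sets, choose k = 2 of them.
sets = {0: {1, 2, 3}, 1: {3, 4}, 2: {4, 5, 6, 7}, 3: {1}}
picked = greedy_maximize(coverage(sets), set(sets), 2)
picked_value = coverage(sets)(picked)
```

On this instance greedy first takes set 2 (gain 4) and then set 0 (gain 3), covering 7 elements.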

1 code implementation • 25 Oct 2023 • Arnab Maiti, Ross Boczar, Kevin Jamieson, Lillian J. Ratliff

We design a near-optimal algorithm whose sample complexity matches the lower bound, up to log factors.

no code implementations • 9 Oct 2023 • Zhaoqi Li, Kevin Jamieson, Lalit Jain

In this work, we pose a natural question: is there an algorithm that can explore optimally and only needs the same computational primitives as Thompson Sampling?

no code implementations • 23 Sep 2023 • Shuai Li, Azarakhsh Keipour, Kevin Jamieson, Nicolas Hudson, Sicong Zhao, Charles Swan, Kostas Bekris

Automating warehouse operations can reduce logistics overhead costs, ultimately driving down the final price for consumers, increasing the speed of delivery, and enhancing the resiliency to market fluctuations.

no code implementations • 27 Jul 2023 • Zhihan Xiong, Romain Camilleri, Maryam Fazel, Lalit Jain, Kevin Jamieson

For robust identification, it is well-known that if arms are chosen randomly and non-adaptively from a G-optimal design over $\mathcal{X}$ at each time then the error probability decreases as $\exp(-T\Delta^2_{(1)}/d)$, where $\Delta_{(1)} = \min_{x \neq x^*} (x^* - x)^\top \frac{1}{T}\sum_{t=1}^T \theta_t$.

1 code implementation • 22 Jun 2023 • Arnab Maiti, Kevin Jamieson, Lillian J. Ratliff

If the row player uses the EXP3 strategy, an algorithm known for obtaining $\sqrt{T}$ regret against an arbitrary sequence of rewards, it is immediate that the row player also achieves $\sqrt{T}$ regret relative to the Nash equilibrium in this game setting.
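For readers unfamiliar with EXP3, a textbook-style sketch is given below. It assumes rewards in $[0, 1]$ and a fixed exploration parameter `gamma`; both the parameter value and the toy reward function are assumptions made for illustration, and the variant analyzed in the paper may differ.

```python
import math
import random


def exp3_probs(weights, gamma):
    """Mix the normalized exponential weights with uniform exploration."""
    total = sum(weights)
    n = len(weights)
    return [(1.0 - gamma) * w / total + gamma / n for w in weights]


def run_exp3(n_arms, reward_fn, horizon, gamma=0.1, seed=0):
    """Run EXP3 against reward_fn(t, arm), which must return a value in [0, 1].

    Only the reward of the pulled arm is observed (bandit feedback)."""
    rng = random.Random(seed)
    weights = [1.0] * n_arms
    total_reward = 0.0
    for t in range(horizon):
        probs = exp3_probs(weights, gamma)
        arm = rng.choices(range(n_arms), weights=probs)[0]
        reward = reward_fn(t, arm)
        total_reward += reward
        # Importance-weighted estimate: unbiased for the pulled arm's reward.
        estimate = reward / probs[arm]
        weights[arm] *= math.exp(gamma * estimate / n_arms)
        # Rescale so the largest weight is 1; probabilities are unchanged.
        top = max(weights)
        weights = [w / top for w in weights]
    return exp3_probs(weights, gamma), total_reward


# Toy reward sequence (hypothetical): arm 0 always pays 1, the others pay 0.
final_probs, total_reward = run_exp3(
    3, lambda t, arm: 1.0 if arm == 0 else 0.0, horizon=2000
)
```

In a matrix game, `reward_fn` would return the entry selected by the row player's arm and the column player's action at round `t`.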

1 code implementation • 16 Jun 2023 • Jifan Zhang, Yifang Chen, Gregory Canal, Stephen Mussmann, Yinglun Zhu, Simon Shaolei Du, Kevin Jamieson, Robert D Nowak

Labeled data are critical to modern machine learning applications, but obtaining labels can be expensive.

no code implementations • 5 Jun 2023 • Yiping Wang, Yifang Chen, Kevin Jamieson, Simon S. Du

In addition to our sample complexity results, we also characterize the potential of our $\nu^1$-based strategy in sample-cost-sensitive settings.

no code implementations • 17 May 2023 • Shuai Li, Azarakhsh Keipour, Kevin Jamieson, Nicolas Hudson, Charles Swan, Kostas Bekris

This paper demonstrates a large-scale package manipulation from unstructured piles in Amazon Robotics' Robot Induction (Robin) fleet, which utilizes a pick success predictor trained on real production data.

no code implementations • 19 Mar 2023 • Arnab Maiti, Kevin Jamieson, Lillian J. Ratliff

We study the sample complexity of identifying an approximate equilibrium for two-player zero-sum $n\times 2$ matrix games.

no code implementations • 6 Jul 2022 • Andrew Wagenmaker, Kevin Jamieson

While much progress has been made in understanding the minimax sample complexity of reinforcement learning (RL) -- the complexity of learning on the "worst-case" instance -- such measures of complexity often do not capture the true difficulty of learning.

no code implementations • 5 Jul 2022 • Zhaoqi Li, Lillian Ratliff, Houssam Nassif, Kevin Jamieson, Lalit Jain

In the stochastic contextual bandit setting, regret-minimizing algorithms have been extensively researched, but their instance-minimizing best-arm identification counterparts remain seldom studied.

no code implementations • 22 Jun 2022 • Romain Camilleri, Andrew Wagenmaker, Jamie Morgenstern, Lalit Jain, Kevin Jamieson

To our knowledge, our results are the first on best-arm identification in linear bandits with safety constraints.

no code implementations • 2 Feb 2022 • Yifang Chen, Simon S. Du, Kevin Jamieson

To leverage the power of big data from source tasks and overcome the scarcity of the target task samples, representation learning based on multi-task pretraining has become a standard approach in many applications.

no code implementations • 26 Jan 2022 • Andrew Wagenmaker, Yifang Chen, Max Simchowitz, Simon S. Du, Kevin Jamieson

We first develop a computationally efficient algorithm for reward-free RL in a $d$-dimensional linear MDP with sample complexity scaling as $\widetilde{\mathcal{O}}(d^2 H^5/\epsilon^2)$.

no code implementations • 7 Dec 2021 • Andrew Wagenmaker, Yifang Chen, Max Simchowitz, Simon S. Du, Kevin Jamieson

Obtaining first-order regret bounds -- regret bounds scaling not as the worst-case but with some measure of the performance of the optimal policy on a given instance -- is a core question in sequential decision-making.

1 code implementation • 23 Nov 2021 • Zhenlin Wang, Andrew Wagenmaker, Kevin Jamieson

The best arm identification problem in the multi-armed bandit setting is an excellent model of many real-world decision-making problems, yet it fails to capture the fact that in the real world, safety constraints often must be met while learning.

no code implementations • NeurIPS 2021 • Julian Katz-Samuels, Blake Mason, Kevin Jamieson, Rob Nowak

We begin our investigation with the observation that agnostic algorithms cannot be minimax-optimal in the realizable setting.

no code implementations • 2 Nov 2021 • Blake Mason, Romain Camilleri, Subhojyoti Mukherjee, Kevin Jamieson, Robert Nowak, Lalit Jain

The threshold value $\alpha$ can either be explicit and provided a priori, or implicit and defined relative to the optimal function value, i.e., $\alpha = (1-\epsilon)f(x_\ast)$ for a given $\epsilon > 0$, where $f(x_\ast)$ is the maximal function value and is unknown.

no code implementations • NeurIPS 2021 • Romain Camilleri, Zhihan Xiong, Maryam Fazel, Lalit Jain, Kevin Jamieson

The main results of this work precisely characterize this trade-off between labeled samples and stopping time and provide an algorithm that nearly optimally achieves the minimal label complexity given a desired stopping time.

no code implementations • 5 Aug 2021 • Andrew Wagenmaker, Max Simchowitz, Kevin Jamieson

We show this is not possible -- there exists a fundamental tradeoff between achieving low regret and identifying an $\epsilon$-optimal policy at the instance-optimal rate.

no code implementations • NeurIPS 2021 • Yifang Chen, Simon S. Du, Kevin Jamieson

We conduct theoretical studies on streaming-based active learning for binary classification under unknown adversarial label corruptions.

no code implementations • 13 May 2021 • Julian Katz-Samuels, Jifan Zhang, Lalit Jain, Kevin Jamieson

We consider active learning for binary classification in the agnostic pool-based setting.

no code implementations • 12 May 2021 • Romain Camilleri, Julian Katz-Samuels, Kevin Jamieson

We also leverage our new approach in a new algorithm for kernelized bandits to obtain state of the art results for regret minimization and pure exploration.

no code implementations • 13 Feb 2021 • Yifang Chen, Simon S. Du, Kevin Jamieson

We study episodic reinforcement learning under unknown adversarial corruptions in both the rewards and the transition probabilities of the underlying system.

no code implementations • 10 Feb 2021 • Andrew Wagenmaker, Max Simchowitz, Kevin Jamieson

Along the way, we establish that certainty equivalence decision making is instance- and task-optimal, and obtain the first algorithm for the linear quadratic regulator problem which is instance-optimal.

no code implementations • 5 Nov 2020 • Ethan K. Gordon, Sumegh Roychowdhury, Tapomayukh Bhattacharjee, Kevin Jamieson, Siddhartha S. Srinivasa

Our key insight is that we can leverage the haptic context we collect during and after manipulation (i.e., "post hoc") to learn some of these properties and more quickly adapt our visual model to previously unseen food.

no code implementations • 1 Nov 2020 • Andrew Wagenmaker, Julian Katz-Samuels, Kevin Jamieson

In this paper we propose a novel experimental design-based algorithm to minimize regret in online stochastic linear and combinatorial bandits.

no code implementations • 29 Oct 2020 • Jifan Zhang, Lalit Jain, Kevin Jamieson

Unlike the design of traditional adaptive algorithms that rely on concentration of measure and careful analysis to justify the correctness and sample complexity of the procedure, our adaptive algorithm is learned via adversarial training over equivalence classes of problems derived from information theoretic lower bounds.

no code implementations • NeurIPS 2019 • Lalit Jain, Kevin Jamieson

In many scientific settings there is a need for adaptive experimental design to guide the process of identifying regions of the search space that contain as many true positives as possible subject to a low rate of false discoveries (i.e., false alarms).

no code implementations • NeurIPS 2020 • Julian Katz-Samuels, Lalit Jain, Zohar Karnin, Kevin Jamieson

This paper proposes near-optimal algorithms for the pure-exploration linear bandit problem in the fixed confidence and fixed budget settings.

1 code implementation • ICML 2020 • Jennifer Brennan, Ramya Korlakai Vinayak, Kevin Jamieson

We study the problem of estimating the distribution of effect sizes (the mean of the test statistic under the alternate hypothesis) in a multiple testing setting.

no code implementations • 2 Feb 2020 • Andrew Wagenmaker, Kevin Jamieson

We propose an algorithm to actively estimate the parameters of a linear dynamical system.

no code implementations • 17 Dec 2019 • Laurel Orr, Samuel Ainsworth, Walter Cai, Kevin Jamieson, Magda Balazinska, Dan Suciu

Recently, with the increase in the number of public data repositories, sample data has become easier to access.

1 code implementation • NeurIPS 2019 • Tanner Fiez, Lalit Jain, Kevin Jamieson, Lillian Ratliff

Such a transductive setting naturally arises when the set of measurement vectors is limited due to factors such as availability or cost.

no code implementations • 15 Jun 2019 • Julian Katz-Samuels, Kevin Jamieson

We consider two multi-armed bandit problems with $n$ arms: (i) given an $\epsilon > 0$, identify an arm with mean that is within $\epsilon$ of the largest mean and (ii) given a threshold $\mu_0$ and integer $k$, identify $k$ arms with means larger than $\mu_0$.

no code implementations • NeurIPS 2019 • Max Simchowitz, Kevin Jamieson

This paper establishes that optimistic algorithms attain gap-dependent and non-asymptotic logarithmic regret for episodic MDPs.

no code implementations • 29 Mar 2019 • Alexander Ratner, Dan Alistarh, Gustavo Alonso, David G. Andersen, Peter Bailis, Sarah Bird, Nicholas Carlini, Bryan Catanzaro, Jennifer Chayes, Eric Chung, Bill Dally, Jeff Dean, Inderjit S. Dhillon, Alexandros Dimakis, Pradeep Dubey, Charles Elkan, Grigori Fursin, Gregory R. Ganger, Lise Getoor, Phillip B. Gibbons, Garth A. Gibson, Joseph E. Gonzalez, Justin Gottschlich, Song Han, Kim Hazelwood, Furong Huang, Martin Jaggi, Kevin Jamieson, Michael I. Jordan, Gauri Joshi, Rania Khalaf, Jason Knight, Jakub Konečný, Tim Kraska, Arun Kumar, Anastasios Kyrillidis, Aparna Lakshmiratan, Jing Li, Samuel Madden, H. Brendan McMahan, Erik Meijer, Ioannis Mitliagkas, Rajat Monga, Derek Murray, Kunle Olukotun, Dimitris Papailiopoulos, Gennady Pekhimenko, Theodoros Rekatsinas, Afshin Rostamizadeh, Christopher Ré, Christopher De Sa, Hanie Sedghi, Siddhartha Sen, Virginia Smith, Alex Smola, Dawn Song, Evan Sparks, Ion Stoica, Vivienne Sze, Madeleine Udell, Joaquin Vanschoren, Shivaram Venkataraman, Rashmi Vinayak, Markus Weimer, Andrew Gordon Wilson, Eric Xing, Matei Zaharia, Ce Zhang, Ameet Talwalkar

Machine learning (ML) techniques are enjoying rapidly increasing adoption.

no code implementations • 12 Mar 2019 • Liam Li, Evan Sparks, Kevin Jamieson, Ameet Talwalkar

Hyperparameter tuning of multi-stage pipelines introduces a significant computational burden.

no code implementations • 15 Nov 2018 • Maryam Aziz, Kevin Jamieson, Javed Aslam

This paper considers a multi-armed bandit game where the number of arms is much larger than the maximum budget and is effectively infinite.

5 code implementations • ICLR 2018 • Liam Li, Kevin Jamieson, Afshin Rostamizadeh, Ekaterina Gonina, Moritz Hardt, Benjamin Recht, Ameet Talwalkar

Modern learning models are characterized by large hyperparameter spaces and long training times.

no code implementations • 6 Sep 2018 • Kevin Jamieson, Lalit Jain

We propose an adaptive sampling approach for multiple testing which aims to maximize statistical power while ensuring anytime false discovery control.

no code implementations • 14 Aug 2018 • Max Simchowitz, Kevin Jamieson, Jordan W. Suchow, Thomas L. Griffiths

In this paper, we introduce the first principled adaptive-sampling procedure for learning a convex function in the $L_\infty$ norm, a problem that arises often in the behavioral and social sciences.

no code implementations • ICML 2018 • Lalit Jain, Kevin Jamieson

In this paper, we model the problem of optimizing crowdfunding platforms, such as the non-profit Kiva or for-profit KickStarter, as a variant of the multi-armed bandit problem.

no code implementations • ICLR 2018 • Lisha Li, Kevin Jamieson, Afshin Rostamizadeh, Katya Gonina, Moritz Hardt, Benjamin Recht, Ameet Talwalkar

Modern machine learning models are characterized by large hyperparameter search spaces and prohibitively expensive training costs.

1 code implementation • NeurIPS 2017 • Fanny Yang, Aaditya Ramdas, Kevin Jamieson, Martin J. Wainwright

We propose an alternative framework to existing setups for controlling false alarms when multiple A/B tests are run over time.

no code implementations • ICLR 2018 • Jesse Dodge, Kevin Jamieson, Noah A. Smith

Driven by the need for parallelizable hyperparameter optimization methods, this paper studies open loop search methods: sequences that are predetermined and can be generated before a single configuration is evaluated.

no code implementations • 16 Feb 2017 • Max Simchowitz, Kevin Jamieson, Benjamin Recht

Moreover, our lower bounds zero in on the number of times each individual arm needs to be pulled, uncovering new phenomena which are drowned out in the aggregate sample complexity.

no code implementations • 4 Oct 2016 • Michael Laskey, Caleb Chuck, Jonathan Lee, Jeffrey Mahler, Sanjay Krishnan, Kevin Jamieson, Anca Dragan, Ken Goldberg

Although policies learned with RC sampling can be superior to HC sampling for standard learning models such as linear SVMs, policies learned with HC sampling may be comparable with highly-expressive learning models such as deep learning and hyper-parametric decision trees, which have little model error.

no code implementations • NeurIPS 2016 • Lalit Jain, Kevin Jamieson, Robert Nowak

First, we derive prediction error bounds for ordinal embedding with noise by exploiting the fact that the rank of a distance matrix of points in $\mathbb{R}^d$ is at most $d+2$.

no code implementations • 25 Mar 2016 • Kevin Jamieson, Daniel Haas, Ben Recht

As a result, these bounds have surprising implications both for solutions to the most biased coin problem and for anomaly detection when only partial information about the parameters is known.

17 code implementations • 21 Mar 2016 • Lisha Li, Kevin Jamieson, Giulia Desalvo, Afshin Rostamizadeh, Ameet Talwalkar

Performance of machine learning algorithms depends critically on identifying a good set of hyperparameters.

no code implementations • 9 Mar 2016 • Max Simchowitz, Kevin Jamieson, Benjamin Recht

This paper studies the Best-of-K Bandit game: At each time the player chooses a subset S among all N-choose-K possible options and observes reward max(X(i) : i in S) where X is a random vector drawn from a joint distribution.
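The reward structure described here can be made concrete with a small exact computation. The sketch below enumerates all N-choose-K subsets and evaluates the expected reward $\mathbb{E}[\max_{i \in S} X(i)]$ under a known joint distribution; the distribution itself is a hypothetical example, and in the bandit game it would be unknown to the player.

```python
from itertools import combinations


def expected_best_of_k(joint, subset):
    """Expected value of max(X[i] for i in subset), where joint is a list of
    (probability, outcome_vector) pairs describing the distribution of X."""
    return sum(prob * max(x[i] for i in subset) for prob, x in joint)


def best_subset(joint, n_arms, k):
    """Enumerate all N-choose-K subsets and return the one with the highest
    expected best-of-K reward, together with that value."""
    subsets = combinations(range(n_arms), k)
    best = max(subsets, key=lambda s: expected_best_of_k(joint, s))
    return best, expected_best_of_k(joint, best)


# Hypothetical joint distribution over N = 3 binary arms: arms 0 and 1 are
# perfectly correlated Bernoulli(0.6); arm 2 is an independent Bernoulli(0.5).
joint = [
    (0.3, (1, 1, 1)),
    (0.3, (1, 1, 0)),
    (0.2, (0, 0, 1)),
    (0.2, (0, 0, 0)),
]
subset, value = best_subset(joint, n_arms=3, k=2)
top_two_by_mean = expected_best_of_k(joint, (0, 1))
```

Here the two arms with the highest individual means, arms 0 and 1, give an expected reward of only 0.6 as a pair, while pairing either of them with arm 2 gives 0.8. This shows why dependence between arms matters in this game.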

1 code implementation • 27 Feb 2015 • Kevin Jamieson, Ameet Talwalkar

Motivated by the task of hyperparameter optimization, we introduce the non-stochastic best-arm identification problem.

no code implementations • 31 Jan 2015 • Kevin Jamieson, Sumeet Katariya, Atul Deshpande, Robert Nowak

We prove that in the absence of structural assumptions, the sample complexity of this problem is proportional to the sum of the inverse squared gaps between the Borda scores of each suboptimal arm and the best arm.
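As a rough illustration of the quantity named in this excerpt, the sketch below computes Borda scores from a pairwise preference matrix and then the sum of inverse squared gaps to the best arm. It assumes the common definition of the Borda score as the average probability of beating a uniformly chosen other arm; the matrix values are hypothetical, and constants and logarithmic factors from the paper's bound are omitted.

```python
def borda_scores(pref):
    """pref[i][j] is the probability that arm i beats arm j in a duel.
    The Borda score of arm i averages pref[i][j] over all j != i."""
    n = len(pref)
    return [
        sum(pref[i][j] for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]


def inverse_squared_gap_sum(pref):
    """Sum over suboptimal arms of 1 / (gap to the best Borda score)^2.
    Assumes a unique Borda winner, otherwise a gap would be zero."""
    scores = borda_scores(pref)
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    best = scores[best_index]
    return sum(
        1.0 / (best - s) ** 2
        for i, s in enumerate(scores)
        if i != best_index
    )


# Hypothetical 3-arm preference matrix (pref[i][j] + pref[j][i] = 1).
pref = [
    [0.5, 0.7, 0.9],
    [0.3, 0.5, 0.6],
    [0.1, 0.4, 0.5],
]
scores = borda_scores(pref)
complexity = inverse_squared_gap_sum(pref)
```

For this matrix the scores are approximately 0.8, 0.45 and 0.25, so the arm closest to the winner (gap 0.35) contributes most to the sum.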

no code implementations • 27 Dec 2013 • Kevin Jamieson, Matthew Malloy, Robert Nowak, Sébastien Bubeck

The paper proposes a novel upper confidence bound (UCB) procedure for identifying the arm with the largest mean in a multi-armed bandit game in the fixed confidence setting using a small number of total samples.

no code implementations • 17 Jun 2013 • Kevin Jamieson, Matthew Malloy, Robert Nowak, Sebastien Bubeck

Motivated by large-scale applications, we are especially interested in identifying situations where the total number of samples that are necessary and sufficient to find the best arm scale linearly with the number of arms.

Papers With Code is a free resource with all data licensed under CC-BY-SA.