Search Results for author: Kevin Jamieson

Found 57 papers, 10 papers with code

Minimax Optimal Submodular Optimization with Bandit Feedback

no code implementations 27 Oct 2023 Artin Tajdini, Lalit Jain, Kevin Jamieson

The objective is to minimize the learner's regret over $T$ rounds with respect to the $(1-e^{-1})$-approximation of the maximum $f(S_*)$ with $|S_*| = k$, obtained through greedy maximization of $f$.

Optimal Exploration is no harder than Thompson Sampling

no code implementations 9 Oct 2023 Zhaoqi Li, Kevin Jamieson, Lalit Jain

In this work, we pose a natural question: is there an algorithm that can explore optimally and only needs the same computational primitives as Thompson Sampling?

Thompson Sampling

Pick Planning Strategies for Large-Scale Package Manipulation

no code implementations 23 Sep 2023 Shuai Li, Azarakhsh Keipour, Kevin Jamieson, Nicolas Hudson, Sicong Zhao, Charles Swan, Kostas Bekris

Automating warehouse operations can reduce logistics overhead costs, ultimately driving down the final price for consumers, increasing the speed of delivery, and enhancing the resiliency to market fluctuations.

A/B Testing and Best-arm Identification for Linear Bandits with Robustness to Non-stationarity

no code implementations 27 Jul 2023 Zhihan Xiong, Romain Camilleri, Maryam Fazel, Lalit Jain, Kevin Jamieson

For robust identification, it is well-known that if arms are chosen randomly and non-adaptively from a G-optimal design over $\mathcal{X}$ at each time then the error probability decreases as $\exp(-T\Delta^2_{(1)}/d)$, where $\Delta_{(1)} = \min_{x \neq x^*} (x^* - x)^\top \frac{1}{T}\sum_{t=1}^T \theta_t$.

Logarithmic Regret for Matrix Games against an Adversary with Noisy Bandit Feedback

1 code implementation 22 Jun 2023 Arnab Maiti, Kevin Jamieson, Lillian J. Ratliff

If the row player uses the EXP3 strategy, an algorithm known for obtaining $\sqrt{T}$ regret against an arbitrary sequence of rewards, it is immediate that the row player also achieves $\sqrt{T}$ regret relative to the Nash equilibrium in this game setting.

Improved Active Multi-Task Representation Learning via Lasso

no code implementations 5 Jun 2023 Yiping Wang, Yifang Chen, Kevin Jamieson, Simon S. Du

In addition to our sample complexity results, we also characterize the potential of our $\nu^1$-based strategy in sample-cost-sensitive settings.

Representation Learning

Demonstrating Large-Scale Package Manipulation via Learned Metrics of Pick Success

no code implementations 17 May 2023 Shuai Li, Azarakhsh Keipour, Kevin Jamieson, Nicolas Hudson, Charles Swan, Kostas Bekris

This paper demonstrates large-scale package manipulation from unstructured piles in Amazon Robotics' Robot Induction (Robin) fleet, which utilizes a pick success predictor trained on real production data.

Collision Avoidance

Instance-dependent Sample Complexity Bounds for Zero-sum Matrix Games

no code implementations 19 Mar 2023 Arnab Maiti, Kevin Jamieson, Lillian J. Ratliff

We study the sample complexity of identifying an approximate equilibrium for two-player zero-sum $n\times 2$ matrix games.

Instance-Dependent Near-Optimal Policy Identification in Linear MDPs via Online Experiment Design

no code implementations 6 Jul 2022 Andrew Wagenmaker, Kevin Jamieson

While much progress has been made in understanding the minimax sample complexity of reinforcement learning (RL) -- the complexity of learning on the "worst-case" instance -- such measures of complexity often do not capture the true difficulty of learning.

Reinforcement Learning (RL)

Instance-optimal PAC Algorithms for Contextual Bandits

no code implementations 5 Jul 2022 Zhaoqi Li, Lillian Ratliff, Houssam Nassif, Kevin Jamieson, Lalit Jain

In the stochastic contextual bandit setting, regret-minimizing algorithms have been extensively researched, but their instance-minimizing best-arm identification counterparts remain seldom studied.

Multi-Armed Bandits

Active Learning with Safety Constraints

no code implementations 22 Jun 2022 Romain Camilleri, Andrew Wagenmaker, Jamie Morgenstern, Lalit Jain, Kevin Jamieson

To our knowledge, our results are the first on best-arm identification in linear bandits with safety constraints.

Active Learning Decision Making +1

Active Multi-Task Representation Learning

no code implementations 2 Feb 2022 Yifang Chen, Simon S. Du, Kevin Jamieson

To leverage the power of big data from source tasks and overcome the scarcity of the target task samples, representation learning based on multi-task pretraining has become a standard approach in many applications.

Active Learning Multi-Task Learning +1

Reward-Free RL is No Harder Than Reward-Aware RL in Linear Markov Decision Processes

no code implementations 26 Jan 2022 Andrew Wagenmaker, Yifang Chen, Max Simchowitz, Simon S. Du, Kevin Jamieson

We first develop a computationally efficient algorithm for reward-free RL in a $d$-dimensional linear MDP with sample complexity scaling as $\widetilde{\mathcal{O}}(d^2 H^5/\epsilon^2)$.

Reinforcement Learning (RL)

First-Order Regret in Reinforcement Learning with Linear Function Approximation: A Robust Estimation Approach

no code implementations 7 Dec 2021 Andrew Wagenmaker, Yifang Chen, Max Simchowitz, Simon S. Du, Kevin Jamieson

Obtaining first-order regret bounds -- regret bounds scaling not as the worst-case but with some measure of the performance of the optimal policy on a given instance -- is a core question in sequential decision-making.

Decision Making reinforcement-learning +1

Best Arm Identification with Safety Constraints

1 code implementation 23 Nov 2021 Zhenlin Wang, Andrew Wagenmaker, Kevin Jamieson

The best arm identification problem in the multi-armed bandit setting is an excellent model of many real-world decision-making problems, yet it fails to capture the fact that in the real world, safety constraints often must be met while learning.

Decision Making

Practical, Provably-Correct Interactive Learning in the Realizable Setting: The Power of True Believers

no code implementations NeurIPS 2021 Julian Katz-Samuels, Blake Mason, Kevin Jamieson, Rob Nowak

We begin our investigation with the observation that agnostic algorithms \emph{cannot} be minimax-optimal in the realizable setting.

Nearly Optimal Algorithms for Level Set Estimation

no code implementations 2 Nov 2021 Blake Mason, Romain Camilleri, Subhojyoti Mukherjee, Kevin Jamieson, Robert Nowak, Lalit Jain

The threshold value $\alpha$ can either be \emph{explicit} and provided a priori, or \emph{implicit} and defined relative to the optimal function value, i.e., $\alpha = (1-\epsilon)f(x_\ast)$ for a given $\epsilon > 0$ where $f(x_\ast)$ is the maximal function value and is unknown.

Experimental Design

Selective Sampling for Online Best-arm Identification

no code implementations NeurIPS 2021 Romain Camilleri, Zhihan Xiong, Maryam Fazel, Lalit Jain, Kevin Jamieson

The main results of this work precisely characterize this trade-off between labeled samples and stopping time and provide an algorithm that nearly-optimally achieves the minimal label complexity given a desired stopping time.

Binary Classification

Beyond No Regret: Instance-Dependent PAC Reinforcement Learning

no code implementations 5 Aug 2021 Andrew Wagenmaker, Max Simchowitz, Kevin Jamieson

We show this is not possible -- there exists a fundamental tradeoff between achieving low regret and identifying an $\epsilon$-optimal policy at the instance-optimal rate.

reinforcement-learning Reinforcement Learning (RL)

Corruption Robust Active Learning

no code implementations NeurIPS 2021 Yifang Chen, Simon S. Du, Kevin Jamieson

We conduct theoretical studies on streaming-based active learning for binary classification under unknown adversarial label corruptions.

Active Learning Binary Classification

High-Dimensional Experimental Design and Kernel Bandits

no code implementations 12 May 2021 Romain Camilleri, Julian Katz-Samuels, Kevin Jamieson

We also leverage our new approach in a new algorithm for kernelized bandits to obtain state-of-the-art results for regret minimization and pure exploration.

Experimental Design Vocal Bursts Intensity Prediction

Improved Corruption Robust Algorithms for Episodic Reinforcement Learning

no code implementations 13 Feb 2021 Yifang Chen, Simon S. Du, Kevin Jamieson

We study episodic reinforcement learning under unknown adversarial corruptions in both the rewards and the transition probabilities of the underlying system.

reinforcement-learning Reinforcement Learning (RL)

Task-Optimal Exploration in Linear Dynamical Systems

no code implementations 10 Feb 2021 Andrew Wagenmaker, Max Simchowitz, Kevin Jamieson

Along the way, we establish that certainty equivalence decision making is instance- and task-optimal, and obtain the first algorithm for the linear quadratic regulator problem which is instance-optimal.

Decision Making

Leveraging Post Hoc Context for Faster Learning in Bandit Settings with Applications in Robot-Assisted Feeding

no code implementations 5 Nov 2020 Ethan K. Gordon, Sumegh Roychowdhury, Tapomayukh Bhattacharjee, Kevin Jamieson, Siddhartha S. Srinivasa

Our key insight is that we can leverage the haptic context we collect during and after manipulation (i.e., "post hoc") to learn some of these properties and more quickly adapt our visual model to previously unseen food.

Experimental Design for Regret Minimization in Linear Bandits

no code implementations 1 Nov 2020 Andrew Wagenmaker, Julian Katz-Samuels, Kevin Jamieson

In this paper we propose a novel experimental design-based algorithm to minimize regret in online stochastic linear and combinatorial bandits.

Experimental Design

Learning to Actively Learn: A Robust Approach

no code implementations 29 Oct 2020 Jifan Zhang, Lalit Jain, Kevin Jamieson

Unlike the design of traditional adaptive algorithms that rely on concentration of measure and careful analysis to justify the correctness and sample complexity of the procedure, our adaptive algorithm is learned via adversarial training over equivalence classes of problems derived from information theoretic lower bounds.

Active Learning Meta-Learning +1

A New Perspective on Pool-Based Active Classification and False-Discovery Control

no code implementations NeurIPS 2019 Lalit Jain, Kevin Jamieson

In many scientific settings there is a need for adaptive experimental design to guide the process of identifying regions of the search space that contain as many true positives as possible subject to a low rate of false discoveries (i.e., false alarms).

Active Learning Binary Classification +3

Estimating the number and effect sizes of non-null hypotheses

1 code implementation ICML 2020 Jennifer Brennan, Ramya Korlakai Vinayak, Kevin Jamieson

We study the problem of estimating the distribution of effect sizes (the mean of the test statistic under the alternate hypothesis) in a multiple testing setting.

Experimental Design Test

Active Learning for Identification of Linear Dynamical Systems

no code implementations 2 Feb 2020 Andrew Wagenmaker, Kevin Jamieson

We propose an algorithm to actively estimate the parameters of a linear dynamical system.

Active Learning

Mosaic: A Sample-Based Database System for Open World Query Processing

no code implementations 17 Dec 2019 Laurel Orr, Samuel Ainsworth, Walter Cai, Kevin Jamieson, Magda Balazinska, Dan Suciu

Recently, with the increase in the number of public data repositories, sample data has become easier to access.

Sequential Experimental Design for Transductive Linear Bandits

1 code implementation NeurIPS 2019 Tanner Fiez, Lalit Jain, Kevin Jamieson, Lillian Ratliff

Such a transductive setting naturally arises when the set of measurement vectors is limited due to factors such as availability or cost.

Drug Discovery Experimental Design +1

The True Sample Complexity of Identifying Good Arms

no code implementations 15 Jun 2019 Julian Katz-Samuels, Kevin Jamieson

We consider two multi-armed bandit problems with $n$ arms: (i) given an $\epsilon > 0$, identify an arm whose mean is within $\epsilon$ of the largest mean and (ii) given a threshold $\mu_0$ and integer $k$, identify $k$ arms with means larger than $\mu_0$.

Non-Asymptotic Gap-Dependent Regret Bounds for Tabular MDPs

no code implementations NeurIPS 2019 Max Simchowitz, Kevin Jamieson

This paper establishes that optimistic algorithms attain gap-dependent and non-asymptotic logarithmic regret for episodic MDPs.

Exploiting Reuse in Pipeline-Aware Hyperparameter Tuning

no code implementations 12 Mar 2019 Liam Li, Evan Sparks, Kevin Jamieson, Ameet Talwalkar

Hyperparameter tuning of multi-stage pipelines introduces a significant computational burden.

Pure-Exploration for Infinite-Armed Bandits with General Arm Reservoirs

no code implementations 15 Nov 2018 Maryam Aziz, Kevin Jamieson, Javed Aslam

This paper considers a multi-armed bandit game where the number of arms is much larger than the maximum budget and is effectively infinite.

A Bandit Approach to Multiple Testing with False Discovery Control

no code implementations 6 Sep 2018 Kevin Jamieson, Lalit Jain

We propose an adaptive sampling approach for multiple testing which aims to maximize statistical power while ensuring anytime false discovery control.

Drug Discovery

Adaptive Sampling for Convex Regression

no code implementations 14 Aug 2018 Max Simchowitz, Kevin Jamieson, Jordan W. Suchow, Thomas L. Griffiths

In this paper, we introduce the first principled adaptive-sampling procedure for learning a convex function in the $L_\infty$ norm, a problem that arises often in the behavioral and social sciences.


Firing Bandits: Optimizing Crowdfunding

no code implementations ICML 2018 Lalit Jain, Kevin Jamieson

In this paper, we model the problem of optimizing crowdfunding platforms, such as the non-profit Kiva or for-profit KickStarter, as a variant of the multi-armed bandit problem.

Massively Parallel Hyperparameter Tuning

no code implementations ICLR 2018 Lisha Li, Kevin Jamieson, Afshin Rostamizadeh, Katya Gonina, Moritz Hardt, Benjamin Recht, Ameet Talwalkar

Modern machine learning models are characterized by large hyperparameter search spaces and prohibitively expensive training costs.

A framework for Multi-A(rmed)/B(andit) testing with online FDR control

1 code implementation NeurIPS 2017 Fanny Yang, Aaditya Ramdas, Kevin Jamieson, Martin J. Wainwright

We propose an alternative framework to existing setups for controlling false alarms when multiple A/B tests are run over time.

Test valid

Open Loop Hyperparameter Optimization and Determinantal Point Processes

no code implementations ICLR 2018 Jesse Dodge, Kevin Jamieson, Noah A. Smith

Driven by the need for parallelizable hyperparameter optimization methods, this paper studies \emph{open loop} search methods: sequences that are predetermined and can be generated before a single configuration is evaluated.

Hyperparameter Optimization Point Processes

The Simulator: Understanding Adaptive Sampling in the Moderate-Confidence Regime

no code implementations 16 Feb 2017 Max Simchowitz, Kevin Jamieson, Benjamin Recht

Moreover, our lower bounds zero in on the number of times each \emph{individual} arm needs to be pulled, uncovering new phenomena which are drowned out in the aggregate sample complexity.

Comparing Human-Centric and Robot-Centric Sampling for Robot Deep Learning from Demonstrations

no code implementations 4 Oct 2016 Michael Laskey, Caleb Chuck, Jonathan Lee, Jeffrey Mahler, Sanjay Krishnan, Kevin Jamieson, Anca Dragan, Ken Goldberg

Although policies learned with RC sampling can be superior to HC sampling for standard learning models such as linear SVMs, policies learned with HC sampling may be comparable with highly-expressive learning models such as deep learning and hyper-parametric decision trees, which have little model error.

Finite Sample Prediction and Recovery Bounds for Ordinal Embedding

no code implementations NeurIPS 2016 Lalit Jain, Kevin Jamieson, Robert Nowak

First, we derive prediction error bounds for ordinal embedding with noise by exploiting the fact that the rank of a distance matrix of points in $\mathbb{R}^d$ is at most $d+2$.

On the Detection of Mixture Distributions with applications to the Most Biased Coin Problem

no code implementations 25 Mar 2016 Kevin Jamieson, Daniel Haas, Ben Recht

As a result, these bounds have surprising implications both for solutions to the most biased coin problem and for anomaly detection when only partial information about the parameters is known.

Anomaly Detection

Best-of-K Bandits

no code implementations 9 Mar 2016 Max Simchowitz, Kevin Jamieson, Benjamin Recht

This paper studies the Best-of-K Bandit game: at each time step the player chooses a subset S among all N-choose-K possible options and observes reward max(X(i) : i in S), where X is a random vector drawn from a joint distribution.

Non-stochastic Best Arm Identification and Hyperparameter Optimization

1 code implementation 27 Feb 2015 Kevin Jamieson, Ameet Talwalkar

Motivated by the task of hyperparameter optimization, we introduce the non-stochastic best-arm identification problem.

Hyperparameter Optimization Test

Sparse Dueling Bandits

no code implementations 31 Jan 2015 Kevin Jamieson, Sumeet Katariya, Atul Deshpande, Robert Nowak

We prove that in the absence of structural assumptions, the sample complexity of this problem is proportional to the sum of the inverse squared gaps between the Borda scores of each suboptimal arm and the best arm.

lil' UCB : An Optimal Exploration Algorithm for Multi-Armed Bandits

no code implementations 27 Dec 2013 Kevin Jamieson, Matthew Malloy, Robert Nowak, Sébastien Bubeck

The paper proposes a novel upper confidence bound (UCB) procedure for identifying the arm with the largest mean in a multi-armed bandit game in the fixed confidence setting using a small number of total samples.

Multi-Armed Bandits

On Finding the Largest Mean Among Many

no code implementations 17 Jun 2013 Kevin Jamieson, Matthew Malloy, Robert Nowak, Sebastien Bubeck

Motivated by large-scale applications, we are especially interested in identifying situations where the total number of samples that is necessary and sufficient to find the best arm scales linearly with the number of arms.

Multi-Armed Bandits
