Search Results for author: Kevin Jamieson

Found 44 papers, 7 papers with code

Active Multi-Task Representation Learning

no code implementations 2 Feb 2022 Yifang Chen, Simon S. Du, Kevin Jamieson

To leverage the power of big data from source tasks and overcome the scarcity of the target task samples, representation learning based on multi-task pretraining has become a standard approach in many applications.

Active Learning Multi-Task Learning +1

First-Order Regret in Reinforcement Learning with Linear Function Approximation: A Robust Estimation Approach

no code implementations 7 Dec 2021 Andrew Wagenmaker, Yifang Chen, Max Simchowitz, Simon S. Du, Kevin Jamieson

Obtaining first-order regret bounds -- regret bounds scaling not as the worst-case but with some measure of the performance of the optimal policy on a given instance -- is a core question in sequential decision-making.

Decision Making reinforcement-learning

Best Arm Identification with Safety Constraints

1 code implementation 23 Nov 2021 Zhenlin Wang, Andrew Wagenmaker, Kevin Jamieson

The best arm identification problem in the multi-armed bandit setting is an excellent model of many real-world decision-making problems, yet it fails to capture the fact that in the real world, safety constraints often must be met while learning.

Decision Making

Practical, Provably-Correct Interactive Learning in the Realizable Setting: The Power of True Believers

no code implementations NeurIPS 2021 Julian Katz-Samuels, Blake Mason, Kevin Jamieson, Rob Nowak

We begin our investigation with the observation that agnostic algorithms \emph{cannot} be minimax-optimal in the realizable setting.

Nearly Optimal Algorithms for Level Set Estimation

no code implementations 2 Nov 2021 Blake Mason, Romain Camilleri, Subhojyoti Mukherjee, Kevin Jamieson, Robert Nowak, Lalit Jain

The threshold value $\alpha$ can either be \emph{explicit} and provided a priori, or \emph{implicit} and defined relative to the optimal function value, i.e., $\alpha = (1-\epsilon)f(x_\ast)$ for a given $\epsilon > 0$ where $f(x_\ast)$ is the maximal function value and is unknown.

Experimental Design

Selective Sampling for Online Best-arm Identification

no code implementations NeurIPS 2021 Romain Camilleri, Zhihan Xiong, Maryam Fazel, Lalit Jain, Kevin Jamieson

The main results of this work precisely characterize this trade-off between labeled samples and stopping time and provide an algorithm that nearly-optimally achieves the minimal label complexity given a desired stopping time.

Beyond No Regret: Instance-Dependent PAC Reinforcement Learning

no code implementations 5 Aug 2021 Andrew Wagenmaker, Max Simchowitz, Kevin Jamieson

We show that this is not possible -- there exists a fundamental tradeoff between achieving low regret and identifying an $\epsilon$-optimal policy at the instance-optimal rate.

reinforcement-learning

Corruption Robust Active Learning

no code implementations NeurIPS 2021 Yifang Chen, Simon S. Du, Kevin Jamieson

We conduct theoretical studies on streaming-based active learning for binary classification under unknown adversarial label corruptions.

Active Learning

High-Dimensional Experimental Design and Kernel Bandits

no code implementations 12 May 2021 Romain Camilleri, Julian Katz-Samuels, Kevin Jamieson

We also leverage our new approach in a new algorithm for kernelized bandits to obtain state of the art results for regret minimization and pure exploration.

Experimental Design

Improved Corruption Robust Algorithms for Episodic Reinforcement Learning

no code implementations 13 Feb 2021 Yifang Chen, Simon S. Du, Kevin Jamieson

We study episodic reinforcement learning under unknown adversarial corruptions in both the rewards and the transition probabilities of the underlying system.

reinforcement-learning

Task-Optimal Exploration in Linear Dynamical Systems

no code implementations 10 Feb 2021 Andrew Wagenmaker, Max Simchowitz, Kevin Jamieson

Along the way, we establish that certainty equivalence decision making is instance- and task-optimal, and obtain the first algorithm for the linear quadratic regulator problem which is instance-optimal.

Decision Making

Leveraging Post Hoc Context for Faster Learning in Bandit Settings with Applications in Robot-Assisted Feeding

no code implementations 5 Nov 2020 Ethan K. Gordon, Sumegh Roychowdhury, Tapomayukh Bhattacharjee, Kevin Jamieson, Siddhartha S. Srinivasa

Our key insight is that we can leverage the haptic context we collect during and after manipulation (i.e., "post hoc") to learn some of these properties and more quickly adapt our visual model to previously unseen food.

Experimental Design for Regret Minimization in Linear Bandits

no code implementations 1 Nov 2020 Andrew Wagenmaker, Julian Katz-Samuels, Kevin Jamieson

In this paper we propose a novel experimental design-based algorithm to minimize regret in online stochastic linear and combinatorial bandits.

Experimental Design

Learning to Actively Learn: A Robust Approach

no code implementations 29 Oct 2020 Jifan Zhang, Lalit Jain, Kevin Jamieson

Unlike the design of traditional adaptive algorithms that rely on concentration of measure and careful analysis to justify the correctness and sample complexity of the procedure, our adaptive algorithm is learned via adversarial training over equivalence classes of problems derived from information theoretic lower bounds.

Active Learning Meta-Learning +1

A New Perspective on Pool-Based Active Classification and False-Discovery Control

no code implementations NeurIPS 2019 Lalit Jain, Kevin Jamieson

In many scientific settings there is a need for adaptive experimental design to guide the process of identifying regions of the search space that contain as many true positives as possible subject to a low rate of false discoveries (i.e., false alarms).

Active Learning Classification +2

Estimating the number and effect sizes of non-null hypotheses

1 code implementation ICML 2020 Jennifer Brennan, Ramya Korlakai Vinayak, Kevin Jamieson

We study the problem of estimating the distribution of effect sizes (the mean of the test statistic under the alternate hypothesis) in a multiple testing setting.

Experimental Design

Active Learning for Identification of Linear Dynamical Systems

no code implementations 2 Feb 2020 Andrew Wagenmaker, Kevin Jamieson

We propose an algorithm to actively estimate the parameters of a linear dynamical system.

Active Learning

Mosaic: A Sample-Based Database System for Open World Query Processing

no code implementations 17 Dec 2019 Laurel Orr, Samuel Ainsworth, Walter Cai, Kevin Jamieson, Magda Balazinska, Dan Suciu

Recently, with the increase in the number of public data repositories, sample data has become easier to access.

Sequential Experimental Design for Transductive Linear Bandits

1 code implementation NeurIPS 2019 Tanner Fiez, Lalit Jain, Kevin Jamieson, Lillian Ratliff

Such a transductive setting naturally arises when the set of measurement vectors is limited due to factors such as availability or cost.

Drug Discovery Experimental Design +1

The True Sample Complexity of Identifying Good Arms

no code implementations 15 Jun 2019 Julian Katz-Samuels, Kevin Jamieson

We consider two multi-armed bandit problems with $n$ arms: (i) given an $\epsilon > 0$, identify an arm with mean that is within $\epsilon$ of the largest mean and (ii) given a threshold $\mu_0$ and integer $k$, identify $k$ arms with means larger than $\mu_0$.

Non-Asymptotic Gap-Dependent Regret Bounds for Tabular MDPs

no code implementations NeurIPS 2019 Max Simchowitz, Kevin Jamieson

This paper establishes that optimistic algorithms attain gap-dependent and non-asymptotic logarithmic regret for episodic MDPs.

Exploiting Reuse in Pipeline-Aware Hyperparameter Tuning

no code implementations 12 Mar 2019 Liam Li, Evan Sparks, Kevin Jamieson, Ameet Talwalkar

Hyperparameter tuning of multi-stage pipelines introduces a significant computational burden.

Pure-Exploration for Infinite-Armed Bandits with General Arm Reservoirs

no code implementations 15 Nov 2018 Maryam Aziz, Kevin Jamieson, Javed Aslam

This paper considers a multi-armed bandit game where the number of arms is much larger than the maximum budget and is effectively infinite.

A Bandit Approach to Multiple Testing with False Discovery Control

no code implementations 6 Sep 2018 Kevin Jamieson, Lalit Jain

We propose an adaptive sampling approach for multiple testing which aims to maximize statistical power while ensuring anytime false discovery control.

Drug Discovery

Adaptive Sampling for Convex Regression

no code implementations 14 Aug 2018 Max Simchowitz, Kevin Jamieson, Jordan W. Suchow, Thomas L. Griffiths

In this paper, we introduce the first principled adaptive-sampling procedure for learning a convex function in the $L_\infty$ norm, a problem that arises often in the behavioral and social sciences.

Firing Bandits: Optimizing Crowdfunding

no code implementations ICML 2018 Lalit Jain, Kevin Jamieson

In this paper, we model the problem of optimizing crowdfunding platforms, such as the non-profit Kiva or for-profit KickStarter, as a variant of the multi-armed bandit problem.

Massively Parallel Hyperparameter Tuning

no code implementations ICLR 2018 Lisha Li, Kevin Jamieson, Afshin Rostamizadeh, Katya Gonina, Moritz Hardt, Benjamin Recht, Ameet Talwalkar

Modern machine learning models are characterized by large hyperparameter search spaces and prohibitively expensive training costs.

A framework for Multi-A(rmed)/B(andit) testing with online FDR control

1 code implementation NeurIPS 2017 Fanny Yang, Aaditya Ramdas, Kevin Jamieson, Martin J. Wainwright

We propose an alternative framework to existing setups for controlling false alarms when multiple A/B tests are run over time.

Open Loop Hyperparameter Optimization and Determinantal Point Processes

no code implementations ICLR 2018 Jesse Dodge, Kevin Jamieson, Noah A. Smith

Driven by the need for parallelizable hyperparameter optimization methods, this paper studies \emph{open loop} search methods: sequences that are predetermined and can be generated before a single configuration is evaluated.

Hyperparameter Optimization Point Processes

The Simulator: Understanding Adaptive Sampling in the Moderate-Confidence Regime

no code implementations 16 Feb 2017 Max Simchowitz, Kevin Jamieson, Benjamin Recht

Moreover, our lower bounds zero in on the number of times each \emph{individual} arm needs to be pulled, uncovering new phenomena which are drowned out in the aggregate sample complexity.

Comparing Human-Centric and Robot-Centric Sampling for Robot Deep Learning from Demonstrations

no code implementations 4 Oct 2016 Michael Laskey, Caleb Chuck, Jonathan Lee, Jeffrey Mahler, Sanjay Krishnan, Kevin Jamieson, Anca Dragan, Ken Goldberg

Although policies learned with RC sampling can be superior to HC sampling for standard learning models such as linear SVMs, policies learned with HC sampling may be comparable with highly-expressive learning models such as deep learning and hyper-parametric decision trees, which have little model error.

Finite Sample Prediction and Recovery Bounds for Ordinal Embedding

no code implementations NeurIPS 2016 Lalit Jain, Kevin Jamieson, Robert Nowak

First, we derive prediction error bounds for ordinal embedding with noise by exploiting the fact that the rank of a distance matrix of points in $\mathbb{R}^d$ is at most $d+2$.

On the Detection of Mixture Distributions with applications to the Most Biased Coin Problem

no code implementations 25 Mar 2016 Kevin Jamieson, Daniel Haas, Ben Recht

As a result, these bounds have surprising implications both for solutions to the most biased coin problem and for anomaly detection when only partial information about the parameters is known.

Anomaly Detection

Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization

13 code implementations 21 Mar 2016 Lisha Li, Kevin Jamieson, Giulia Desalvo, Afshin Rostamizadeh, Ameet Talwalkar

Performance of machine learning algorithms depends critically on identifying a good set of hyperparameters.

Hyperparameter Optimization

Best-of-K Bandits

no code implementations 9 Mar 2016 Max Simchowitz, Kevin Jamieson, Benjamin Recht

This paper studies the Best-of-K Bandit game: At each time the player chooses a subset S among all N-choose-K possible options and observes reward max(X(i) : i in S) where X is a random vector drawn from a joint distribution.

Non-stochastic Best Arm Identification and Hyperparameter Optimization

1 code implementation 27 Feb 2015 Kevin Jamieson, Ameet Talwalkar

Motivated by the task of hyperparameter optimization, we introduce the non-stochastic best-arm identification problem.

Hyperparameter Optimization

Sparse Dueling Bandits

no code implementations 31 Jan 2015 Kevin Jamieson, Sumeet Katariya, Atul Deshpande, Robert Nowak

We prove that in the absence of structural assumptions, the sample complexity of this problem is proportional to the sum of the inverse squared gaps between the Borda scores of each suboptimal arm and the best arm.

lil' UCB : An Optimal Exploration Algorithm for Multi-Armed Bandits

no code implementations 27 Dec 2013 Kevin Jamieson, Matthew Malloy, Robert Nowak, Sébastien Bubeck

The paper proposes a novel upper confidence bound (UCB) procedure for identifying the arm with the largest mean in a multi-armed bandit game in the fixed confidence setting using a small number of total samples.

Multi-Armed Bandits

On Finding the Largest Mean Among Many

no code implementations 17 Jun 2013 Kevin Jamieson, Matthew Malloy, Robert Nowak, Sebastien Bubeck

Motivated by large-scale applications, we are especially interested in identifying situations where the total number of samples that are necessary and sufficient to find the best arm scales linearly with the number of arms.

Multi-Armed Bandits
