Search Results for author: Andrew Wagenmaker

Found 13 papers, 1 paper with code

ASID: Active Exploration for System Identification in Robotic Manipulation

no code implementations · 18 Apr 2024 · Marius Memmel, Andrew Wagenmaker, Chuning Zhu, Patrick Yin, Dieter Fox, Abhishek Gupta

In this work, we propose a learning system that can leverage a small amount of real-world data to autonomously refine a simulation model and then plan an accurate control strategy that can be deployed in the real world.
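In miniature, "refine a simulation model from a little real data, then plan against it" can be sketched as fitting a simulator parameter by grid search. Everything below (the 1-D decay dynamics, the friction coefficient) is an invented toy for illustration, not the paper's system:

```python
import numpy as np

def refine_sim_param(real_states, real_next, dt=0.1, candidates=None):
    """Choose the simulator friction coefficient that best reproduces a
    small batch of real transitions (toy 1-D dynamics; grid search
    stands in for the paper's learned refinement)."""
    if candidates is None:
        candidates = np.linspace(0.0, 1.0, 101)

    def sim_step(x, friction):
        # Toy simulator: exponential decay under friction.
        return x - dt * friction * x

    errors = [np.mean((sim_step(real_states, c) - real_next) ** 2)
              for c in candidates]
    return float(candidates[int(np.argmin(errors))])
```

With a handful of noisy transitions generated at friction 0.3, the grid search recovers a value near 0.3; a controller would then be planned against the refined simulator rather than the real system.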

Fair Active Learning in Low-Data Regimes

no code implementations · 13 Dec 2023 · Romain Camilleri, Andrew Wagenmaker, Jamie Morgenstern, Lalit Jain, Kevin Jamieson

In this work, we address the challenges of reducing bias and improving accuracy in data-scarce environments, where the cost of collecting labeled data prohibits the use of large, labeled datasets.

Active Learning · Fairness

Instance-Optimality in Interactive Decision Making: Toward a Non-Asymptotic Theory

no code implementations · 24 Apr 2023 · Andrew Wagenmaker, Dylan J. Foster

We consider the development of adaptive, instance-dependent algorithms for interactive decision making (bandits, reinforcement learning, and beyond) that, rather than only performing well in the worst case, adapt to favorable properties of real-world instances for improved performance.

Decision Making · reinforcement-learning

Leveraging Offline Data in Online Reinforcement Learning

no code implementations · 9 Nov 2022 · Andrew Wagenmaker, Aldo Pacchiano

Practical scenarios often motivate an intermediate setting: if we have some set of offline data and, in addition, may also interact with the environment, how can we best use the offline data to minimize the number of online interactions necessary to learn an $\epsilon$-optimal policy?

Offline RL · reinforcement-learning · +1

Instance-Dependent Near-Optimal Policy Identification in Linear MDPs via Online Experiment Design

no code implementations · 6 Jul 2022 · Andrew Wagenmaker, Kevin Jamieson

While much progress has been made in understanding the minimax sample complexity of reinforcement learning (RL) -- the complexity of learning on the "worst-case" instance -- such measures of complexity often do not capture the true difficulty of learning.

Reinforcement Learning (RL)

Active Learning with Safety Constraints

no code implementations · 22 Jun 2022 · Romain Camilleri, Andrew Wagenmaker, Jamie Morgenstern, Lalit Jain, Kevin Jamieson

To our knowledge, our results are the first on best-arm identification in linear bandits with safety constraints.

Active Learning · Decision Making · +1

Reward-Free RL is No Harder Than Reward-Aware RL in Linear Markov Decision Processes

no code implementations · 26 Jan 2022 · Andrew Wagenmaker, Yifang Chen, Max Simchowitz, Simon S. Du, Kevin Jamieson

We first develop a computationally efficient algorithm for reward-free RL in a $d$-dimensional linear MDP with sample complexity scaling as $\widetilde{\mathcal{O}}(d^2 H^5/\epsilon^2)$.
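To get a feel for how the stated $\widetilde{\mathcal{O}}(d^2 H^5/\epsilon^2)$ rate scales, here is its leading-order term evaluated numerically (constants and logarithmic factors dropped; the example numbers are illustrative, not from the paper):

```python
def reward_free_sample_complexity(d: int, H: int, eps: float) -> float:
    """Leading-order term of the O~(d^2 H^5 / eps^2) bound,
    with constants and log factors dropped."""
    return d ** 2 * H ** 5 / eps ** 2

# A 10-dimensional linear MDP with horizon 20 at accuracy 0.1:
print(f"{reward_free_sample_complexity(10, 20, 0.1):.2e}")  # ~3.2e10 episodes
```

Halving the target accuracy $\epsilon$ quadruples the requirement, while doubling the horizon multiplies it by 32, so the $H^5$ factor dominates for long-horizon problems.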

Reinforcement Learning (RL)

First-Order Regret in Reinforcement Learning with Linear Function Approximation: A Robust Estimation Approach

no code implementations · 7 Dec 2021 · Andrew Wagenmaker, Yifang Chen, Max Simchowitz, Simon S. Du, Kevin Jamieson

Obtaining first-order regret bounds -- regret bounds scaling not as the worst-case but with some measure of the performance of the optimal policy on a given instance -- is a core question in sequential decision-making.

Decision Making · reinforcement-learning · +1

Best Arm Identification with Safety Constraints

1 code implementation · 23 Nov 2021 · Zhenlin Wang, Andrew Wagenmaker, Kevin Jamieson

The best arm identification problem in the multi-armed bandit setting is an excellent model of many real-world decision-making problems, yet it fails to capture the fact that in the real-world, safety constraints often must be met while learning.
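As a baseline for comparison, classical (unconstrained) best-arm identification by successive elimination looks as follows. This is a generic sketch, not the paper's safety-constrained algorithm, which must additionally keep exploration within a safe set:

```python
import numpy as np

def successive_elimination(means, pulls_per_round=100, rounds=20,
                           delta=0.05, seed=None):
    """Identify the best arm of a Gaussian bandit by repeatedly pulling
    all surviving arms and eliminating those whose upper confidence
    bound falls below the leader's lower confidence bound."""
    rng = np.random.default_rng(seed)
    means = np.asarray(means, dtype=float)
    k = len(means)
    active = list(range(k))
    counts = np.zeros(k)
    sums = np.zeros(k)
    for _ in range(rounds):
        if len(active) == 1:
            break
        for a in active:
            sums[a] += rng.normal(means[a], 1.0, pulls_per_round).sum()
            counts[a] += pulls_per_round
        n = counts[active[0]]  # all active arms share the same pull count
        est = sums[np.array(active)] / n
        # Hoeffding-style confidence radius, identical across active arms
        radius = np.sqrt(2.0 * np.log(4.0 * k * n ** 2 / delta) / n)
        best = est.max()
        active = [a for a, m in zip(active, est)
                  if m + radius >= best - radius]
    return max(active, key=lambda a: sums[a] / counts[a])
```

On arms with means (0.1, 0.5, 0.9) this routine identifies arm 2; a safety-constrained variant would also have to certify each pull against the constraint before playing it, which is the regime the paper studies.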

Decision Making

Beyond No Regret: Instance-Dependent PAC Reinforcement Learning

no code implementations · 5 Aug 2021 · Andrew Wagenmaker, Max Simchowitz, Kevin Jamieson

We show this is not possible -- there exists a fundamental tradeoff between achieving low regret and identifying an $\epsilon$-optimal policy at the instance-optimal rate.

reinforcement-learning · Reinforcement Learning (RL)

Task-Optimal Exploration in Linear Dynamical Systems

no code implementations · 10 Feb 2021 · Andrew Wagenmaker, Max Simchowitz, Kevin Jamieson

Along the way, we establish that certainty equivalence decision making is instance- and task-optimal, and obtain the first algorithm for the linear quadratic regulator problem which is instance-optimal.

Decision Making

Experimental Design for Regret Minimization in Linear Bandits

no code implementations · 1 Nov 2020 · Andrew Wagenmaker, Julian Katz-Samuels, Kevin Jamieson

In this paper we propose a novel experimental design-based algorithm to minimize regret in online stochastic linear and combinatorial bandits.
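The experimental-design ingredient in this line of work is typically a (near-)optimal design over the arm set. A standard Frank-Wolfe computation of the G-optimal design, shown below, is a generic subroutine of this kind, not the paper's full regret-minimization algorithm:

```python
import numpy as np

def g_optimal_design(X, iters=1000):
    """Frank-Wolfe iterations for the G-optimal design over the rows of X.
    By the Kiefer-Wolfowitz theorem, at the optimum the worst-case
    normalized variance max_a ||x_a||^2_{A(w)^{-1}} equals d."""
    K, d = X.shape
    w = np.full(K, 1.0 / K)
    for _ in range(iters):
        A = X.T @ (w[:, None] * X)                # information matrix A(w)
        Ainv = np.linalg.inv(A)
        g = np.einsum('ij,jk,ik->i', X, Ainv, X)  # ||x_a||^2_{A^{-1}} per arm
        a = int(np.argmax(g))
        if g[a] <= d + 1e-9:                      # already optimal
            break
        gamma = (g[a] - d) / (d * (g[a] - 1.0))   # exact line search
        w = (1.0 - gamma) * w
        w[a] += gamma
    return w
```

For the arm set {e1, e2, (0.5, 0.5)} the resulting design concentrates on the two basis vectors, and the worst-case normalized variance approaches d = 2.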

Experimental Design

Active Learning for Identification of Linear Dynamical Systems

no code implementations · 2 Feb 2020 · Andrew Wagenmaker, Kevin Jamieson

We propose an algorithm to actively estimate the parameters of a linear dynamical system.
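The passive baseline for this problem is ordinary least squares on a recorded trajectory; the paper's contribution is choosing the inputs actively so that this estimate converges faster. A least-squares identification sketch (the system matrices in the usage below are made up for illustration):

```python
import numpy as np

def estimate_dynamics(xs, us):
    """OLS estimate of (A, B) in x_{t+1} = A x_t + B u_t + w_t, from a
    state trajectory xs of shape (T+1, n) and inputs us of shape (T, m)."""
    X_next = xs[1:]                  # regression targets, shape (T, n)
    Z = np.hstack([xs[:-1], us])     # regressors [x_t, u_t], shape (T, n+m)
    theta, *_ = np.linalg.lstsq(Z, X_next, rcond=None)
    n = xs.shape[1]
    return theta[:n].T, theta[n:].T  # (A_hat, B_hat)
```

Rolling out a few hundred steps of a stable two-state system under white-noise inputs recovers A and B to within a few hundredths; active input design instead shapes the inputs to excite the directions in which the current estimate is weakest.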

Active Learning
