Search Results for author: Max Simchowitz

Found 34 papers, 4 papers with code

Logarithmic Regret for Online Control with Adversarial Noise

no code implementations ICML 2020 Dylan Foster, Max Simchowitz

We consider the problem of online control in a known linear dynamical system subject to adversarial noise.

Online Control of Unknown Time-Varying Dynamical Systems

no code implementations NeurIPS 2021 Edgar Minasyan, Paula Gradu, Max Simchowitz, Elad Hazan

On the positive side, we give an efficient algorithm that attains a sublinear regret bound against the class of Disturbance Response policies up to the aforementioned system variability term.

Stabilizing Dynamical Systems via Policy Gradient Methods

no code implementations NeurIPS 2021 Juan C. Perdomo, Jack Umenberger, Max Simchowitz

Stabilizing an unknown control system is one of the most fundamental problems in control systems engineering.

Policy Gradient Methods

Beyond No Regret: Instance-Dependent PAC Reinforcement Learning

no code implementations 5 Aug 2021 Andrew Wagenmaker, Max Simchowitz, Kevin Jamieson

We show that this is not possible -- there exists a fundamental tradeoff between achieving low regret and identifying an $\epsilon$-optimal policy at the instance-optimal rate.

Bayesian decision-making under misspecified priors with applications to meta-learning

no code implementations NeurIPS 2021 Max Simchowitz, Christopher Tosh, Akshay Krishnamurthy, Daniel Hsu, Thodoris Lykouris, Miroslav Dudík, Robert E. Schapire

We prove that the expected reward accrued by Thompson sampling (TS) with a misspecified prior differs by at most $\tilde{\mathcal{O}}(H^2 \epsilon)$ from TS with a well specified prior, where $\epsilon$ is the total-variation distance between priors and $H$ is the learning horizon.

Decision Making, Meta-Learning

On the Stability of Nonlinear Receding Horizon Control: A Geometric Perspective

no code implementations 27 Mar 2021 Tyler Westenbroek, Max Simchowitz, Michael I. Jordan, S. Shankar Sastry

The widespread adoption of nonlinear Receding Horizon Control (RHC) strategies by industry has led to more than 30 years of intense research efforts to provide stability guarantees for these methods.

Towards a Dimension-Free Understanding of Adaptive Linear Control

no code implementations 19 Mar 2021 Juan C. Perdomo, Max Simchowitz, Alekh Agarwal, Peter Bartlett

We study the problem of adaptive control of the linear quadratic regulator for systems in very high, or even infinite dimension.

Exploration and Incentives in Reinforcement Learning

no code implementations 28 Feb 2021 Max Simchowitz, Aleksandrs Slivkins

However, the algorithm controls the flow of information, and can incentivize the agents to explore via information asymmetry.

Task-Optimal Exploration in Linear Dynamical Systems

no code implementations 10 Feb 2021 Andrew Wagenmaker, Max Simchowitz, Kevin Jamieson

Along the way, we establish that certainty equivalence decision making is instance- and task-optimal, and obtain the first algorithm for the linear quadratic regulator problem which is instance-optimal.

Decision Making

Learning the Linear Quadratic Regulator from Nonlinear Observations

no code implementations NeurIPS 2020 Zakaria Mhammedi, Dylan J. Foster, Max Simchowitz, Dipendra Misra, Wen Sun, Akshay Krishnamurthy, Alexander Rakhlin, John Langford

We introduce a new algorithm, RichID, which learns a near-optimal policy for the RichLQR with sample complexity scaling only with the dimension of the latent state space and the capacity of the decoder function class.

Continuous Control

Making Non-Stochastic Control (Almost) as Easy as Stochastic

no code implementations NeurIPS 2020 Max Simchowitz

Recent literature has made much progress in understanding \emph{online LQR}: a modern learning-theoretic take on the classical control problem in which a learner attempts to optimally control an unknown linear dynamical system with fully observed state, perturbed by i.i.d.

Balancing Competing Objectives with Noisy Data: Score-Based Classifiers for Welfare-Aware Machine Learning

1 code implementation ICML 2020 Esther Rolf, Max Simchowitz, Sarah Dean, Lydia T. Liu, Daniel Björkegren, Moritz Hardt, Joshua Blumenstock

Our theoretical results characterize the optimal strategies in this class, bound the Pareto errors due to inaccuracies in the scores, and show an equivalence between optimal strategies and a rich class of fairness-constrained profit-maximizing policies.

Fairness

Logarithmic Regret for Adversarial Online Control

no code implementations 29 Feb 2020 Dylan J. Foster, Max Simchowitz

We introduce a new algorithm for online linear-quadratic control in a known system subject to adversarial disturbances.

Reward-Free Exploration for Reinforcement Learning

no code implementations ICML 2020 Chi Jin, Akshay Krishnamurthy, Max Simchowitz, Tiancheng Yu

We give an efficient algorithm that conducts $\tilde{\mathcal{O}}(S^2A\mathrm{poly}(H)/\epsilon^2)$ episodes of exploration and returns $\epsilon$-suboptimal policies for an arbitrary number of reward functions.

Naive Exploration is Optimal for Online LQR

no code implementations ICML 2020 Max Simchowitz, Dylan J. Foster

Our upper bound is attained by a simple variant of $\textit{certainty equivalent control}$, where the learner selects control inputs according to the optimal controller for their estimate of the system while injecting exploratory random noise.

Improper Learning for Non-Stochastic Control

no code implementations 25 Jan 2020 Max Simchowitz, Karan Singh, Elad Hazan

We consider the problem of controlling a possibly unknown linear dynamical system with adversarial perturbations, adversarially chosen convex loss functions, and partially observed states, known as non-stochastic control.

Corruption-robust exploration in episodic reinforcement learning

no code implementations 20 Nov 2019 Thodoris Lykouris, Max Simchowitz, Aleksandrs Slivkins, Wen Sun

We initiate the study of multi-stage episodic reinforcement learning under adversarial corruptions in both the rewards and the transition probabilities of the underlying system extending recent results for the special case of stochastic bandits.

Multi-Armed Bandits

The gradient complexity of linear regression

no code implementations 6 Nov 2019 Mark Braverman, Elad Hazan, Max Simchowitz, Blake Woodworth

We investigate the computational complexity of several basic linear algebra primitives, including largest eigenvector computation and linear regression, in the computational model that allows access to the data via a matrix-vector product oracle.

Non-Asymptotic Gap-Dependent Regret Bounds for Tabular MDPs

no code implementations NeurIPS 2019 Max Simchowitz, Kevin Jamieson

This paper establishes that optimistic algorithms attain gap-dependent and non-asymptotic logarithmic regret for episodic MDPs.

Learning Linear Dynamical Systems with Semi-Parametric Least Squares

1 code implementation 2 Feb 2019 Max Simchowitz, Ross Boczar, Benjamin Recht

We analyze a simple prefiltered variation of the least squares estimator for the problem of estimation with biased, semi-parametric noise, an error model studied more broadly in causal statistics and active learning.

Active Learning

A Successive-Elimination Approach to Adaptive Robotic Sensing

no code implementations 27 Sep 2018 Esther Rolf, David Fridovich-Keil, Max Simchowitz, Benjamin Recht, Claire Tomlin

We study an adaptive source seeking problem, in which a mobile robot must identify the strongest emitter(s) of a signal in an environment with background emissions.

Motion Capture, Trajectory Planning

The implicit fairness criterion of unconstrained learning

no code implementations 29 Aug 2018 Lydia T. Liu, Max Simchowitz, Moritz Hardt

We show that under reasonable conditions, the deviation from satisfying group calibration is upper bounded by the excess risk of the learned score relative to the Bayes optimal score function.

Fairness

Adaptive Sampling for Convex Regression

no code implementations 14 Aug 2018 Max Simchowitz, Kevin Jamieson, Jordan W. Suchow, Thomas L. Griffiths

In this paper, we introduce the first principled adaptive-sampling procedure for learning a convex function in the $L_\infty$ norm, a problem that arises often in the behavioral and social sciences.

On the Randomized Complexity of Minimizing a Convex Quadratic Function

no code implementations 24 Jul 2018 Max Simchowitz

Minimizing a convex, quadratic objective of the form $f_{\mathbf{A},\mathbf{b}}(x) := \frac{1}{2}x^\top \mathbf{A} x - \langle \mathbf{b}, x \rangle$ for $\mathbf{A} \succ 0 $ is a fundamental problem in machine learning and optimization.
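
As a short worked step (not taken from the listed paper), the minimizer of this objective has a closed form, assuming $\mathbf{A}$ is symmetric as well as positive definite. Setting the gradient to zero gives:

$$
\nabla f_{\mathbf{A},\mathbf{b}}(x) = \mathbf{A} x - \mathbf{b} = 0
\quad\Longrightarrow\quad
x_\star = \mathbf{A}^{-1}\mathbf{b},
\qquad
f_{\mathbf{A},\mathbf{b}}(x_\star) = -\tfrac{1}{2}\,\mathbf{b}^\top \mathbf{A}^{-1} \mathbf{b}.
$$

Here $x_\star$ is notation introduced for this illustration only.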

Tight Query Complexity Lower Bounds for PCA via Finite Sample Deformed Wigner Law

no code implementations 4 Apr 2018 Max Simchowitz, Ahmed El Alaoui, Benjamin Recht

We show that for every $\mathtt{gap} \in (0, 1/2]$, there exists a distribution over matrices $\mathbf{M}$ for which 1) $\mathrm{gap}_r(\mathbf{M}) = \Omega(\mathtt{gap})$ (where $\mathrm{gap}_r(\mathbf{M})$ is the normalized gap between the $r$-th and $(r+1)$-st largest-magnitude eigenvalues of $\mathbf{M}$), and 2) any algorithm $\mathsf{Alg}$ which takes fewer than $\mathrm{const} \times \frac{r \log d}{\sqrt{\mathtt{gap}}}$ queries fails (with overwhelming probability) to identify a matrix $\widehat{\mathsf{V}} \in \mathbb{R}^{d \times r}$ with orthonormal columns for which $\langle \widehat{\mathsf{V}}, \mathbf{M} \widehat{\mathsf{V}}\rangle \ge (1 - \mathrm{const} \times \mathtt{gap})\sum_{i=1}^r \lambda_i(\mathbf{M})$.

Delayed Impact of Fair Machine Learning

2 code implementations ICML 2018 Lydia T. Liu, Sarah Dean, Esther Rolf, Max Simchowitz, Moritz Hardt

Fairness in machine learning has predominantly been studied in static classification settings without concern for how decisions change the underlying population over time.

Fairness

Learning Without Mixing: Towards A Sharp Analysis of Linear System Identification

no code implementations 22 Feb 2018 Max Simchowitz, Horia Mania, Stephen Tu, Michael I. Jordan, Benjamin Recht

We prove that the ordinary least-squares (OLS) estimator attains nearly minimax optimal performance for the identification of linear dynamical systems from a single observed trajectory.

Time Series

Approximate Ranking from Pairwise Comparisons

no code implementations 4 Jan 2018 Reinhard Heckel, Max Simchowitz, Kannan Ramchandran, Martin J. Wainwright

Accordingly, we study the problem of finding approximate rankings from pairwise comparisons.

First-order Methods Almost Always Avoid Saddle Points

no code implementations 20 Oct 2017 Jason D. Lee, Ioannis Panageas, Georgios Piliouras, Max Simchowitz, Michael I. Jordan, Benjamin Recht

We establish that first-order methods avoid saddle points for almost all initializations.

The Simulator: Understanding Adaptive Sampling in the Moderate-Confidence Regime

no code implementations 16 Feb 2017 Max Simchowitz, Kevin Jamieson, Benjamin Recht

Moreover, our lower bounds zero in on the number of times each \emph{individual} arm needs to be pulled, uncovering new phenomena which are drowned out in the aggregate sample complexity.

Best-of-K Bandits

no code implementations 9 Mar 2016 Max Simchowitz, Kevin Jamieson, Benjamin Recht

This paper studies the Best-of-K Bandit game: At each time the player chooses a subset S among all N-choose-K possible options and observes reward max(X(i) : i in S) where X is a random vector drawn from a joint distribution.
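
The game described in this snippet can be sketched in a few lines of Python. This is only an illustration of the reward rule, not the paper's algorithm: the independent Bernoulli(0.3) coordinates, the helper names `draw_x` and `best_of_k_reward`, and the values of `N` and `K` are all assumptions made for the example.

```python
import random
from itertools import combinations


def draw_x(n, rng, p=0.3):
    """Draw a random vector X. Hypothetical choice: independent Bernoulli(p)
    coordinates; the game allows any joint distribution."""
    return [1 if rng.random() < p else 0 for _ in range(n)]


def best_of_k_reward(x, subset):
    """Reward observed by the player: max(X(i) : i in S)."""
    return max(x[i] for i in subset)


rng = random.Random(0)
N, K = 5, 2

# All N-choose-K subsets the player may pick from.
subsets = list(combinations(range(N), K))

# One round: the player picks a subset (uniformly at random here, as a
# placeholder strategy), X is drawn, and only the max over S is observed.
chosen = rng.choice(subsets)
x = draw_x(N, rng)
reward = best_of_k_reward(x, chosen)
```

The player sees only `reward`, never the full vector `x`, which is what makes choosing a good subset a bandit problem.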

Gradient Descent Converges to Minimizers

no code implementations 16 Feb 2016 Jason D. Lee, Max Simchowitz, Michael I. Jordan, Benjamin Recht

We show that gradient descent converges to a local minimizer, almost surely with random initialization.
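
A toy illustration of this claim, under assumptions not drawn from the paper: the test function $f(x, y) = \frac{1}{2}x^2 + \frac{1}{4}y^4 - \frac{1}{2}y^2$ is a hypothetical example with a saddle point at $(0, 0)$ and local minimizers at $(0, \pm 1)$, and the step size and iteration count are arbitrary choices.

```python
import random


def grad(x, y):
    """Gradient of the hypothetical test function
    f(x, y) = 0.5*x**2 + 0.25*y**4 - 0.5*y**2,
    which has a saddle at (0, 0) and minimizers at (0, +1) and (0, -1)."""
    return x, y**3 - y


def gradient_descent(x, y, step=0.1, iters=2000):
    """Plain gradient descent with a fixed step size."""
    for _ in range(iters):
        gx, gy = grad(x, y)
        x, y = x - step * gx, y - step * gy
    return x, y


# Random initialization: the starting point lies on the line y = 0
# (the set of points attracted to the saddle) with probability zero.
rng = random.Random(0)
x0, y0 = rng.uniform(-1, 1), rng.uniform(-1, 1)
x_final, y_final = gradient_descent(x0, y0)
```

Starting exactly on the line $y = 0$ keeps the iterates on that line and sends them to the saddle, but a random draw lands there with probability zero, so the run above ends near one of the two minimizers.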
