Search Results for author: Stephen Tu

Found 40 papers, 8 papers with code

Sharp Rates in Dependent Learning Theory: Avoiding Sample Size Deflation for the Square Loss

no code implementations • 8 Feb 2024 • Ingvar Ziemann, Stephen Tu, George J. Pappas, Nikolai Matni

We show that whenever the topologies of $L^2$ and $\Psi_p$ are comparable on our hypothesis class $\mathscr{F}$ -- that is, $\mathscr{F}$ is a weakly sub-Gaussian class: $\|f\|_{\Psi_p} \lesssim \|f\|_{L^2}^\eta$ for some $\eta\in (0, 1]$ -- the empirical risk minimizer achieves a rate that only depends on the complexity of the class and second order statistics in its leading term.

Learning Theory
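
For context, the Orlicz norm appearing in this condition is standardly defined as follows (a reference definition, not quoted from the paper):

```latex
\|f\|_{\Psi_p} \;:=\; \inf\left\{ C > 0 \;:\; \mathbb{E}\,\exp\!\big( |f(X)|^p / C^p \big) \le 2 \right\},
```

so the weakly sub-Gaussian condition bounds the tail behavior of every $f \in \mathscr{F}$ by a power of its second moment.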

Robots That Ask For Help: Uncertainty Alignment for Large Language Model Planners

no code implementations • 4 Jul 2023 • Allen Z. Ren, Anushri Dixit, Alexandra Bodrova, Sumeet Singh, Stephen Tu, Noah Brown, Peng Xu, Leila Takayama, Fei Xia, Jake Varley, Zhenjia Xu, Dorsa Sadigh, Andy Zeng, Anirudha Majumdar

Large language models (LLMs) exhibit a wide range of promising capabilities -- from step-by-step planning to commonsense reasoning -- that may provide utility for robots, but remain prone to confidently hallucinated predictions.

Conformal Prediction • Language Modelling +1
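
Given the conformal prediction tag, a minimal split conformal prediction sketch may help convey how a planner can decide when to ask for help (illustrative only; the function names and scoring rule here are assumptions, not the paper's API):

```python
import numpy as np

def conformal_threshold(cal_nonconformity, alpha=0.1):
    # cal_nonconformity[i]: nonconformity (e.g. 1 - confidence) of the *true*
    # option on calibration example i. Finite-sample-corrected quantile.
    n = len(cal_nonconformity)
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    return np.quantile(cal_nonconformity, level, method="higher")

def prediction_set(option_confidences, tau):
    # Keep every candidate action whose nonconformity clears the threshold;
    # the set contains the correct action with probability >= 1 - alpha.
    return [i for i, c in enumerate(option_confidences) if 1.0 - c <= tau]

# If the prediction set contains more than one action, the planner is deemed
# uncertain and the robot asks a human for clarification.
```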

Bootstrapped Representations in Reinforcement Learning

no code implementations • 16 Jun 2023 • Charline Le Lan, Stephen Tu, Mark Rowland, Anna Harutyunyan, Rishabh Agarwal, Marc G. Bellemare, Will Dabney

In this paper, we address this gap and provide a theoretical characterization of the state representation learnt by temporal difference learning (Sutton, 1988).

Auxiliary Learning • reinforcement-learning +1
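
For reference, the temporal difference rule in question is, in its simplest tabular form, the classic TD(0) update (background sketch, not the paper's code):

```python
import numpy as np

def td0(transitions, n_states, gamma=0.99, lr=0.1):
    """Tabular TD(0): transitions is an iterable of (s, r, s_next) tuples."""
    v = np.zeros(n_states)
    for s, r, s_next in transitions:
        v[s] += lr * (r + gamma * v[s_next] - v[s])  # move toward the bootstrap target
    return v
```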

Safely Learning Dynamical Systems

no code implementations • 20 May 2023 • Amir Ali Ahmadi, Abraar Chaudhry, Vikas Sindhwani, Stephen Tu

For $T=2$, we give a semidefinite representation of the set of safe initial conditions and show that $\lceil n/2 \rceil$ trajectories generically suffice for safe learning.

The Power of Learned Locally Linear Models for Nonlinear Policy Optimization

no code implementations • 16 May 2023 • Daniel Pfrommer, Max Simchowitz, Tyler Westenbroek, Nikolai Matni, Stephen Tu

A common pipeline in learning-based control is to iteratively estimate a model of system dynamics, and apply a trajectory optimization algorithm (e.g., $\mathtt{iLQR}$) on the learned model to minimize a target cost.
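
A minimal, runnable instance of this pipeline, with ordinary LQR standing in for $\mathtt{iLQR}$ and a toy linear system standing in for the unknown dynamics (an illustrative sketch, not the paper's method):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, T = 2, 1, 200
A_true = np.array([[1.0, 0.1], [0.0, 1.0]])   # toy "unknown" dynamics
B_true = np.array([[0.0], [0.1]])

# 1) Collect data with random inputs and fit (A, B) by least squares.
X = np.zeros((T + 1, n))
U = rng.normal(size=(T, m))
for t in range(T):
    X[t + 1] = A_true @ X[t] + B_true @ U[t] + 0.01 * rng.normal(size=n)
Z = np.hstack([X[:-1], U])                     # regressors [x_t, u_t]
theta, *_ = np.linalg.lstsq(Z, X[1:], rcond=None)
A_hat, B_hat = theta[:n].T, theta[n:].T

# 2) Trajectory optimization on the learned model: backward Riccati recursion
#    (finite-horizon LQR in place of iLQR).
Q, R, P = np.eye(n), np.eye(m), np.eye(n)
for _ in range(50):
    K = np.linalg.solve(R + B_hat.T @ P @ B_hat, B_hat.T @ P @ A_hat)
    P = Q + A_hat.T @ P @ (A_hat - B_hat @ K)
print("learned feedback gain:", K)
```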

Multi-Task Imitation Learning for Linear Dynamical Systems

no code implementations • 1 Dec 2022 • Thomas T. Zhang, Katie Kang, Bruce D. Lee, Claire Tomlin, Sergey Levine, Stephen Tu, Nikolai Matni

In particular, we consider a setting where learning is split into two phases: (a) a pre-training step where a shared $k$-dimensional representation is learned from $H$ source policies, and (b) a target policy fine-tuning step where the learned representation is used to parameterize the policy class.

Imitation Learning • Representation Learning
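
A toy sketch of the two phases on linear-regression stand-ins for the policies (the data model and estimator here are assumptions of this sketch; the paper's algorithm and guarantees are more refined):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, H, N = 10, 3, 6, 500
Phi_true = np.linalg.qr(rng.normal(size=(n, k)))[0]      # shared representation
tasks = [Phi_true @ rng.normal(size=(k, 1)) for _ in range(H)]

# (a) Pre-training: estimate each source task, then recover the shared
#     k-dimensional subspace from the stacked estimates via SVD.
theta_hats = []
for w in tasks:
    X = rng.normal(size=(N, n))
    y = X @ w + 0.05 * rng.normal(size=(N, 1))
    theta_hats.append(np.linalg.lstsq(X, y, rcond=None)[0])
U, _, _ = np.linalg.svd(np.hstack(theta_hats))
Phi_hat = U[:, :k]                                       # learned representation

# (b) Fine-tuning: fit only a k-dimensional head for the target task.
X_t = rng.normal(size=(N, n))
y_t = X_t @ (Phi_true @ rng.normal(size=(k, 1))) + 0.05 * rng.normal(size=(N, 1))
head = np.linalg.lstsq(X_t @ Phi_hat, y_t, rcond=None)[0]
```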

Visual Backtracking Teleoperation: A Data Collection Protocol for Offline Image-Based Reinforcement Learning

no code implementations • 5 Oct 2022 • David Brandfonbrener, Stephen Tu, Avi Singh, Stefan Welker, Chad Boodoo, Nikolai Matni, Jake Varley

We find that by adjusting the data collection process we improve the quality of both the learned value functions and policies over a variety of baseline methods for data collection.

Continuous Control • Reinforcement Learning (RL)

Learning with little mixing

1 code implementation • 16 Jun 2022 • Ingvar Ziemann, Stephen Tu

We study square loss in a realizable time-series framework with martingale difference noise.

Time Series • Time Series Analysis
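
Concretely, the realizable framework above is conventionally written as (note that $\mathscr{F}$ denotes the hypothesis class and $\mathcal{F}_t$ the filtration of past data):

```latex
Y_t = f_\star(X_t) + W_t, \qquad f_\star \in \mathscr{F}, \qquad
\mathbb{E}\left[ W_t \mid \mathcal{F}_{t-1} \right] = 0,
```

so the noise need not be independent across time, only conditionally mean-zero.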

TaSIL: Taylor Series Imitation Learning

1 code implementation • 30 May 2022 • Daniel Pfrommer, Thomas T. C. K. Zhang, Stephen Tu, Nikolai Matni

We propose Taylor Series Imitation Learning (TaSIL), a simple augmentation to standard behavior cloning losses in the context of continuous control.

Continuous Control • Imitation Learning
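
To convey the idea, here is a first-order, finite-difference version of such an augmented behavior cloning loss (a sketch under assumed interfaces; TaSIL itself is defined via higher-order Taylor remainders of the expert policy):

```python
import numpy as np

def jacobian_fd(policy, x, eps=1e-5):
    """Finite-difference Jacobian of a policy (callable R^d -> R^m) at state x."""
    d = len(x)
    J = np.zeros((len(policy(x)), d))
    for i in range(d):
        e = np.zeros(d); e[i] = eps
        J[:, i] = (policy(x + e) - policy(x - e)) / (2 * eps)
    return J

def taylor_augmented_bc_loss(policy, expert, states, weight=1.0):
    loss = 0.0
    for x in states:
        loss += np.sum((policy(x) - expert(x)) ** 2)              # zeroth order (standard BC)
        loss += weight * np.sum((jacobian_fd(policy, x)
                                 - jacobian_fd(expert, x)) ** 2)  # first-order matching
    return loss / len(states)
```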

Learning from many trajectories

no code implementations • 31 Mar 2022 • Stephen Tu, Roy Frostig, Mahdi Soltanolkotabi

Specifically, we establish that the worst-case error rate of this problem is $\Theta(n / m T)$ whenever $m \gtrsim n$.

Learning Theory

On the Generalization of Representations in Reinforcement Learning

1 code implementation • 1 Mar 2022 • Charline Le Lan, Stephen Tu, Adam Oberman, Rishabh Agarwal, Marc G. Bellemare

We complement our theoretical results with an empirical survey of classic representation learning methods from the literature and results on the Arcade Learning Environment, and find that the generalization behaviour of learned representations is well-explained by their effective dimension.

Atari Games • reinforcement-learning +2
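
One common way to operationalize effective dimension for a feature matrix, shown here for context (the paper's precise definition may differ):

```python
import numpy as np

def effective_dim(Phi, threshold=0.99):
    """Smallest number of singular directions of the feature matrix Phi
    (rows = states, columns = features) capturing `threshold` of the energy."""
    s = np.linalg.svd(Phi, compute_uv=False)
    energy = np.cumsum(s**2) / np.sum(s**2)
    return int(np.searchsorted(energy, threshold) + 1)
```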

Adversarially Robust Stability Certificates can be Sample-Efficient

no code implementations • 20 Dec 2021 • Thomas T. C. K. Zhang, Stephen Tu, Nicholas M. Boffi, Jean-Jacques E. Slotine, Nikolai Matni

Motivated by bridging the simulation to reality gap in the context of safety-critical systems, we consider learning adversarially robust stability certificates for unknown nonlinear dynamical systems.

Learning Robust Output Control Barrier Functions from Safe Expert Demonstrations

1 code implementation • 18 Nov 2021 • Lars Lindemann, Alexander Robey, Lejun Jiang, Satyajeet Das, Stephen Tu, Nikolai Matni

Along with the optimization problem, we provide verifiable conditions in terms of the density of the data, smoothness of the system model and state estimator, and the size of the error bounds that guarantee validity of the obtained ROCBF.

Autonomous Driving

Nonparametric adaptive control and prediction: theory and randomized algorithms

no code implementations • 7 Jun 2021 • Nicholas M. Boffi, Stephen Tu, Jean-Jacques E. Slotine

A key assumption in the theory of nonlinear adaptive control is that the uncertainty of the system can be expressed in the linear span of a set of known basis functions.
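
In symbols, the classical assumption reads:

```latex
f(x) = \sum_{i=1}^{N} \alpha_i\, \phi_i(x),
\qquad \phi_i \ \text{known basis functions}, \quad \alpha_i \ \text{unknown constants};
```

the nonparametric setting studied here is aimed at removing exactly this requirement.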

On the Sample Complexity of Stability Constrained Imitation Learning

no code implementations • 18 Feb 2021 • Stephen Tu, Alexander Robey, Tingnan Zhang, Nikolai Matni

We study the following question in the context of imitation learning for continuous control: how are the underlying stability properties of an expert policy reflected in the sample-complexity of an imitation learning task?

Continuous Control • Generalization Bounds +1

Learning Robust Hybrid Control Barrier Functions for Uncertain Systems

1 code implementation • 16 Jan 2021 • Alexander Robey, Lars Lindemann, Stephen Tu, Nikolai Matni

We identify sufficient conditions on the data such that feasibility of the optimization problem ensures correctness of the learned robust hybrid control barrier functions.

Regret Bounds for Adaptive Nonlinear Control

no code implementations • 26 Nov 2020 • Nicholas M. Boffi, Stephen Tu, Jean-Jacques E. Slotine

We study the problem of adaptively controlling a known discrete-time nonlinear system subject to unmodeled disturbances.

Safely Learning Dynamical Systems from Short Trajectories

no code implementations • 24 Nov 2020 • Amir Ali Ahmadi, Abraar Chaudhry, Vikas Sindhwani, Stephen Tu

For our first two results, we consider the setting of safely learning linear dynamics.

Learning Hybrid Control Barrier Functions from Data

no code implementations • 8 Nov 2020 • Lars Lindemann, Haimin Hu, Alexander Robey, Hanwen Zhang, Dimos V. Dimarogonas, Stephen Tu, Nikolai Matni

Motivated by the lack of systematic tools to obtain safe control laws for hybrid systems, we propose an optimization-based framework for learning certifiably safe control laws from data.

Learning Stability Certificates from Data

no code implementations • 13 Aug 2020 • Nicholas M. Boffi, Stephen Tu, Nikolai Matni, Jean-Jacques E. Slotine, Vikas Sindhwani

Many existing tools in nonlinear control theory for establishing stability or safety of a dynamical system can be distilled to the construction of a certificate function that guarantees a desired property.
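
The canonical instance of such a certificate is a Lyapunov function $V$ for dynamics $\dot{x} = f(x)$: stability of an equilibrium at the origin is guaranteed whenever

```latex
V(0) = 0, \qquad V(x) > 0 \quad \forall x \neq 0, \qquad
\dot{V}(x) = \nabla V(x)^\top f(x) < 0 \quad \forall x \neq 0.
```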

The role of optimization geometry in single neuron learning

no code implementations • 15 Jun 2020 • Nicholas M. Boffi, Stephen Tu, Jean-Jacques E. Slotine

Recent numerical experiments have demonstrated that the choice of optimization geometry used during training can impact generalization performance when learning expressive nonlinear model classes such as deep neural networks.

Learning Control Barrier Functions from Expert Demonstrations

1 code implementation • 7 Apr 2020 • Alexander Robey, Haimin Hu, Lars Lindemann, Hanwen Zhang, Dimos V. Dimarogonas, Stephen Tu, Nikolai Matni

Furthermore, if the CBF parameterization is convex, then under mild assumptions, so is our learning process.
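
For reference, a control barrier function $h$ for control-affine dynamics $\dot{x} = f(x) + g(x)u$ renders the set $\{x : h(x) \ge 0\}$ forward invariant under the standard condition

```latex
\sup_{u}\; \nabla h(x)^\top \big( f(x) + g(x)\, u \big) \;\ge\; -\alpha\big( h(x) \big),
```

for some extended class-$\mathcal{K}$ function $\alpha$.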

Observational Overfitting in Reinforcement Learning

no code implementations • ICLR 2020 • Xingyou Song, Yiding Jiang, Stephen Tu, Yilun Du, Behnam Neyshabur

A major component of overfitting in model-free reinforcement learning (RL) involves the case where the agent may mistakenly correlate reward with certain spurious features from the observations generated by the Markov Decision Process (MDP).

reinforcement-learning • Reinforcement Learning (RL)

A Tutorial on Concentration Bounds for System Identification

no code implementations • 27 Jun 2019 • Nikolai Matni, Stephen Tu

We provide a brief tutorial on the use of concentration inequalities as they apply to system identification of state-space parameters of linear time invariant systems, with a focus on the fully observed setting.

From self-tuning regulators to reinforcement learning and back again

no code implementations • 27 Jun 2019 • Nikolai Matni, Alexandre Proutiere, Anders Rantzer, Stephen Tu

Machine and reinforcement learning (RL) are increasingly being applied to plan and control the behavior of autonomous systems interacting with the physical world.

reinforcement-learning • Reinforcement Learning (RL)

Finite-time Analysis of Approximate Policy Iteration for the Linear Quadratic Regulator

no code implementations • NeurIPS 2019 • Karl Krauth, Stephen Tu, Benjamin Recht

We study the sample complexity of approximate policy iteration (PI) for the Linear Quadratic Regulator (LQR), building on a recent line of work using LQR as a testbed to understand the limits of reinforcement learning (RL) algorithms on continuous control tasks.

Continuous Control • Reinforcement Learning (RL)
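
For context, the exact, model-based version of policy iteration for LQR alternates a Lyapunov-equation policy evaluation with a greedy improvement step; a minimal sketch follows (the paper analyzes a sample-based approximation of this idealized scheme):

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def lqr_policy_iteration(A, B, Q, R, K, n_iters=20):
    """Exact policy iteration for discrete-time LQR. K must initially
    stabilize A - B @ K."""
    for _ in range(n_iters):
        Acl = A - B @ K
        # Policy evaluation: cost of playing u = -K x is x' P x, where
        # P solves the Lyapunov equation P = Q + K'RK + Acl' P Acl.
        P = solve_discrete_lyapunov(Acl.T, Q + K.T @ R @ K)
        # Policy improvement: one-step greedy gain with respect to P.
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return K, P
```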

Certainty Equivalence is Efficient for Linear Quadratic Control

no code implementations • NeurIPS 2019 • Horia Mania, Stephen Tu, Benjamin Recht

We show that for both the fully and partially observed settings, the sub-optimality gap between the cost incurred by playing the certainty equivalent controller on the true system and the cost incurred by using the optimal LQ controller enjoys a fast statistical rate, scaling as the square of the parameter error.
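
Schematically, with $\varepsilon$ denoting the size of the parameter error, the statement is

```latex
J(\widehat{K}) - J(K_\star) \;\lesssim\; \varepsilon^2,
```

for $\varepsilon$ sufficiently small, where $\widehat{K}$ is the certainty equivalent controller and $K_\star$ the optimal LQ controller.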

Minimax Lower Bounds for $\mathcal{H}_\infty$-Norm Estimation

no code implementations • 28 Sep 2018 • Stephen Tu, Ross Boczar, Benjamin Recht

The problem of estimating the $\mathcal{H}_\infty$-norm of an LTI system from noisy input/output measurements has attracted recent attention as an alternative to parameter identification for bounding unmodeled dynamics in robust control.
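
For reference, the quantity being estimated is, for a stable discrete-time LTI system with transfer function $G$,

```latex
\|G\|_{\mathcal{H}_\infty} \;:=\; \sup_{\omega \in [0, 2\pi)} \sigma_{\max}\!\big( G(e^{j\omega}) \big),
```

the peak gain of the system over all frequencies.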

Safely Learning to Control the Constrained Linear Quadratic Regulator

2 code implementations • 26 Sep 2018 • Sarah Dean, Stephen Tu, Nikolai Matni, Benjamin Recht

We study the constrained linear quadratic regulator with unknown dynamics, addressing the tension between safety and exploration in data-driven control techniques.

Regret Bounds for Robust Adaptive Control of the Linear Quadratic Regulator

no code implementations • NeurIPS 2018 • Sarah Dean, Horia Mania, Nikolai Matni, Benjamin Recht, Stephen Tu

We consider adaptive control of the Linear Quadratic Regulator (LQR), where an unknown linear system is controlled subject to quadratic costs.
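
The standard formulation of this adaptive LQR problem, included for context, is

```latex
x_{t+1} = A x_t + B u_t + w_t, \qquad
\text{minimize} \ \ \lim_{T \to \infty} \frac{1}{T}\,
\mathbb{E}\left[ \sum_{t=1}^{T} x_t^\top Q x_t + u_t^\top R u_t \right],
```

where $(A, B)$ are unknown and regret is measured against the optimal controller that knows them.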

Learning Contracting Vector Fields For Stable Imitation Learning

no code implementations • 13 Apr 2018 • Vikas Sindhwani, Stephen Tu, Mohi Khansari

We propose a new non-parametric framework for learning incrementally stable dynamical systems x' = f(x) from a set of sampled trajectories.

Imitation Learning
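
For context, one standard sufficient condition for incremental stability is contraction in the identity metric:

```latex
\frac{\partial f}{\partial x}(x) + \frac{\partial f}{\partial x}(x)^\top \preceq -2\lambda I
\quad \forall x, \ \text{for some } \lambda > 0,
```

under which any two trajectories converge toward each other exponentially at rate $\lambda$.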

Learning Without Mixing: Towards A Sharp Analysis of Linear System Identification

no code implementations • 22 Feb 2018 • Max Simchowitz, Horia Mania, Stephen Tu, Michael I. Jordan, Benjamin Recht

We prove that the ordinary least-squares (OLS) estimator attains nearly minimax optimal performance for the identification of linear dynamical systems from a single observed trajectory.

Time Series • Time Series Analysis
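
A minimal, runnable version of the estimator under study (the toy dynamics and noise level are assumptions of this sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 3, 2000
A = np.array([[0.9, 0.2, 0.0],
              [0.0, 0.9, 0.2],
              [0.0, 0.0, 0.9]])
x = np.zeros((T + 1, n))
for t in range(T):
    x[t + 1] = A @ x[t] + rng.normal(size=n)     # one observed trajectory

# OLS: regress x_{t+1} on x_t over the whole trajectory.
A_hat, *_ = np.linalg.lstsq(x[:-1], x[1:], rcond=None)
print("estimation error:", np.linalg.norm(A_hat.T - A, 2))
```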

On the Sample Complexity of the Linear Quadratic Regulator

no code implementations • 4 Oct 2017 • Sarah Dean, Horia Mania, Nikolai Matni, Benjamin Recht, Stephen Tu

This paper addresses the optimal control problem known as the Linear Quadratic Regulator in the case when the dynamics are unknown.

Non-Asymptotic Analysis of Robust Control from Coarse-Grained Identification

no code implementations • 15 Jul 2017 • Stephen Tu, Ross Boczar, Andrew Packard, Benjamin Recht

We derive bounds on the number of noisy input/output samples from a stable linear time-invariant system that are sufficient to guarantee that the corresponding finite impulse response approximation is close to the true system in the $\mathcal{H}_\infty$-norm.
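
Concretely, a finite impulse response approximation of length $L$ built from estimated impulse response coefficients $\widehat{g}_k$ takes the form

```latex
\widehat{G}(z) = \sum_{k=0}^{L} \widehat{g}_k\, z^{-k},
```

and the guarantee bounds the approximation error $\|\widehat{G} - G\|_{\mathcal{H}_\infty}$ (a schematic rendering; the paper's exact truncation and indexing may differ).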

Large Scale Kernel Learning using Block Coordinate Descent

no code implementations • 17 Feb 2016 • Stephen Tu, Rebecca Roelofs, Shivaram Venkataraman, Benjamin Recht

We demonstrate that distributed block coordinate descent can quickly solve kernel regression and classification problems with millions of data points.

Classification • General Classification +1
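
A toy single-machine block coordinate descent for the kernel ridge system $(K + \lambda I)\alpha = y$, in the style described above (an illustrative sketch, not the paper's distributed solver):

```python
import numpy as np

def kernel_bcd(K, y, lam=1.0, block_size=256, n_epochs=20):
    """Solve (K + lam*I) alpha = y one block of coordinates at a time."""
    n = len(y)
    alpha = np.zeros(n)
    G = K + lam * np.eye(n)
    for _ in range(n_epochs):
        for start in range(0, n, block_size):
            b = slice(start, min(start + block_size, n))
            # Exactly minimize the quadratic over block b, others held fixed.
            residual = y[b] - G[b, :] @ alpha + G[b, b] @ alpha[b]
            alpha[b] = np.linalg.solve(G[b, b], residual)
    return alpha
```

Each block update is a small linear solve, so the per-step cost stays bounded even when the full kernel system is too large to factor at once.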
