Search Results for author: Stephen Tu

Found 40 papers, 8 papers with code

Sharp Rates in Dependent Learning Theory: Avoiding Sample Size Deflation for the Square Loss

no code implementations • 8 Feb 2024 • Ingvar Ziemann, Stephen Tu, George J. Pappas, Nikolai Matni

We show that whenever the topologies of $L^2$ and $\Psi_p$ are comparable on our hypothesis class $\mathscr{F}$ -- that is, $\mathscr{F}$ is a weakly sub-Gaussian class: $\|f\|_{\Psi_p} \lesssim \|f\|_{L^2}^\eta$ for some $\eta\in (0, 1]$ -- the empirical risk minimizer achieves a rate that only depends on the complexity of the class and second order statistics in its leading term.

Learning Theory
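
For context, the Orlicz norm appearing in this condition is standardly defined as follows (a reference definition, not quoted from the paper):

```latex
\|f\|_{\Psi_p} \;:=\; \inf\left\{ C > 0 \;:\; \mathbb{E}\,\exp\!\big( |f(X)|^p / C^p \big) \le 2 \right\},
```

so the weakly sub-Gaussian condition bounds the tail behavior of every $f \in \mathscr{F}$ by a power of its second moment.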

Robots That Ask For Help: Uncertainty Alignment for Large Language Model Planners

no code implementations • 4 Jul 2023 • Allen Z. Ren, Anushri Dixit, Alexandra Bodrova, Sumeet Singh, Stephen Tu, Noah Brown, Peng Xu, Leila Takayama, Fei Xia, Jake Varley, Zhenjia Xu, Dorsa Sadigh, Andy Zeng, Anirudha Majumdar

Large language models (LLMs) exhibit a wide range of promising capabilities -- from step-by-step planning to commonsense reasoning -- that may provide utility for robots, but remain prone to confidently hallucinated predictions.

Conformal Prediction • Language Modelling +1
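
Given the conformal prediction tag, a minimal split conformal prediction sketch may help convey how a planner can decide when to ask for help (illustrative only; the function names and scoring rule here are assumptions, not the paper's API):

```python
import numpy as np

def conformal_threshold(cal_nonconformity, alpha=0.1):
    # cal_nonconformity[i]: nonconformity (e.g. 1 - confidence) of the *true*
    # option on calibration example i. Finite-sample-corrected quantile.
    n = len(cal_nonconformity)
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    return np.quantile(cal_nonconformity, level, method="higher")

def prediction_set(option_confidences, tau):
    # Keep every candidate action whose nonconformity clears the threshold;
    # the set contains the correct action with probability >= 1 - alpha.
    return [i for i, c in enumerate(option_confidences) if 1.0 - c <= tau]

# If the prediction set contains more than one action, the planner is deemed
# uncertain and the robot asks a human for clarification.
```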

Bootstrapped Representations in Reinforcement Learning

no code implementations • 16 Jun 2023 • Charline Le Lan, Stephen Tu, Mark Rowland, Anna Harutyunyan, Rishabh Agarwal, Marc G. Bellemare, Will Dabney

In this paper, we address this gap and provide a theoretical characterization of the state representation learnt by temporal difference learning (Sutton, 1988).

Auxiliary Learning • reinforcement-learning +1
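
For reference, the temporal difference rule in question is, in its simplest tabular form, the classic TD(0) update (background sketch, not the paper's code):

```python
import numpy as np

def td0(transitions, n_states, gamma=0.99, lr=0.1):
    """Tabular TD(0): transitions is an iterable of (s, r, s_next) tuples."""
    v = np.zeros(n_states)
    for s, r, s_next in transitions:
        v[s] += lr * (r + gamma * v[s_next] - v[s])  # move toward the bootstrap target
    return v
```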

Safely Learning Dynamical Systems

no code implementations • 20 May 2023 • Amir Ali Ahmadi, Abraar Chaudhry, Vikas Sindhwani, Stephen Tu

For $T=2$, we give a semidefinite representation of the set of safe initial conditions and show that $\lceil n/2 \rceil$ trajectories generically suffice for safe learning.

The Power of Learned Locally Linear Models for Nonlinear Policy Optimization

no code implementations • 16 May 2023 • Daniel Pfrommer, Max Simchowitz, Tyler Westenbroek, Nikolai Matni, Stephen Tu

A common pipeline in learning-based control is to iteratively estimate a model of system dynamics, and apply a trajectory optimization algorithm (e.g., $\mathtt{iLQR}$) on the learned model to minimize a target cost.
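
A minimal, runnable instance of this pipeline, with ordinary LQR standing in for $\mathtt{iLQR}$ and a toy linear system standing in for the unknown dynamics (an illustrative sketch, not the paper's method):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, T = 2, 1, 200
A_true = np.array([[1.0, 0.1], [0.0, 1.0]])   # toy "unknown" dynamics
B_true = np.array([[0.0], [0.1]])

# 1) Collect data with random inputs and fit (A, B) by least squares.
X = np.zeros((T + 1, n))
U = rng.normal(size=(T, m))
for t in range(T):
    X[t + 1] = A_true @ X[t] + B_true @ U[t] + 0.01 * rng.normal(size=n)
Z = np.hstack([X[:-1], U])                     # regressors [x_t, u_t]
theta, *_ = np.linalg.lstsq(Z, X[1:], rcond=None)
A_hat, B_hat = theta[:n].T, theta[n:].T

# 2) Trajectory optimization on the learned model: backward Riccati recursion
#    (finite-horizon LQR in place of iLQR).
Q, R, P = np.eye(n), np.eye(m), np.eye(n)
for _ in range(50):
    K = np.linalg.solve(R + B_hat.T @ P @ B_hat, B_hat.T @ P @ A_hat)
    P = Q + A_hat.T @ P @ (A_hat - B_hat @ K)
print("learned feedback gain:", K)
```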

Multi-Task Imitation Learning for Linear Dynamical Systems

no code implementations • 1 Dec 2022 • Thomas T. Zhang, Katie Kang, Bruce D. Lee, Claire Tomlin, Sergey Levine, Stephen Tu, Nikolai Matni

In particular, we consider a setting where learning is split into two phases: (a) a pre-training step where a shared $k$-dimensional representation is learned from $H$ source policies, and (b) a target policy fine-tuning step where the learned representation is used to parameterize the policy class.

Imitation Learning • Representation Learning
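
A toy sketch of the two phases on linear-regression stand-ins for the policies (the data model and estimator here are assumptions of this sketch; the paper's algorithm and guarantees are more refined):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, H, N = 10, 3, 6, 500
Phi_true = np.linalg.qr(rng.normal(size=(n, k)))[0]      # shared representation
tasks = [Phi_true @ rng.normal(size=(k, 1)) for _ in range(H)]

# (a) Pre-training: estimate each source task, then recover the shared
#     k-dimensional subspace from the stacked estimates via SVD.
theta_hats = []
for w in tasks:
    X = rng.normal(size=(N, n))
    y = X @ w + 0.05 * rng.normal(size=(N, 1))
    theta_hats.append(np.linalg.lstsq(X, y, rcond=None)[0])
U, _, _ = np.linalg.svd(np.hstack(theta_hats))
Phi_hat = U[:, :k]                                       # learned representation

# (b) Fine-tuning: fit only a k-dimensional head for the target task.
X_t = rng.normal(size=(N, n))
y_t = X_t @ (Phi_true @ rng.normal(size=(k, 1))) + 0.05 * rng.normal(size=(N, 1))
head = np.linalg.lstsq(X_t @ Phi_hat, y_t, rcond=None)[0]
```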

Visual Backtracking Teleoperation: A Data Collection Protocol for Offline Image-Based Reinforcement Learning

no code implementations • 5 Oct 2022 • David Brandfonbrener, Stephen Tu, Avi Singh, Stefan Welker, Chad Boodoo, Nikolai Matni, Jake Varley

We find that by adjusting the data collection process we improve the quality of both the learned value functions and policies over a variety of baseline methods for data collection.

Continuous Control • Reinforcement Learning (RL)

Learning with little mixing

1 code implementation • 16 Jun 2022 • Ingvar Ziemann, Stephen Tu

We study square loss in a realizable time-series framework with martingale difference noise.

Time Series • Time Series Analysis
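
Concretely, the realizable framework above is conventionally written as (note that $\mathscr{F}$ denotes the hypothesis class and $\mathcal{F}_t$ the filtration of past data):

```latex
Y_t = f_\star(X_t) + W_t, \qquad f_\star \in \mathscr{F}, \qquad
\mathbb{E}\left[ W_t \mid \mathcal{F}_{t-1} \right] = 0,
```

so the noise need not be independent across time, only conditionally mean-zero.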

TaSIL: Taylor Series Imitation Learning

1 code implementation • 30 May 2022 • Daniel Pfrommer, Thomas T. C. K. Zhang, Stephen Tu, Nikolai Matni

We propose Taylor Series Imitation Learning (TaSIL), a simple augmentation to standard behavior cloning losses in the context of continuous control.

Continuous Control • Imitation Learning
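
To convey the idea, here is a first-order, finite-difference version of such an augmented behavior cloning loss (a sketch under assumed interfaces; TaSIL itself is defined via higher-order Taylor remainders of the expert policy):

```python
import numpy as np

def jacobian_fd(policy, x, eps=1e-5):
    """Finite-difference Jacobian of a policy (callable R^d -> R^m) at state x."""
    d = len(x)
    J = np.zeros((len(policy(x)), d))
    for i in range(d):
        e = np.zeros(d); e[i] = eps
        J[:, i] = (policy(x + e) - policy(x - e)) / (2 * eps)
    return J

def taylor_augmented_bc_loss(policy, expert, states, weight=1.0):
    loss = 0.0
    for x in states:
        loss += np.sum((policy(x) - expert(x)) ** 2)              # zeroth order (standard BC)
        loss += weight * np.sum((jacobian_fd(policy, x)
                                 - jacobian_fd(expert, x)) ** 2)  # first-order matching
    return loss / len(states)
```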

Learning from many trajectories

no code implementations • 31 Mar 2022 • Stephen Tu, Roy Frostig, Mahdi Soltanolkotabi

Specifically, we establish that the worst-case error rate of this problem is $\Theta(n / m T)$ whenever $m \gtrsim n$.

Learning Theory

On the Generalization of Representations in Reinforcement Learning

1 code implementation • 1 Mar 2022 • Charline Le Lan, Stephen Tu, Adam Oberman, Rishabh Agarwal, Marc G. Bellemare

We complement our theoretical results with an empirical survey of classic representation learning methods from the literature and results on the Arcade Learning Environment, and find that the generalization behaviour of learned representations is well-explained by their effective dimension.

Atari Games • reinforcement-learning +2
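
One common way to operationalize effective dimension for a feature matrix, shown here for context (the paper's precise definition may differ):

```python
import numpy as np

def effective_dim(Phi, threshold=0.99):
    """Smallest number of singular directions of the feature matrix Phi
    (rows = states, columns = features) capturing `threshold` of the energy."""
    s = np.linalg.svd(Phi, compute_uv=False)
    energy = np.cumsum(s**2) / np.sum(s**2)
    return int(np.searchsorted(energy, threshold) + 1)
```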

Adversarially Robust Stability Certificates can be Sample-Efficient

no code implementations • 20 Dec 2021 • Thomas T. C. K. Zhang, Stephen Tu, Nicholas M. Boffi, Jean-Jacques E. Slotine, Nikolai Matni

Motivated by bridging the simulation to reality gap in the context of safety-critical systems, we consider learning adversarially robust stability certificates for unknown nonlinear dynamical systems.

Learning Robust Output Control Barrier Functions from Safe Expert Demonstrations

1 code implementation • 18 Nov 2021 • Lars Lindemann, Alexander Robey, Lejun Jiang, Satyajeet Das, Stephen Tu, Nikolai Matni

Along with the optimization problem, we provide verifiable conditions in terms of the density of the data, smoothness of the system model and state estimator, and the size of the error bounds that guarantee validity of the obtained ROCBF.

Autonomous Driving

Nonparametric adaptive control and prediction: theory and randomized algorithms

no code implementations • 7 Jun 2021 • Nicholas M. Boffi, Stephen Tu, Jean-Jacques E. Slotine

A key assumption in the theory of nonlinear adaptive control is that the uncertainty of the system can be expressed in the linear span of a set of known basis functions.
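
In symbols, the classical assumption reads:

```latex
f(x) = \sum_{i=1}^{N} \alpha_i\, \phi_i(x),
\qquad \phi_i \ \text{known basis functions}, \quad \alpha_i \ \text{unknown constants};
```

the nonparametric setting studied here is aimed at removing exactly this requirement.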

On the Sample Complexity of Stability Constrained Imitation Learning

no code implementations • 18 Feb 2021 • Stephen Tu, Alexander Robey, Tingnan Zhang, Nikolai Matni

We study the following question in the context of imitation learning for continuous control: how are the underlying stability properties of an expert policy reflected in the sample-complexity of an imitation learning task?

Continuous Control • Generalization Bounds +1

Learning Robust Hybrid Control Barrier Functions for Uncertain Systems

1 code implementation • 16 Jan 2021 • Alexander Robey, Lars Lindemann, Stephen Tu, Nikolai Matni

We identify sufficient conditions on the data such that feasibility of the optimization problem ensures correctness of the learned robust hybrid control barrier functions.

Regret Bounds for Adaptive Nonlinear Control

no code implementations • 26 Nov 2020 • Nicholas M. Boffi, Stephen Tu, Jean-Jacques E. Slotine

We study the problem of adaptively controlling a known discrete-time nonlinear system subject to unmodeled disturbances.

Safely Learning Dynamical Systems from Short Trajectories

no code implementations • 24 Nov 2020 • Amir Ali Ahmadi, Abraar Chaudhry, Vikas Sindhwani, Stephen Tu

For our first two results, we consider the setting of safely learning linear dynamics.

Learning Hybrid Control Barrier Functions from Data

no code implementations • 8 Nov 2020 • Lars Lindemann, Haimin Hu, Alexander Robey, Hanwen Zhang, Dimos V. Dimarogonas, Stephen Tu, Nikolai Matni

Motivated by the lack of systematic tools to obtain safe control laws for hybrid systems, we propose an optimization-based framework for learning certifiably safe control laws from data.

Learning Stability Certificates from Data

no code implementations • 13 Aug 2020 • Nicholas M. Boffi, Stephen Tu, Nikolai Matni, Jean-Jacques E. Slotine, Vikas Sindhwani

Many existing tools in nonlinear control theory for establishing stability or safety of a dynamical system can be distilled to the construction of a certificate function that guarantees a desired property.
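
The canonical instance of such a certificate is a Lyapunov function $V$ for dynamics $\dot{x} = f(x)$: stability of an equilibrium at the origin is guaranteed whenever

```latex
V(0) = 0, \qquad V(x) > 0 \quad \forall x \neq 0, \qquad
\dot{V}(x) = \nabla V(x)^\top f(x) < 0 \quad \forall x \neq 0.
```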

The role of optimization geometry in single neuron learning

no code implementations • 15 Jun 2020 • Nicholas M. Boffi, Stephen Tu, Jean-Jacques E. Slotine

Recent numerical experiments have demonstrated that the choice of optimization geometry used during training can impact generalization performance when learning expressive nonlinear model classes such as deep neural networks.

Learning Control Barrier Functions from Expert Demonstrations

1 code implementation • 7 Apr 2020 • Alexander Robey, Haimin Hu, Lars Lindemann, Hanwen Zhang, Dimos V. Dimarogonas, Stephen Tu, Nikolai Matni

Furthermore, if the CBF parameterization is convex, then under mild assumptions, so is our learning process.
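
For reference, a control barrier function $h$ for control-affine dynamics $\dot{x} = f(x) + g(x)u$ renders the set $\{x : h(x) \ge 0\}$ forward invariant under the standard condition

```latex
\sup_{u}\; \nabla h(x)^\top \big( f(x) + g(x)\, u \big) \;\ge\; -\alpha\big( h(x) \big),
```

for some extended class-$\mathcal{K}$ function $\alpha$.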

Observational Overfitting in Reinforcement Learning

no code implementations • ICLR 2020 • Xingyou Song, Yiding Jiang, Stephen Tu, Yilun Du, Behnam Neyshabur

A major component of overfitting in model-free reinforcement learning (RL) involves the case where the agent may mistakenly correlate reward with certain spurious features from the observations generated by the Markov Decision Process (MDP).

reinforcement-learning • Reinforcement Learning (RL)

A Tutorial on Concentration Bounds for System Identification

no code implementations • 27 Jun 2019 • Nikolai Matni, Stephen Tu

We provide a brief tutorial on the use of concentration inequalities as they apply to system identification of state-space parameters of linear time invariant systems, with a focus on the fully observed setting.

From self-tuning regulators to reinforcement learning and back again

no code implementations • 27 Jun 2019 • Nikolai Matni, Alexandre Proutiere, Anders Rantzer, Stephen Tu

Machine and reinforcement learning (RL) are increasingly being applied to plan and control the behavior of autonomous systems interacting with the physical world.

reinforcement-learning • Reinforcement Learning (RL)

Finite-time Analysis of Approximate Policy Iteration for the Linear Quadratic Regulator

no code implementations • NeurIPS 2019 • Karl Krauth, Stephen Tu, Benjamin Recht

We study the sample complexity of approximate policy iteration (PI) for the Linear Quadratic Regulator (LQR), building on a recent line of work using LQR as a testbed to understand the limits of reinforcement learning (RL) algorithms on continuous control tasks.

Continuous Control • Reinforcement Learning (RL)
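
For context, the exact, model-based version of policy iteration for LQR alternates a Lyapunov-equation policy evaluation with a greedy improvement step; a minimal sketch follows (the paper analyzes a sample-based approximation of this idealized scheme):

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def lqr_policy_iteration(A, B, Q, R, K, n_iters=20):
    """Exact policy iteration for discrete-time LQR. K must initially
    stabilize A - B @ K."""
    for _ in range(n_iters):
        Acl = A - B @ K
        # Policy evaluation: cost of playing u = -K x is x' P x, where
        # P solves the Lyapunov equation P = Q + K'RK + Acl' P Acl.
        P = solve_discrete_lyapunov(Acl.T, Q + K.T @ R @ K)
        # Policy improvement: one-step greedy gain with respect to P.
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return K, P
```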

Certainty Equivalence is Efficient for Linear Quadratic Control

no code implementations • NeurIPS 2019 • Horia Mania, Stephen Tu, Benjamin Recht

We show that for both the fully and partially observed settings, the sub-optimality gap between the cost incurred by playing the certainty equivalent controller on the true system and the cost incurred by using the optimal LQ controller enjoys a fast statistical rate, scaling as the square of the parameter error.
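
Schematically, with $\varepsilon$ denoting the size of the parameter error, the statement is

```latex
J(\widehat{K}) - J(K_\star) \;\lesssim\; \varepsilon^2,
```

for $\varepsilon$ sufficiently small, where $\widehat{K}$ is the certainty equivalent controller and $K_\star$ the optimal LQ controller.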

Minimax Lower Bounds for $\mathcal{H}_\infty$-Norm Estimation

no code implementations • 28 Sep 2018 • Stephen Tu, Ross Boczar, Benjamin Recht

The problem of estimating the $\mathcal{H}_\infty$-norm of an LTI system from noisy input/output measurements has attracted recent attention as an alternative to parameter identification for bounding unmodeled dynamics in robust control.
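
For reference, the quantity being estimated is, for a stable discrete-time LTI system with transfer function $G$,

```latex
\|G\|_{\mathcal{H}_\infty} \;:=\; \sup_{\omega \in [0, 2\pi)} \sigma_{\max}\!\big( G(e^{j\omega}) \big),
```

the peak gain of the system over all frequencies.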

Safely Learning to Control the Constrained Linear Quadratic Regulator

2 code implementations • 26 Sep 2018 • Sarah Dean, Stephen Tu, Nikolai Matni, Benjamin Recht

We study the constrained linear quadratic regulator with unknown dynamics, addressing the tension between safety and exploration in data-driven control techniques.

Regret Bounds for Robust Adaptive Control of the Linear Quadratic Regulator

no code implementations • NeurIPS 2018 • Sarah Dean, Horia Mania, Nikolai Matni, Benjamin Recht, Stephen Tu

We consider adaptive control of the Linear Quadratic Regulator (LQR), where an unknown linear system is controlled subject to quadratic costs.
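
The standard formulation of this adaptive LQR problem, included for context, is

```latex
x_{t+1} = A x_t + B u_t + w_t, \qquad
\text{minimize} \ \ \lim_{T \to \infty} \frac{1}{T}\,
\mathbb{E}\left[ \sum_{t=1}^{T} x_t^\top Q x_t + u_t^\top R u_t \right],
```

where $(A, B)$ are unknown and regret is measured against the optimal controller that knows them.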

Learning Contracting Vector Fields For Stable Imitation Learning

no code implementations • 13 Apr 2018 • Vikas Sindhwani, Stephen Tu, Mohi Khansari

We propose a new non-parametric framework for learning incrementally stable dynamical systems x' = f(x) from a set of sampled trajectories.

Imitation Learning
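
For context, one standard sufficient condition for incremental stability is contraction in the identity metric:

```latex
\frac{\partial f}{\partial x}(x) + \frac{\partial f}{\partial x}(x)^\top \preceq -2\lambda I
\quad \forall x, \ \text{for some } \lambda > 0,
```

under which any two trajectories converge toward each other exponentially at rate $\lambda$.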

Learning Without Mixing: Towards A Sharp Analysis of Linear System Identification

no code implementations • 22 Feb 2018 • Max Simchowitz, Horia Mania, Stephen Tu, Michael I. Jordan, Benjamin Recht

We prove that the ordinary least-squares (OLS) estimator attains nearly minimax optimal performance for the identification of linear dynamical systems from a single observed trajectory.

Time Series • Time Series Analysis
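
A minimal, runnable version of the estimator under study (the toy dynamics and noise level are assumptions of this sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 3, 2000
A = np.array([[0.9, 0.2, 0.0],
              [0.0, 0.9, 0.2],
              [0.0, 0.0, 0.9]])
x = np.zeros((T + 1, n))
for t in range(T):
    x[t + 1] = A @ x[t] + rng.normal(size=n)     # one observed trajectory

# OLS: regress x_{t+1} on x_t over the whole trajectory.
A_hat, *_ = np.linalg.lstsq(x[:-1], x[1:], rcond=None)
print("estimation error:", np.linalg.norm(A_hat.T - A, 2))
```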

On the Sample Complexity of the Linear Quadratic Regulator

no code implementations • 4 Oct 2017 • Sarah Dean, Horia Mania, Nikolai Matni, Benjamin Recht, Stephen Tu

This paper addresses the optimal control problem known as the Linear Quadratic Regulator in the case when the dynamics are unknown.

Non-Asymptotic Analysis of Robust Control from Coarse-Grained Identification

no code implementations • 15 Jul 2017 • Stephen Tu, Ross Boczar, Andrew Packard, Benjamin Recht

We derive bounds on the number of noisy input/output samples from a stable linear time-invariant system that are sufficient to guarantee that the corresponding finite impulse response approximation is close to the true system in the $\mathcal{H}_\infty$-norm.
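
Concretely, a finite impulse response approximation of length $L$ built from estimated impulse response coefficients $\widehat{g}_k$ takes the form

```latex
\widehat{G}(z) = \sum_{k=0}^{L} \widehat{g}_k\, z^{-k},
```

and the guarantee bounds the approximation error $\|\widehat{G} - G\|_{\mathcal{H}_\infty}$ (a schematic rendering; the paper's exact truncation and indexing may differ).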

Large Scale Kernel Learning using Block Coordinate Descent

no code implementations • 17 Feb 2016 • Stephen Tu, Rebecca Roelofs, Shivaram Venkataraman, Benjamin Recht

We demonstrate that distributed block coordinate descent can quickly solve kernel regression and classification problems with millions of data points.

Classification • General Classification +1
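
A toy single-machine block coordinate descent for the kernel ridge system $(K + \lambda I)\alpha = y$, in the style described above (an illustrative sketch, not the paper's distributed solver):

```python
import numpy as np

def kernel_bcd(K, y, lam=1.0, block_size=256, n_epochs=20):
    """Solve (K + lam*I) alpha = y one block of coordinates at a time."""
    n = len(y)
    alpha = np.zeros(n)
    G = K + lam * np.eye(n)
    for _ in range(n_epochs):
        for start in range(0, n, block_size):
            b = slice(start, min(start + block_size, n))
            # Exactly minimize the quadratic over block b, others held fixed.
            residual = y[b] - G[b, :] @ alpha + G[b, b] @ alpha[b]
            alpha[b] = np.linalg.solve(G[b, b], residual)
    return alpha
```

Each block update is a small linear solve, so the per-step cost stays bounded even when the full kernel system is too large to factor at once.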
