no code implementations • 9 Apr 2025 • Elizabeth Dietrich, Rosalyn Devonport, Stephen Tu, Murat Arcak
Reachability analysis is an important method for providing safety guarantees for systems with unknown or uncertain dynamics.
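As a minimal illustration of the general idea (not this paper's method), the sketch below over-approximates the reachable set of a linear system with bounded disturbance using interval arithmetic; the dynamics matrix and disturbance bound are made-up examples.

```python
import numpy as np

def box_reach_step(lo, hi, A, w_bound):
    """Propagate an axis-aligned box through x+ = A x + w, |w|_inf <= w_bound.

    Interval arithmetic: bound each output coordinate by splitting A into
    its positive and negative parts, then inflate by the disturbance bound.
    """
    A_pos, A_neg = np.maximum(A, 0.0), np.minimum(A, 0.0)
    new_lo = A_pos @ lo + A_neg @ hi - w_bound
    new_hi = A_pos @ hi + A_neg @ lo + w_bound
    return new_lo, new_hi

# Example: 5-step over-approximate reachable set of a stable 2D system.
A = np.array([[0.9, 0.2], [0.0, 0.8]])
lo, hi = np.array([-0.1, -0.1]), np.array([0.1, 0.1])
for _ in range(5):
    lo, hi = box_reach_step(lo, hi, A, w_bound=0.01)
print(lo, hi)
```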
no code implementations • 24 Nov 2024 • Hesameddin Mohammadi, Mohammad Tinati, Stephen Tu, Mahdi Soltanolkotabi, Mihailo R. Jovanović
We demonstrate that the Schur complement to a principal eigenspace of the target matrix is governed by an autonomous system that is decoupled from the rest of the dynamics.
no code implementations • 15 Oct 2024 • Nicholas M. Boffi, Arthur Jacot, Stephen Tu, Ingvar Ziemann
Diffusion-based generative models provide a powerful framework for learning to sample from a complex target distribution.
no code implementations • 19 Sep 2024 • Paul Lutkus, Deepika Anantharaman, Stephen Tu, Lars Lindemann
We consider the problem of safely exploring a static and unknown environment while learning valid control barrier functions (CBFs) from sensor data.
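For context, a minimal sketch of how a learned CBF is typically used at run time: a safety-filter QP with a single affine constraint, which admits a closed-form solution. The continuous-time control-affine form and the function names are illustrative assumptions, not this paper's exact formulation.

```python
import numpy as np

def cbf_filter(u_nom, x, f, g, h, grad_h, alpha=1.0):
    """Minimally modify u_nom so the CBF condition
       grad_h(x) @ (f(x) + g(x) u) >= -alpha * h(x)
    holds for control-affine dynamics. With a single affine constraint
    a @ u >= b, the QP  min ||u - u_nom||^2  has a closed form.
    """
    a = grad_h(x) @ g(x)                    # constraint normal
    b = -alpha * h(x) - grad_h(x) @ f(x)    # constraint offset
    if a @ u_nom >= b:                      # nominal control is already safe
        return u_nom
    return u_nom + (b - a @ u_nom) * a / (a @ a)  # project onto half-space
```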
no code implementations • 8 Feb 2024 • Ingvar Ziemann, Stephen Tu, George J. Pappas, Nikolai Matni
In this work, we study statistical learning with dependent ($\beta$-mixing) data and square loss in a hypothesis class $\mathscr{F}\subset L_{\Psi_p}$ where $\Psi_p$ is the norm $\|f\|_{\Psi_p} \triangleq \sup_{m\geq 1} m^{-1/p} \|f\|_{L^m} $ for some $p\in [2,\infty]$.
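For concreteness, a small Monte Carlo sketch of this norm: estimate $\|f\|_{L^m}$ from i.i.d. samples and take the supremum over a truncated range of $m$ (the truncation is an approximation introduced here, not part of the definition).

```python
import numpy as np

def psi_p_norm(samples, p, m_max=50):
    """Empirical proxy for ||f||_{Psi_p} = sup_{m>=1} m^{-1/p} ||f||_{L^m},
    using samples of f(X) and truncating the supremum at m_max.
    ||f||_{L^m} is estimated by (mean |f|^m)^(1/m).
    """
    a = np.abs(np.asarray(samples, dtype=float))
    ms = np.arange(1, m_max + 1)
    lm = np.array([np.mean(a ** m) ** (1.0 / m) for m in ms])
    return np.max(ms ** (-1.0 / p) * lm)

# Example: sub-Gaussian case p = 2 with standard normal samples.
rng = np.random.default_rng(0)
print(psi_p_norm(rng.standard_normal(100_000), p=2))
```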
no code implementations • 11 Sep 2023 • Sumeet Singh, Stephen Tu, Vikas Sindhwani
In this work, we revisit the choice of energy-based models (EBM) as a policy class.
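A minimal sketch of inference with an energy-based policy, assuming the simplest derivative-free scheme (sample candidate actions, return the energy minimizer). This is one standard way to deploy an EBM policy class, not necessarily this paper's procedure.

```python
import numpy as np

def ebm_policy(energy, obs, action_low, action_high, n_samples=1024, rng=None):
    """Pick an action by minimizing a learned energy E(obs, action) over
    uniformly sampled candidates. `energy` is any callable returning a
    scalar score per candidate action; lower energy = more preferred.
    """
    rng = rng or np.random.default_rng()
    cands = rng.uniform(action_low, action_high,
                        size=(n_samples, len(action_low)))
    scores = np.array([energy(obs, a) for a in cands])
    return cands[np.argmin(scores)]
```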
no code implementations • 4 Jul 2023 • Allen Z. Ren, Anushri Dixit, Alexandra Bodrova, Sumeet Singh, Stephen Tu, Noah Brown, Peng Xu, Leila Takayama, Fei Xia, Jake Varley, Zhenjia Xu, Dorsa Sadigh, Andy Zeng, Anirudha Majumdar
Large language models (LLMs) exhibit a wide range of promising capabilities -- from step-by-step planning to commonsense reasoning -- that may provide utility for robots, but remain prone to confidently hallucinated predictions.
no code implementations • 16 Jun 2023 • Charline Le Lan, Stephen Tu, Mark Rowland, Anna Harutyunyan, Rishabh Agarwal, Marc G. Bellemare, Will Dabney
In this paper, we address this gap and provide a theoretical characterization of the state representation learnt by temporal difference learning (Sutton, 1988).
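For reference, the basic TD(0) update (Sutton, 1988) with a fixed linear feature map, the kind of dynamics whose learned representation this analysis characterizes; the `features` callable and transition format are illustrative.

```python
import numpy as np

def td0(features, transitions, gamma=0.99, lr=0.05, epochs=10):
    """TD(0) with linear function approximation:
    w <- w + lr * (r + gamma * v(s') - v(s)) * phi(s),
    where v(s) = w @ phi(s). `transitions` is a list of (s, r, s') tuples.
    """
    d = len(features(transitions[0][0]))
    w = np.zeros(d)
    for _ in range(epochs):
        for s, r, s_next in transitions:
            phi, phi_next = features(s), features(s_next)
            td_error = r + gamma * (w @ phi_next) - w @ phi
            w += lr * td_error * phi
    return w
```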
no code implementations • 20 May 2023 • Amir Ali Ahmadi, Abraar Chaudhry, Vikas Sindhwani, Stephen Tu
For $T = \infty$, we provide SDP-representable inner approximations of the set of safe initial conditions and show that one trajectory generically suffices for safe learning.
no code implementations • 16 May 2023 • Daniel Pfrommer, Max Simchowitz, Tyler Westenbroek, Nikolai Matni, Stephen Tu
A common pipeline in learning-based control is to iteratively estimate a model of system dynamics and apply a trajectory optimization algorithm, e.g., $\mathtt{iLQR}$, to the learned model to minimize a target cost.
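A schematic of that pipeline; `rollout`, `fit_dynamics`, and `trajectory_opt` are hypothetical placeholders standing in for data collection, model fitting, and an optimizer such as iLQR, not any specific library API.

```python
def model_based_loop(env, cost, n_iters=10, horizon=50):
    """Sketch of the pipeline: alternate between fitting a dynamics model
    from rollout data and running a trajectory optimizer (e.g., iLQR)
    on the fitted model. All helpers here are hypothetical.
    """
    data = []
    policy = lambda x, t: 0.0 * x                      # trivial initial policy
    for _ in range(n_iters):
        traj = rollout(env, policy, horizon)           # collect (x, u, x') data
        data.extend(traj)
        model = fit_dynamics(data)                     # e.g., least squares / NN
        policy = trajectory_opt(model, cost, horizon)  # e.g., iLQR on the model
    return policy
```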
no code implementations • 1 Dec 2022 • Thomas T. Zhang, Katie Kang, Bruce D. Lee, Claire Tomlin, Sergey Levine, Stephen Tu, Nikolai Matni
In particular, we consider a setting where learning is split into two phases: (a) a pre-training step where a shared $k$-dimensional representation is learned from $H$ source policies, and (b) a target policy fine-tuning step where the learned representation is used to parameterize the policy class.
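A toy linear instantiation of the two phases, assuming each task is a linear regression through a shared $k \times d$ matrix $\Phi$ fit by alternating least squares; this is a stand-in for the paper's setting, not its algorithm.

```python
import numpy as np

def pretrain_representation(Xs, Ys, k, n_iters=100, rng=None):
    """Phase (a): learn a shared k-dimensional representation Phi (k x d)
    from H source tasks y_h ~ w_h^T Phi x via alternating least squares."""
    rng = rng or np.random.default_rng(0)
    d = Xs[0].shape[1]
    Phi = rng.standard_normal((k, d))
    for _ in range(n_iters):
        # Task-specific heads given the current shared representation.
        Ws = [np.linalg.lstsq(X @ Phi.T, y, rcond=None)[0]
              for X, y in zip(Xs, Ys)]
        # Shared representation given the heads (stacked least squares,
        # using vec(Phi) with row-major indexing so rows are kron(w, x)).
        A = np.vstack([np.kron(w[None, :], X) for X, w in zip(Xs, Ws)])
        b = np.concatenate(Ys)
        Phi = np.linalg.lstsq(A, b, rcond=None)[0].reshape(k, d)
    return Phi

def finetune_target(Phi, X_t, y_t):
    """Phase (b): fit only the target head on the frozen representation."""
    return np.linalg.lstsq(X_t @ Phi.T, y_t, rcond=None)[0]
```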
no code implementations • 5 Oct 2022 • David Brandfonbrener, Stephen Tu, Avi Singh, Stefan Welker, Chad Boodoo, Nikolai Matni, Jake Varley
We find that by adjusting the data collection process we improve the quality of both the learned value functions and policies over a variety of baseline methods for data collection.
no code implementations • 22 Sep 2022 • Xuesu Xiao, Tingnan Zhang, Krzysztof Choromanski, Edward Lee, Anthony Francis, Jake Varley, Stephen Tu, Sumeet Singh, Peng Xu, Fei Xia, Sven Mikael Persson, Dmitry Kalashnikov, Leila Takayama, Roy Frostig, Jie Tan, Carolina Parada, Vikas Sindhwani
Despite decades of research, existing navigation systems still face real-world challenges when deployed in the wild, e.g., in cluttered home environments or in human-occupied public spaces.
1 code implementation • 16 Jun 2022 • Ingvar Ziemann, Stephen Tu
We study square loss in a realizable time-series framework with martingale difference noise.
1 code implementation • 30 May 2022 • Daniel Pfrommer, Thomas T. C. K. Zhang, Stephen Tu, Nikolai Matni
We propose Taylor Series Imitation Learning (TaSIL), a simple augmentation to standard behavior cloning losses in the context of continuous control.
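A loose sketch of the idea: augment the standard behavior-cloning loss with a term matching first-order derivatives of the learned and expert policies. The derivative order, weighting `lam`, and finite-difference estimator are choices made here for illustration, not TaSIL's exact loss.

```python
import numpy as np

def tasil_loss(policy, expert, states, eps=1e-4, lam=1.0):
    """Behavior cloning plus a first-order Taylor-matching penalty on the
    policy Jacobians, estimated by forward differences along each axis."""
    loss = 0.0
    d = states.shape[1]
    for x in states:
        loss += np.sum((policy(x) - expert(x)) ** 2)    # zeroth order (BC)
        for j in range(d):                              # first order
            e = np.zeros(d); e[j] = eps
            dp = (policy(x + e) - policy(x)) / eps
            de = (expert(x + e) - expert(x)) / eps
            loss += lam * np.sum((dp - de) ** 2)
    return loss / len(states)
```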
no code implementations • 31 Mar 2022 • Stephen Tu, Roy Frostig, Mahdi Soltanolkotabi
Specifically, we establish that the worst-case error rate of this problem is $\Theta(n/(mT))$ whenever $m \gtrsim n$.
1 code implementation • 1 Mar 2022 • Charline Le Lan, Stephen Tu, Adam Oberman, Rishabh Agarwal, Marc G. Bellemare
We complement our theoretical results with an empirical survey of classic representation learning methods from the literature and results on the Arcade Learning Environment, and find that the generalization behaviour of learned representations is well-explained by their effective dimension.
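One common spectral proxy for effective dimension is sketched below: the number of singular directions of the feature matrix needed to capture most of its energy. This heuristic may differ from the paper's precise definition.

```python
import numpy as np

def effective_dimension(features, thresh=0.99):
    """Spectral proxy for the effective dimension of a representation:
    the number of singular values of the centered feature matrix needed
    to capture `thresh` of the total energy."""
    F = features - features.mean(axis=0)
    s = np.linalg.svd(F, compute_uv=False)
    energy = np.cumsum(s ** 2) / np.sum(s ** 2)
    return int(np.searchsorted(energy, thresh) + 1)
```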
no code implementations • 20 Dec 2021 • Thomas T. C. K. Zhang, Stephen Tu, Nicholas M. Boffi, Jean-Jacques E. Slotine, Nikolai Matni
Motivated by bridging the simulation to reality gap in the context of safety-critical systems, we consider learning adversarially robust stability certificates for unknown nonlinear dynamical systems.
1 code implementation • 18 Nov 2021 • Lars Lindemann, Alexander Robey, Lejun Jiang, Satyajeet Das, Stephen Tu, Nikolai Matni
Along with the optimization problem, we provide verifiable conditions in terms of the density of the data, smoothness of the system model and state estimator, and the size of the error bounds that guarantee validity of the obtained ROCBF.
no code implementations • 7 Jun 2021 • Nicholas M. Boffi, Stephen Tu, Jean-Jacques E. Slotine
A key assumption in the theory of nonlinear adaptive control is that the uncertainty of the system can be expressed in the linear span of a set of known basis functions.
no code implementations • 18 Feb 2021 • Stephen Tu, Alexander Robey, Tingnan Zhang, Nikolai Matni
We study the following question in the context of imitation learning for continuous control: how are the underlying stability properties of an expert policy reflected in the sample-complexity of an imitation learning task?
1 code implementation • 16 Jan 2021 • Alexander Robey, Lars Lindemann, Stephen Tu, Nikolai Matni
We identify sufficient conditions on the data such that feasibility of the optimization problem ensures correctness of the learned robust hybrid control barrier functions.
no code implementations • 26 Nov 2020 • Nicholas M. Boffi, Stephen Tu, Jean-Jacques E. Slotine
We study the problem of adaptively controlling a known discrete-time nonlinear system subject to unmodeled disturbances.
no code implementations • 24 Nov 2020 • Amir Ali Ahmadi, Abraar Chaudhry, Vikas Sindhwani, Stephen Tu
For our first two results, we consider the setting of safely learning linear dynamics.
no code implementations • 8 Nov 2020 • Lars Lindemann, Haimin Hu, Alexander Robey, Hanwen Zhang, Dimos V. Dimarogonas, Stephen Tu, Nikolai Matni
Motivated by the lack of systematic tools to obtain safe control laws for hybrid systems, we propose an optimization-based framework for learning certifiably safe control laws from data.
no code implementations • 13 Aug 2020 • Nicholas M. Boffi, Stephen Tu, Nikolai Matni, Jean-Jacques E. Slotine, Vikas Sindhwani
Many existing tools in nonlinear control theory for establishing stability or safety of a dynamical system can be distilled to the construction of a certificate function that guarantees a desired property.
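A minimal sampled check of such a certificate for discrete-time dynamics. Note that sampling can only falsify a candidate; an actual guarantee needs a verifier (e.g., SOS programming or an SMT solver), so this is a debugging aid rather than the verification these methods require.

```python
import numpy as np

def check_lyapunov(V, dynamics, sample_states, margin=0.0):
    """Sampled check of a candidate Lyapunov certificate for x+ = dynamics(x):
    require V(x+) - V(x) <= -margin at every sampled state."""
    violations = [x for x in sample_states
                  if V(dynamics(x)) - V(x) > -margin]
    return len(violations) == 0, violations

# Example: quadratic certificate V(x) = ||x||^2 for a contracting linear map.
A = np.array([[0.5, 0.1], [0.0, 0.6]])
rng = np.random.default_rng(0)
ok, _ = check_lyapunov(lambda x: x @ x, lambda x: A @ x,
                       rng.standard_normal((1000, 2)))
print(ok)
```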
no code implementations • 15 Jun 2020 • Nicholas M. Boffi, Stephen Tu, Jean-Jacques E. Slotine
Recent numerical experiments have demonstrated that the choice of optimization geometry used during training can impact generalization performance when learning expressive nonlinear model classes such as deep neural networks.
1 code implementation • 7 Apr 2020 • Alexander Robey, Haimin Hu, Lars Lindemann, Hanwen Zhang, Dimos V. Dimarogonas, Stephen Tu, Nikolai Matni
Furthermore, if the CBF parameterization is convex, then under mild assumptions, so is our learning process.
no code implementations • ICLR 2020 • Xingyou Song, Yiding Jiang, Stephen Tu, Yilun Du, Behnam Neyshabur
A major component of overfitting in model-free reinforcement learning (RL) involves the case where the agent may mistakenly correlate reward with certain spurious features from the observations generated by the Markov Decision Process (MDP).
no code implementations • 27 Jun 2019 • Nikolai Matni, Stephen Tu
We provide a brief tutorial on the use of concentration inequalities as they apply to system identification of state-space parameters of linear time invariant systems, with a focus on the fully observed setting.
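As a concrete companion to that setting, below is the least-squares estimator whose error such concentration inequalities bound: OLS identification of $(A, B)$ from a single fully observed trajectory. The example system is synthetic.

```python
import numpy as np

def estimate_ab(xs, us):
    """Ordinary least squares for fully observed LTI dynamics
    x_{t+1} = A x_t + B u_t + w_t: regress next states on [x_t; u_t]."""
    Z = np.hstack([xs[:-1], us])          # regressors [x_t, u_t]
    Theta, *_ = np.linalg.lstsq(Z, xs[1:], rcond=None)
    n = xs.shape[1]
    return Theta[:n].T, Theta[n:].T       # A_hat (n x n), B_hat (n x m)

# Example: identify a stable system from one excited trajectory.
rng = np.random.default_rng(0)
A = np.array([[0.9, 0.1], [0.0, 0.8]]); B = np.array([[0.0], [1.0]])
xs = np.zeros((500, 2)); us = rng.standard_normal((499, 1))
for t in range(499):
    xs[t + 1] = A @ xs[t] + B @ us[t] + 0.01 * rng.standard_normal(2)
A_hat, B_hat = estimate_ab(xs, us)
print(np.linalg.norm(A_hat - A), np.linalg.norm(B_hat - B))
```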
no code implementations • 27 Jun 2019 • Nikolai Matni, Alexandre Proutiere, Anders Rantzer, Stephen Tu
Machine and reinforcement learning (RL) are increasingly being applied to plan and control the behavior of autonomous systems interacting with the physical world.
no code implementations • NeurIPS 2019 • Karl Krauth, Stephen Tu, Benjamin Recht
We study the sample complexity of approximate policy iteration (PI) for the Linear Quadratic Regulator (LQR), building on a recent line of work using LQR as a testbed to understand the limits of reinforcement learning (RL) algorithms on continuous control tasks.
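For reference, exact policy iteration for a known LQR instance, the idealized scheme that a sample-based approximation like the paper's would emulate; `K0` must stabilize $(A, B)$.

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def lqr_policy_iteration(A, B, Q, R, K0, n_iters=20):
    """Exact PI for discrete-time LQR. Policy evaluation solves a Lyapunov
    equation for the cost-to-go matrix P of the policy u = -K x; policy
    improvement is the usual one-step LQR minimization."""
    K = K0
    for _ in range(n_iters):
        A_cl = A - B @ K
        P = solve_discrete_lyapunov(A_cl.T, Q + K.T @ R @ K)  # evaluation
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)     # improvement
    return K, P
```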
no code implementations • NeurIPS 2019 • Horia Mania, Stephen Tu, Benjamin Recht
We show that for both the fully and partially observed settings, the sub-optimality gap between the cost incurred by playing the certainty equivalent controller on the true system and the cost incurred by using the optimal LQ controller enjoys a fast statistical rate, scaling as the square of the parameter error.
no code implementations • 9 Dec 2018 • Stephen Tu, Benjamin Recht
The effectiveness of model-based versus model-free methods is a long-standing question in reinforcement learning (RL).
no code implementations • 28 Sep 2018 • Stephen Tu, Ross Boczar, Benjamin Recht
The problem of estimating the $\mathcal{H}_\infty$-norm of an LTI system from noisy input/output measurements has attracted recent attention as an alternative to parameter identification for bounding unmodeled dynamics in robust control.
2 code implementations • 26 Sep 2018 • Sarah Dean, Stephen Tu, Nikolai Matni, Benjamin Recht
We study the constrained linear quadratic regulator with unknown dynamics, addressing the tension between safety and exploration in data-driven control techniques.
no code implementations • NeurIPS 2018 • Sarah Dean, Horia Mania, Nikolai Matni, Benjamin Recht, Stephen Tu
We consider adaptive control of the Linear Quadratic Regulator (LQR), where an unknown linear system is controlled subject to quadratic costs.
no code implementations • 13 Apr 2018 • Vikas Sindhwani, Stephen Tu, Mohi Khansari
We propose a new non-parametric framework for learning incrementally stable dynamical systems $\dot{x} = f(x)$ from a set of sampled trajectories.
no code implementations • 22 Feb 2018 • Max Simchowitz, Horia Mania, Stephen Tu, Michael I. Jordan, Benjamin Recht
We prove that the ordinary least-squares (OLS) estimator attains nearly minimax optimal performance for the identification of linear dynamical systems from a single observed trajectory.
no code implementations • ICML 2018 • Stephen Tu, Benjamin Recht
Reinforcement learning (RL) has been successfully used to solve many continuous control tasks.
no code implementations • 4 Oct 2017 • Sarah Dean, Horia Mania, Nikolai Matni, Benjamin Recht, Stephen Tu
This paper addresses the optimal control problem known as the Linear Quadratic Regulator in the case when the dynamics are unknown.
no code implementations • 15 Jul 2017 • Stephen Tu, Ross Boczar, Andrew Packard, Benjamin Recht
We derive bounds on the number of noisy input/output samples from a stable linear time-invariant system that are sufficient to guarantee that the corresponding finite impulse response approximation is close to the true system in the $\mathcal{H}_\infty$-norm.
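A minimal sketch of the two steps involved, under simple assumptions: fit an FIR model by least squares from input/output data, then evaluate its $\mathcal{H}_\infty$-norm as the peak of the frequency response on a dense FFT grid. The example data are synthetic.

```python
import numpy as np

def fit_fir(u, y, order):
    """Least-squares FIR fit from noisy input/output data
    y_t ~ sum_{k=0}^{order-1} g_k u_{t-k}: stack lagged inputs as regressors
    and solve for the impulse response g."""
    T = len(u)
    Phi = np.column_stack([np.concatenate([np.zeros(k), u[:T - k]])
                           for k in range(order)])
    g, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return g

def hinf_norm_fir(g, n_grid=4096):
    """H-infinity norm of an FIR filter: peak magnitude of its frequency
    response, evaluated on a dense FFT grid."""
    return np.max(np.abs(np.fft.fft(g, n_grid)))

# Example: recover a short impulse response from noisy measurements.
rng = np.random.default_rng(0)
g_true = np.array([1.0, 0.5, 0.25, 0.125])
u = rng.standard_normal(2000)
y = np.convolve(u, g_true)[:2000] + 0.01 * rng.standard_normal(2000)
print(hinf_norm_fir(fit_fir(u, y, order=8)))
```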
1 code implementation • NeurIPS 2016 • Xinghao Pan, Maximilian Lam, Stephen Tu, Dimitris Papailiopoulos, Ce Zhang, Michael I. Jordan, Kannan Ramchandran, Chris Re, Benjamin Recht
We present CYCLADES, a general framework for parallelizing stochastic optimization algorithms in a shared memory setting.
no code implementations • 17 Feb 2016 • Stephen Tu, Rebecca Roelofs, Shivaram Venkataraman, Benjamin Recht
We demonstrate that distributed block coordinate descent can quickly solve kernel regression and classification problems with millions of data points.
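A single-machine sketch of block coordinate descent for kernel ridge regression, i.e., solving $(K + \lambda I)\alpha = y$ by cycling over coordinate blocks and solving each block subproblem exactly. The distribution and communication aspects the paper addresses are omitted here.

```python
import numpy as np

def block_cd_krr(K, y, lam, block_size=256, n_epochs=20):
    """Block coordinate descent for (K + lam*I) alpha = y: for each block b,
    alpha_b <- solve(K_bb + lam*I, y_b - K[b, rest] @ alpha_rest),
    holding the other coordinates fixed."""
    n = len(y)
    alpha = np.zeros(n)
    blocks = [np.arange(i, min(i + block_size, n))
              for i in range(0, n, block_size)]
    for _ in range(n_epochs):
        for b in blocks:
            # Residual excluding this block's own contribution.
            r = y[b] - K[b] @ alpha + K[np.ix_(b, b)] @ alpha[b]
            alpha[b] = np.linalg.solve(
                K[np.ix_(b, b)] + lam * np.eye(len(b)), r)
    return alpha
```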