
no code implementations • 11 Sep 2023 • Sumeet Singh, Stephen Tu, Vikas Sindhwani

In this work, we revisit the choice of energy-based models (EBM) as a policy class.

no code implementations • 4 Jul 2023 • Allen Z. Ren, Anushri Dixit, Alexandra Bodrova, Sumeet Singh, Stephen Tu, Noah Brown, Peng Xu, Leila Takayama, Fei Xia, Jake Varley, Zhenjia Xu, Dorsa Sadigh, Andy Zeng, Anirudha Majumdar

Large language models (LLMs) exhibit a wide range of promising capabilities -- from step-by-step planning to commonsense reasoning -- that may provide utility for robots, but remain prone to confidently hallucinated predictions.

no code implementations • 16 Jun 2023 • Charline Le Lan, Stephen Tu, Mark Rowland, Anna Harutyunyan, Rishabh Agarwal, Marc G. Bellemare, Will Dabney

In this paper, we address this gap and provide a theoretical characterization of the state representation learnt by temporal difference learning (Sutton, 1988).

no code implementations • 20 May 2023 • Amir Ali Ahmadi, Abraar Chaudhry, Vikas Sindhwani, Stephen Tu

For $T=2$, we give a semidefinite representation of the set of safe initial conditions and show that $\lceil n/2 \rceil$ trajectories generically suffice for safe learning.

no code implementations • 18 May 2023 • Ingvar Ziemann, Stephen Tu, George J. Pappas, Nikolai Matni

We derive upper bounds for random design linear regression with dependent ($\beta$-mixing) data absent any realizability assumptions.

no code implementations • 16 May 2023 • Daniel Pfrommer, Max Simchowitz, Tyler Westenbroek, Nikolai Matni, Stephen Tu

A common pipeline in learning-based control is to iteratively estimate a model of system dynamics, and apply a trajectory optimization algorithm, e.g. $\mathtt{iLQR}$, on the learned model to minimize a target cost.

no code implementations • 1 Dec 2022 • Thomas T. Zhang, Katie Kang, Bruce D. Lee, Claire Tomlin, Sergey Levine, Stephen Tu, Nikolai Matni

In particular, we consider a setting where learning is split into two phases: (a) a pre-training step where a shared $k$-dimensional representation is learned from $H$ source policies, and (b) a target policy fine-tuning step where the learned representation is used to parameterize the policy class.
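A linear-regression caricature of this two-phase scheme may help fix ideas. Everything below is invented for illustration (the dimensions, the noise level, and the SVD-based subspace estimate are one simple choice among many); the paper's setting involves policies rather than scalar regression.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, H, m = 10, 3, 5, 200  # ambient dim, representation dim, source tasks, samples/task

# Ground truth: every task's predictor factors through a shared k-dim map x -> Phi x.
Phi = rng.standard_normal((k, n))
heads = [rng.standard_normal(k) for _ in range(H)]

# Phase (a): solve each source task separately, then estimate the shared
# subspace from the top right singular vectors of the stacked solutions.
W = []
for w in heads:
    X = rng.standard_normal((m, n))
    y_src = X @ (Phi.T @ w) + 0.01 * rng.standard_normal(m)
    W.append(np.linalg.lstsq(X, y_src, rcond=None)[0])
Phi_hat = np.linalg.svd(np.stack(W), full_matrices=False)[2][:k]

# Phase (b): fine-tune only a k-dimensional head on the target task,
# reusing the learned representation.
w_tgt = rng.standard_normal(k)
Xt = rng.standard_normal((40, n))
yt = Xt @ (Phi.T @ w_tgt) + 0.01 * rng.standard_normal(40)
head_hat = np.linalg.lstsq(Xt @ Phi_hat.T, yt, rcond=None)[0]
```

The payoff mirrored here is the one the paper quantifies: the target phase fits only $k$ parameters instead of $n$, so far fewer target samples are needed.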

no code implementations • 5 Oct 2022 • David Brandfonbrener, Stephen Tu, Avi Singh, Stefan Welker, Chad Boodoo, Nikolai Matni, Jake Varley

We find that by adjusting the data collection process we improve the quality of both the learned value functions and policies over a variety of baseline methods for data collection.

no code implementations • 22 Sep 2022 • Xuesu Xiao, Tingnan Zhang, Krzysztof Choromanski, Edward Lee, Anthony Francis, Jake Varley, Stephen Tu, Sumeet Singh, Peng Xu, Fei Xia, Sven Mikael Persson, Dmitry Kalashnikov, Leila Takayama, Roy Frostig, Jie Tan, Carolina Parada, Vikas Sindhwani

Despite decades of research, existing navigation systems still face real-world challenges when deployed in the wild, e.g., in cluttered home environments or in human-occupied public spaces.

1 code implementation • 16 Jun 2022 • Ingvar Ziemann, Stephen Tu

We study square loss in a realizable time-series framework with martingale difference noise.

1 code implementation • 30 May 2022 • Daniel Pfrommer, Thomas T. C. K. Zhang, Stephen Tu, Nikolai Matni

We propose Taylor Series Imitation Learning (TaSIL), a simple augmentation to standard behavior cloning losses in the context of continuous control.
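A minimal sketch of the idea: penalize not only action mismatch (standard behavior cloning) but also the mismatch of first-order derivatives of learner and expert policies. The finite-difference Jacobian, the equal weighting of the two terms, and the example policies are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def jacobian_fd(policy, x, eps=1e-5):
    """Finite-difference Jacobian of policy(x) with respect to the state x."""
    u0 = np.atleast_1d(policy(x))
    J = np.zeros((u0.size, x.size))
    for i in range(x.size):
        dx = np.zeros(x.size)
        dx[i] = eps
        J[:, i] = (np.atleast_1d(policy(x + dx)) - u0) / eps
    return J

def taylor_imitation_loss(learner, expert, states):
    """Zeroth-order (behavior cloning) plus first-order (Jacobian-matching)
    imitation loss, averaged over sampled states."""
    total = 0.0
    for x in states:
        total += np.sum((np.atleast_1d(learner(x)) - np.atleast_1d(expert(x))) ** 2)
        total += np.sum((jacobian_fd(learner, x) - jacobian_fd(expert, x)) ** 2)
    return total / len(states)
```

In practice the first-order term would be computed with automatic differentiation rather than finite differences; the structure of the augmented loss is the point here.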

no code implementations • 31 Mar 2022 • Stephen Tu, Roy Frostig, Mahdi Soltanolkotabi

Specifically, we establish that the worst-case error rate of this problem is $\Theta(n / m T)$ whenever $m \gtrsim n$.

1 code implementation • 1 Mar 2022 • Charline Le Lan, Stephen Tu, Adam Oberman, Rishabh Agarwal, Marc G. Bellemare

We complement our theoretical results with an empirical survey of classic representation learning methods from the literature and results on the Arcade Learning Environment, and find that the generalization behaviour of learned representations is well-explained by their effective dimension.

no code implementations • 20 Dec 2021 • Thomas T. C. K. Zhang, Stephen Tu, Nicholas M. Boffi, Jean-Jacques E. Slotine, Nikolai Matni

Motivated by bridging the simulation to reality gap in the context of safety-critical systems, we consider learning adversarially robust stability certificates for unknown nonlinear dynamical systems.

1 code implementation • 18 Nov 2021 • Lars Lindemann, Alexander Robey, Lejun Jiang, Stephen Tu, Nikolai Matni

We then present an optimization problem to learn ROCBFs from expert demonstrations that exhibit safe system behavior, e.g., data collected from a human operator.

no code implementations • 7 Jun 2021 • Nicholas M. Boffi, Stephen Tu, Jean-Jacques E. Slotine

A key assumption in the theory of nonlinear adaptive control is that the uncertainty of the system can be expressed in the linear span of a set of known basis functions.

no code implementations • 18 Feb 2021 • Stephen Tu, Alexander Robey, Tingnan Zhang, Nikolai Matni

We study the following question in the context of imitation learning for continuous control: how are the underlying stability properties of an expert policy reflected in the sample-complexity of an imitation learning task?

1 code implementation • 16 Jan 2021 • Alexander Robey, Lars Lindemann, Stephen Tu, Nikolai Matni

We identify sufficient conditions on the data such that feasibility of the optimization problem ensures correctness of the learned robust hybrid control barrier functions.

no code implementations • 26 Nov 2020 • Nicholas M. Boffi, Stephen Tu, Jean-Jacques E. Slotine

We study the problem of adaptively controlling a known discrete-time nonlinear system subject to unmodeled disturbances.

no code implementations • 24 Nov 2020 • Amir Ali Ahmadi, Abraar Chaudhry, Vikas Sindhwani, Stephen Tu

For our first two results, we consider the setting of safely learning linear dynamics.

no code implementations • 8 Nov 2020 • Lars Lindemann, Haimin Hu, Alexander Robey, Hanwen Zhang, Dimos V. Dimarogonas, Stephen Tu, Nikolai Matni

Motivated by the lack of systematic tools to obtain safe control laws for hybrid systems, we propose an optimization-based framework for learning certifiably safe control laws from data.

no code implementations • 13 Aug 2020 • Nicholas M. Boffi, Stephen Tu, Nikolai Matni, Jean-Jacques E. Slotine, Vikas Sindhwani

Many existing tools in nonlinear control theory for establishing stability or safety of a dynamical system can be distilled to the construction of a certificate function that guarantees a desired property.
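As a toy illustration of what such a certificate asserts, consider numerically checking a Lyapunov decrease condition on sampled states. The dynamics, the candidate certificate $V(x) = \|x\|^2$, and the decrease rate below are all invented for the sketch; sampling can only falsify a candidate, whereas the tools discussed here aim at certificates with actual guarantees.

```python
import numpy as np

def f(x):
    """Illustrative contracting dynamics (invented for this sketch)."""
    return -(1.0 + 0.1 * np.sin(x)) * x

def V(x):
    """Candidate Lyapunov certificate V(x) = ||x||^2."""
    return float(x @ x)

def Vdot(x):
    """Derivative of V along trajectories: 2 x . f(x)."""
    return float(2.0 * x @ f(x))

# Spot-check the decrease condition Vdot(x) <= -c V(x) on sampled states.
rng = np.random.default_rng(0)
samples = rng.uniform(-2.0, 2.0, size=(1000, 2))
c = 0.5
certificate_holds = all(Vdot(x) <= -c * V(x) for x in samples)
```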

no code implementations • 15 Jun 2020 • Nicholas M. Boffi, Stephen Tu, Jean-Jacques E. Slotine

Recent numerical experiments have demonstrated that the choice of optimization geometry used during training can impact generalization performance when learning expressive nonlinear model classes such as deep neural networks.

1 code implementation • 7 Apr 2020 • Alexander Robey, Haimin Hu, Lars Lindemann, Hanwen Zhang, Dimos V. Dimarogonas, Stephen Tu, Nikolai Matni

Furthermore, if the CBF parameterization is convex, then under mild assumptions, so is our learning process.

no code implementations • ICLR 2020 • Xingyou Song, Yiding Jiang, Stephen Tu, Yilun Du, Behnam Neyshabur

A major component of overfitting in model-free reinforcement learning (RL) involves the case where the agent may mistakenly correlate reward with certain spurious features from the observations generated by the Markov Decision Process (MDP).

no code implementations • 27 Jun 2019 • Nikolai Matni, Stephen Tu

We provide a brief tutorial on the use of concentration inequalities as they apply to system identification of state-space parameters of linear time invariant systems, with a focus on the fully observed setting.

no code implementations • 27 Jun 2019 • Nikolai Matni, Alexandre Proutiere, Anders Rantzer, Stephen Tu

Machine and reinforcement learning (RL) are increasingly being applied to plan and control the behavior of autonomous systems interacting with the physical world.

no code implementations • NeurIPS 2019 • Karl Krauth, Stephen Tu, Benjamin Recht

We study the sample complexity of approximate policy iteration (PI) for the Linear Quadratic Regulator (LQR), building on a recent line of work using LQR as a testbed to understand the limits of reinforcement learning (RL) algorithms on continuous control tasks.

no code implementations • NeurIPS 2019 • Horia Mania, Stephen Tu, Benjamin Recht

We show that for both the fully and partially observed settings, the sub-optimality gap between the cost incurred by playing the certainty equivalent controller on the true system and the cost incurred by using the optimal LQ controller enjoys a fast statistical rate, scaling as the square of the parameter error.
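The certainty equivalent pipeline for LQR can be sketched as follows. The plant, the injected parameter error, and the fixed-point Riccati solver are illustrative (a sketch, not a robust solver); the quadratic scaling of the suboptimality gap is established by the paper's analysis, not by this code.

```python
import numpy as np

def dare_fixed_point(A, B, Q, R, iters=500):
    """Solve the discrete-time Riccati equation by value iteration
    (adequate for small, well-behaved problems)."""
    P = Q.copy()
    for _ in range(iters):
        G = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ G)
    return P

def lqr_gain(A, B, Q, R):
    P = dare_fixed_point(A, B, Q, R)
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

# Illustrative double-integrator plant (not from the paper).
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.0],
              [0.1]])
Q, R = np.eye(2), np.eye(1)

# Certainty equivalence: compute the gain for slightly wrong parameter
# estimates, then play it on the true system.
A_hat, B_hat = A + 0.01, B + 0.01
K_ce = lqr_gain(A_hat, B_hat, Q, R)
K_opt = lqr_gain(A, B, Q, R)
```

Per the result above, the excess cost of playing `K_ce` on the true plant scales as the square of the parameter error, so halving the estimation error should roughly quarter the suboptimality gap.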

no code implementations • 9 Dec 2018 • Stephen Tu, Benjamin Recht

The effectiveness of model-based versus model-free methods is a long-standing question in reinforcement learning (RL).

no code implementations • 28 Sep 2018 • Stephen Tu, Ross Boczar, Benjamin Recht

The problem of estimating the $\mathcal{H}_\infty$-norm of an LTI system from noisy input/output measurements has attracted recent attention as an alternative to parameter identification for bounding unmodeled dynamics in robust control.

2 code implementations • 26 Sep 2018 • Sarah Dean, Stephen Tu, Nikolai Matni, Benjamin Recht

We study the constrained linear quadratic regulator with unknown dynamics, addressing the tension between safety and exploration in data-driven control techniques.

no code implementations • NeurIPS 2018 • Sarah Dean, Horia Mania, Nikolai Matni, Benjamin Recht, Stephen Tu

We consider adaptive control of the Linear Quadratic Regulator (LQR), where an unknown linear system is controlled subject to quadratic costs.

no code implementations • 13 Apr 2018 • Vikas Sindhwani, Stephen Tu, Mohi Khansari

We propose a new non-parametric framework for learning incrementally stable dynamical systems $x' = f(x)$ from a set of sampled trajectories.

no code implementations • 22 Feb 2018 • Max Simchowitz, Horia Mania, Stephen Tu, Michael I. Jordan, Benjamin Recht

We prove that the ordinary least-squares (OLS) estimator attains nearly minimax optimal performance for the identification of linear dynamical systems from a single observed trajectory.
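A minimal numerical illustration of the estimator in question (the system matrix, noise level, and trajectory length are invented for the example): regress successive states on current states by ordinary least squares over a single trajectory.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate one trajectory of x_{t+1} = A x_t + w_t.
A_true = np.array([[0.9, 0.2],
                   [0.0, 0.8]])
T = 2000
x = np.zeros((T + 1, 2))
for t in range(T):
    x[t + 1] = A_true @ x[t] + 0.1 * rng.standard_normal(2)

# OLS: A_hat = argmin_A sum_t ||x_{t+1} - A x_t||^2.
A_hat = np.linalg.lstsq(x[:-1], x[1:], rcond=None)[0].T
```

Rerunning with a longer trajectory shrinks the error roughly like $1/\sqrt{T}$, consistent with the rates studied in this line of work.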

no code implementations • ICML 2018 • Stephen Tu, Benjamin Recht

Reinforcement learning (RL) has been successfully used to solve many continuous control tasks.

no code implementations • 4 Oct 2017 • Sarah Dean, Horia Mania, Nikolai Matni, Benjamin Recht, Stephen Tu

This paper addresses the optimal control problem known as the Linear Quadratic Regulator in the case when the dynamics are unknown.

no code implementations • 15 Jul 2017 • Stephen Tu, Ross Boczar, Andrew Packard, Benjamin Recht

We derive bounds on the number of noisy input/output samples from a stable linear time-invariant system that are sufficient to guarantee that the corresponding finite impulse response approximation is close to the true system in the $\mathcal{H}_\infty$-norm.
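The estimation pipeline those bounds concern can be sketched as follows. The true filter, horizon, and noise level are invented, and the $\mathcal{H}_\infty$-norm of the error system is approximated by a maximum over a dense frequency grid rather than computed exactly.

```python
import numpy as np

rng = np.random.default_rng(0)

# True system: a short FIR filter (invented for the example).
g_true = np.array([1.0, 0.5, 0.25, 0.125])
T, L = 400, 8  # number of I/O samples, FIR approximation length

u = rng.standard_normal(T)
y = np.convolve(u, g_true)[:T] + 0.01 * rng.standard_normal(T)

# Least squares over a length-L FIR model; column k of U is u delayed by k.
U = np.column_stack([np.concatenate([np.zeros(k), u[:T - k]]) for k in range(L)])
g_hat = np.linalg.lstsq(U, y, rcond=None)[0]

# H-infinity error of the approximation, evaluated on a dense FFT grid.
hinf_err = np.abs(np.fft.fft(g_hat, 4096) - np.fft.fft(g_true, 4096)).max()
```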

1 code implementation • NeurIPS 2016 • Xinghao Pan, Maximilian Lam, Stephen Tu, Dimitris Papailiopoulos, Ce Zhang, Michael I. Jordan, Kannan Ramchandran, Chris Re, Benjamin Recht

We present CYCLADES, a general framework for parallelizing stochastic optimization algorithms in a shared memory setting.

no code implementations • 17 Feb 2016 • Stephen Tu, Rebecca Roelofs, Shivaram Venkataraman, Benjamin Recht

We demonstrate that distributed block coordinate descent can quickly solve kernel regression and classification problems with millions of data points.
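A serial sketch of block coordinate descent for kernel ridge regression (the distributed aspect is omitted; the block size, regularization, and exact per-block solve are illustrative choices):

```python
import numpy as np

def kernel_ridge_bcd(K, y, lam, block_size=50, epochs=100):
    """Block coordinate descent on the kernel ridge objective
    0.5 * a^T (K + lam I) a - a^T y, minimizing exactly over one
    block of dual coefficients at a time."""
    n = K.shape[0]
    H = K + lam * np.eye(n)
    alpha = np.zeros(n)
    for _ in range(epochs):
        for start in range(0, n, block_size):
            b = slice(start, min(start + block_size, n))
            # First-order condition for block b:
            #   H[b, b] alpha_b = y_b - H[b, rest] alpha_rest
            rhs = y[b] - H[b] @ alpha + H[b, b] @ alpha[b]
            alpha[b] = np.linalg.solve(H[b, b], rhs)
    return alpha
```

Blocks interact only through the residual term on the right-hand side, which is what makes a distributed implementation at the scale of millions of points plausible.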

Papers With Code is a free resource with all data licensed under CC-BY-SA.