no code implementations • 3 Apr 2024 • Xiangyuan Zhang, Weichao Mao, Haoran Qiu, Tamer Başar
Closed-loop control of nonlinear dynamical systems with partial-state observability demands expert knowledge of a diverse, less standardized set of theoretical tools.
no code implementations • 1 Mar 2024 • Xiangyuan Zhang, Saviz Mowlavi, Mouhacine Benosman, Tamer Başar
The PO step fine-tunes the model-based controller to compensate for the modeling error from dimensionality reduction.
1 code implementation • 30 Nov 2023 • Xiangyuan Zhang, Weichao Mao, Saviz Mowlavi, Mouhacine Benosman, Tamer Başar
This project serves the learning for dynamics & control (L4DC) community, aiming to explore key questions: the convergence of RL algorithms in learning control policies; the stability and robustness issues of learning-based controllers; and the scalability of RL algorithms to high- and potentially infinite-dimensional systems.
1 code implementation • 9 Sep 2023 • Xiangyuan Zhang, Saviz Mowlavi, Mouhacine Benosman, Tamer Başar
We introduce the receding-horizon policy gradient (RHPG) algorithm, the first PG algorithm with provable global convergence in learning the optimal linear estimator designs, i.e., the Kalman filter (KF).
no code implementations • 25 Feb 2023 • Xiangyuan Zhang, Tamer Başar
In this paper, we revisit the discrete-time linear quadratic regulator (LQR) problem from the perspective of receding-horizon policy gradient (RHPG), a newly developed model-free learning framework for control applications.
no code implementations • 30 Jan 2023 • Xiangyuan Zhang, Bin Hu, Tamer Başar
We establish the first end-to-end sample complexity result for model-free policy gradient (PG) methods in discrete-time infinite-horizon Kalman filtering.
no code implementations • NeurIPS 2021 • Kaiqing Zhang, Xiangyuan Zhang, Bin Hu, Tamer Başar
Direct policy search serves as one of the workhorses in modern reinforcement learning (RL), and its applications in continuous control tasks have recently attracted increasing attention.
no code implementations • NeurIPS 2019 • Xiangyuan Zhang, Kaiqing Zhang, Erik Miehling, Tamer Başar
Through interacting with the more informed player, the less informed player attempts to both infer, and act according to, the true objective function.