no code implementations • 4 Apr 2024 • Darioush Kevian, Usman Syed, Xingang Guo, Aaron Havens, Geir Dullerud, Peter Seiler, Lianhui Qin, Bin Hu
In this paper, we explore the capabilities of state-of-the-art large language models (LLMs) such as GPT-4, Claude 3 Opus, and Gemini 1.0 Ultra in solving undergraduate-level control problems.
no code implementations • 18 Feb 2024 • Darioush Keivan, Xingang Guo, Peter Seiler, Geir Dullerud, Bin Hu
Built upon such a policy optimization perspective, our paper extends these subgradient-based search methods to a model-free setting.
no code implementations • 3 Jan 2022 • Aaron Havens, Darioush Keivan, Peter Seiler, Geir Dullerud, Bin Hu
We show that the ROA analysis can be approximated as a constrained maximization problem whose goal is to find the worst-case initial condition which shifts the terminal state the most.
no code implementations • 30 Nov 2021 • Darioush Keivan, Aaron Havens, Peter Seiler, Geir Dullerud, Bin Hu
We build a connection between robust adversarial RL and $\mu$ synthesis, and develop a model-free version of the well-known $DK$-iteration for solving state-feedback $\mu$ synthesis with static $D$-scaling.
no code implementations • 24 Nov 2020 • Joao Paulo Jansch-Porto, Bin Hu, Geir Dullerud
In this paper, we investigate the global convergence of gradient-based policy optimization methods for quadratic optimal control of discrete-time Markovian jump linear systems (MJLS).
1 code implementation • L4DC 2020 • Joao Paulo Jansch-Porto, Bin Hu, Geir Dullerud
We implement the (data-driven) natural policy gradient method on different MJLS examples.
no code implementations • 10 Feb 2020 • Joao Paulo Jansch-Porto, Bin Hu, Geir Dullerud
Recently, policy optimization for control purposes has received renewed attention due to the increasing interest in reinforcement learning.
no code implementations • 4 Nov 2019 • Negin Musavi, Dawei Sun, Sayan Mitra, Geir Dullerud, Sanjay Shakkottai
As a consequence, we obtain theoretical regret bounds on the sample efficiency of our solution that depend on key problem parameters like smoothness, near-optimality dimension, and batch size.