Optimism-Based Adaptive Regulation of Linear-Quadratic Systems

20 Nov 2017  ·  Mohamad Kazem Shirani Faradonbeh, Ambuj Tewari, George Michailidis

The main challenge for adaptive regulation of linear-quadratic systems is the trade-off between identification and control. An adaptive policy needs to address both the estimation of the unknown dynamics parameters (exploration) and the regulation of the underlying system (exploitation). To this end, optimism-based methods, which bias the identification in favor of optimistic approximations of the true parameter, are employed in the literature. A number of asymptotic results have been established, but their finite-time counterparts are few and come with important restrictions. This study establishes results for the worst-case regret of optimism-based adaptive policies. The presented high-probability upper bounds are optimal up to logarithmic factors. The non-asymptotic analysis of this work requires only very mild assumptions: (i) stabilizability of the system's dynamics, and (ii) a limit on the degree of heaviness of the noise distribution. To establish such bounds, certain novel techniques are developed to comprehensively address the probabilistic behavior of dependent random matrices with heavy-tailed distributions.
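
As a rough illustration of the optimism principle described above, the following minimal sketch picks, from a set of plausible dynamics parameters, the one with the smallest optimal average cost. It is not the authors' algorithm: the finite candidate list stands in for a confidence set, and the function names `riccati_cost_matrix` and `optimistic_choice`, the cost matrices `Q` and `R`, and the unit noise covariance are all assumptions made for this example.

```python
import numpy as np

def riccati_cost_matrix(A, B, Q, R, iters=2000, tol=1e-10):
    """Iterate the discrete-time Riccati recursion until it reaches a fixed point P."""
    P = Q.copy()
    for _ in range(iters):
        G = R + B.T @ P @ B
        K = np.linalg.solve(G, B.T @ P @ A)   # feedback gain, u = -K x
        P_next = Q + A.T @ P @ (A - B @ K)    # equivalent to the standard Riccati update
        if np.max(np.abs(P_next - P)) < tol:
            return P_next
        P = P_next
    return P

def optimistic_choice(candidates, Q, R):
    """Return the index of the candidate (A, B) with the smallest optimal average cost.

    Assumes unit noise covariance, so the optimal average cost equals trace(P).
    """
    costs = [float(np.trace(riccati_cost_matrix(A, B, Q, R))) for A, B in candidates]
    return int(np.argmin(costs)), costs

# Hypothetical scalar example: two parameter guesses that are both
# consistent with the data so far (an assumption for illustration).
Q = np.array([[1.0]])
R = np.array([[1.0]])
candidates = [
    (np.array([[1.2]]), np.array([[1.0]])),   # unstable open loop, costlier to regulate
    (np.array([[0.5]]), np.array([[1.0]])),   # stable open loop, cheaper to regulate
]
best, costs = optimistic_choice(candidates, Q, R)
```

In this sketch the optimistic policy would act as if the second candidate were the true system, since it promises the lower cost; an adaptive policy would then apply the corresponding feedback gain and refine the candidate set as new observations arrive.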
