no code implementations • 9 Dec 2023 • Jaeuk Shin, Giho Kim, Howon Lee, Joonho Han, Insoon Yang
Designing a meta-reinforcement learning (meta-RL) algorithm that uses data efficiently remains a central challenge for successful real-world applications.
1 code implementation • 16 May 2023 • Astghik Hakobyan, Insoon Yang
To achieve this, we use the Kantorovich duality principle to decompose the value function in a novel way and derive closed-form expressions of the distributionally robust control and worst-case distribution policies to be used in each iteration of our DDP algorithm.
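For context, the standard Kantorovich (strong) duality underlying such decompositions rewrites a worst-case expectation over a Wasserstein ball as a one-dimensional convex problem; the notation below is illustrative and not the paper's exact decomposition:

```latex
% Worst-case expectation over a Wasserstein ball of radius \rho around the
% nominal distribution \hat{P}; d is the ground metric, p the order.
\sup_{Q \,:\, W_p(Q, \hat{P}) \le \rho} \mathbb{E}_{x \sim Q}\big[f(x)\big]
  \;=\; \inf_{\lambda \ge 0} \left\{ \lambda \rho^p
  + \mathbb{E}_{x \sim \hat{P}}\Big[\, \sup_{z} \big( f(z) - \lambda\, d(z, x)^p \big) \Big] \right\}
```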
no code implementations • 9 Dec 2022 • Astghik Hakobyan, Insoon Yang
Distributionally robust control (DRC) aims to effectively manage distributional ambiguity in stochastic systems.
no code implementations • 28 Nov 2022 • MinGyu Park, Jaeuk Shin, Insoon Yang
Inspired by the quasi-Newton interpretation of Anderson acceleration (AA), we propose a maximum entropy variant of QMDP, which we call soft QMDP, to fully benefit from AA.
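As a rough illustration of the soft-max idea (a sketch, not the paper's implementation), a maximum-entropy QMDP backup replaces the hard max over actions with a temperature-scaled log-sum-exp, smoothing the fixed-point map that AA is applied to:

```python
import numpy as np
from scipy.special import logsumexp

def soft_qmdp(P, R, gamma=0.95, tau=0.1, n_iters=1000, tol=1e-8):
    """Maximum-entropy QMDP backup on the underlying MDP.

    P: (S, A, S) transition tensor; R: (S, A) reward matrix.
    The hard max over actions is replaced by tau * logsumexp(Q / tau),
    which recovers the max as tau -> 0. Illustrative sketch only.
    """
    Q = np.zeros_like(R)
    for _ in range(n_iters):
        V = tau * logsumexp(Q / tau, axis=1)           # smooth state value
        Q_new = R + gamma * np.einsum("sap,p->sa", P, V)
        if np.max(np.abs(Q_new - Q)) < tol:
            return Q_new
        Q = Q_new
    return Q

def qmdp_action(Q, belief):
    # QMDP scores actions by averaging state-action values under the belief.
    return int(np.argmax(belief @ Q))
```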
no code implementations • 31 Mar 2022 • Astghik Hakobyan, Insoon Yang
The key idea is to reformulate the WDRC problem as a novel minimax control problem with an approximate Wasserstein penalty.
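Schematically, and in illustrative notation rather than the paper's exact formulation, replacing the hard Wasserstein-ball constraint with a penalty on the transport distance yields a minimax problem:

```latex
% Hard-constrained WDRC (left) relaxed into a penalized minimax problem (right);
% \hat{P} is the nominal distribution and \lambda > 0 the penalty parameter.
\min_{\pi} \sup_{Q \,:\, W(Q, \hat{P}) \le \rho} \mathbb{E}_Q\!\Big[\textstyle\sum_t c(x_t, u_t)\Big]
\quad \rightsquigarrow \quad
\min_{\pi} \sup_{Q} \; \mathbb{E}_Q\!\Big[\textstyle\sum_t c(x_t, u_t)\Big] - \lambda\, W(Q, \hat{P})
```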
no code implementations • 29 Mar 2022 • Youngchae Cho, Insoon Yang
The proposed model is formulated as a WDRO problem relying on an affine policy, which nests an infinite-dimensional worst-case expectation problem and satisfies the non-anticipativity constraint.
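A generic affine (linear decision) rule makes the non-anticipativity requirement concrete: the stage-$t$ decision may depend only on uncertainty already revealed. In illustrative notation:

```latex
% u_t depends affinely on the uncertain data \xi_1, ..., \xi_t observed so far;
% forbidding dependence on future \xi_s (s > t) is the non-anticipativity constraint.
u_t(\xi) \;=\; v_t + \sum_{s=1}^{t} W_{t,s}\, \xi_s,
\qquad W_{t,s} = 0 \quad \text{for all } s > t
```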
no code implementations • 5 Nov 2021 • Yeoneung Kim, Insoon Yang, Kwang-Sung Jun
For linear bandits, we achieve $\tilde O(\min\{d\sqrt{K}, d^{1.5}\sqrt{\sum_{k=1}^K \sigma_k^2}\} + d^2)$, where $d$ is the dimension of the features, $K$ is the time horizon, and $\sigma_k^2$ is the noise variance at time step $k$; $\tilde O$ ignores polylogarithmic dependence. This is an improvement by a factor of $d^3$.
no code implementations • 27 Oct 2021 • Dohyun Kwon, Yeoneung Kim, Guido Montúfar, Insoon Yang
We propose a stable method to train Wasserstein generative adversarial networks.
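For reference, the standard WGAN objective that such stabilization methods target, obtained from the Kantorovich-Rubinstein duality with a 1-Lipschitz critic $f$:

```latex
% Minimize the Wasserstein-1 distance between the data distribution P_r and
% the generator distribution; f ranges over 1-Lipschitz critics.
\min_{G} \max_{\|f\|_{L} \le 1} \;
  \mathbb{E}_{x \sim P_r}\big[f(x)\big] - \mathbb{E}_{z \sim p(z)}\big[f(G(z))\big]
```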
no code implementations • 29 Mar 2021 • Melike Ermis, MinGyu Park, Insoon Yang
This paper proposes an accelerated method for approximately solving partially observable Markov decision process (POMDP) problems offline.
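Assuming the acceleration is of the Anderson type referenced in the related soft QMDP work above (an assumption; this snippet does not name the technique), here is a minimal sketch of Anderson acceleration for a generic fixed-point iteration $x = g(x)$, such as a Bellman-style backup:

```python
import numpy as np

def anderson_accelerate(g, x0, m=5, n_iters=200, tol=1e-10):
    """Type-II Anderson acceleration for a fixed-point map g on 1-D arrays.

    Keeps the last m iterates, finds affine weights that minimize the norm of
    the combined residual g(x) - x, and mixes the g-images accordingly.
    Illustrative sketch, not the paper's algorithm.
    """
    X, G = [np.asarray(x0, dtype=float)], [np.asarray(g(x0), dtype=float)]
    x = G[-1]
    for _ in range(n_iters):
        X.append(x)
        G.append(np.asarray(g(x), dtype=float))
        X, G = X[-(m + 1):], G[-(m + 1):]
        F = np.stack([gi - xi for gi, xi in zip(G, X)], axis=1)  # residuals
        if np.linalg.norm(F[:, -1]) < tol:
            break
        # Solve min_alpha ||F alpha||^2  s.t.  sum(alpha) = 1 (small KKT system).
        k = F.shape[1]
        A = np.block([[F.T @ F, np.ones((k, 1))],
                      [np.ones((1, k)), np.zeros((1, 1))]])
        b = np.zeros(k + 1)
        b[-1] = 1.0
        alpha = np.linalg.lstsq(A, b, rcond=None)[0][:k]
        x = np.stack(G, axis=1) @ alpha  # next iterate: weighted mix of g-values
    return x
```

For a contractive map, this typically reaches a given tolerance in far fewer evaluations of `g` than plain fixed-point iteration.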
1 code implementation • 28 Jan 2021 • Margaret P. Chapman, Riccardo Bonalli, Kevin M. Smith, Insoon Yang, Marco Pavone, Claire J. Tomlin
In addition, we propose a second definition for risk-sensitive safe sets and provide a tractable method for their estimation without using a parameter-dependent upper bound.
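One common way to define a risk-sensitive safe set uses the conditional value-at-risk (CVaR) of the worst constraint violation along a trajectory; the form below is illustrative and not necessarily either of the paper's definitions:

```latex
% States from which the CVaR at level \alpha of the maximum constraint
% function g along the closed-loop trajectory stays below a threshold \delta.
S_\alpha \;=\; \Big\{ x_0 \;:\; \mathrm{CVaR}_\alpha\Big[ \max_{t \in \{0, \dots, T\}} g(x_t) \Big] \le \delta \Big\}
```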
1 code implementation • 27 Oct 2020 • Jeongho Kim, Jaeuk Shin, Insoon Yang
In this paper, we propose Q-learning algorithms for continuous-time deterministic optimal control problems with Lipschitz continuous controls.
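One concrete detail such continuous-time formulations carry over to sampled data: with discount rate $\rho$ and sampling interval $\Delta t$, the discrete-time discount factor becomes $e^{-\rho \Delta t}$ and stage rewards are scaled by $\Delta t$. A minimal sketch of the resulting one-step target (hypothetical names; the paper derives its updates from a continuous-time optimality characterization, which may differ):

```python
import numpy as np

def td_target(r, q_next, dt=0.05, rho=0.1):
    """One-step TD target for a time-discretized continuous-time problem.

    r: instantaneous reward rate at (x, u); q_next: Q-values at the next
    sampled state over candidate controls. With discount rate rho, the
    effective discrete discount factor over an interval dt is exp(-rho * dt).
    Illustrative sketch only.
    """
    gamma_dt = np.exp(-rho * dt)
    return r * dt + gamma_dt * np.max(q_next)
```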
1 code implementation • 24 Feb 2020 • Subin Huh, Insoon Yang
Our approach constructs a Lyapunov function with respect to a safe policy to constrain each policy improvement stage.
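A minimal sketch of the Lyapunov-constraint idea (hypothetical names and interface, not the paper's algorithm): a candidate action is adopted only where it keeps the Lyapunov function from increasing beyond a tolerance, and the known-safe policy is retained elsewhere:

```python
def lyapunov_safe_improvement(pi_safe, pi_candidate, L, expected_next_L,
                              states, eps=0.0):
    """Guarded policy improvement step.

    pi_safe, pi_candidate: dicts mapping state -> action.
    L: dict mapping state -> Lyapunov value under the current safe policy.
    expected_next_L(s, a): expected Lyapunov value after taking a in s.
    Accept the candidate action only if it satisfies the decrease condition
    E[L(s')] - L(s) <= eps; otherwise keep the safe action. Sketch only.
    """
    new_pi = {}
    for s in states:
        a = pi_candidate[s]
        if expected_next_L(s, a) - L[s] <= eps:
            new_pi[s] = a              # candidate action preserves safety
        else:
            new_pi[s] = pi_safe[s]     # fall back to the known-safe action
    return new_pi
```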
no code implementations • L4DC 2020 • Jeongho Kim, Insoon Yang
The performance of the proposed Q-learning algorithm is demonstrated using 1-, 10-, and 20-dimensional dynamical systems.