1 code implementation • 30 Sep 2022 • Yoshihiro Okawa, Tomotake Sasaki, Hitoshi Yanami, Toru Namerikawa
We define safety during learning as satisfaction of constraint conditions explicitly defined in terms of the state, and we propose a safe exploration method that uses partial prior knowledge of the controlled object and the disturbance.
1 code implementation • 5 Mar 2021 • Mei Minami, Yuka Masumoto, Yoshihiro Okawa, Tomotake Sasaki, Yutaka Hori
To overcome this limitation, we propose a model-free two-step design approach that improves the transient learning performance of RL in an optimal regulator redesign problem for unknown nonlinear systems.
no code implementations • 5 Mar 2021 • Yoshihiro Okawa, Tomotake Sasaki, Hidenao Iwane
In reinforcement learning (RL) algorithms, exploratory control inputs are used during learning to acquire knowledge for decision making and control, while the true dynamics of the controlled object are unknown.