no code implementations • 3 Aug 2024 • Baiyu Peng, Aude Billard
Within our framework, we treat all demonstration data as positive (feasible) examples and learn a control policy that generates potentially infeasible trajectories, which serve as unlabeled data.
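A minimal sketch of this positive-unlabeled setup, assuming a non-negative PU risk (Kiryo et al., 2017) with a sigmoid surrogate loss and a known class prior; the network, prior value `pi_p`, and input dimension are illustrative assumptions, not the paper's exact algorithm:

```python
import torch
import torch.nn as nn

# Hypothetical feasibility classifier; input dimension 4 is illustrative.
feasibility_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 1))

def nnpu_risk(pos, unl, pi_p=0.7):
    """Non-negative PU risk estimate.

    pos: states from demonstrations (treated as feasible/positive).
    unl: states from policy rollouts (unlabeled, possibly infeasible).
    pi_p: assumed prior probability that an unlabeled state is feasible.
    """
    loss = lambda z, y: torch.sigmoid(-y * z).mean()  # sigmoid surrogate loss
    g_p, g_u = feasibility_net(pos), feasibility_net(unl)
    r_p_pos = loss(g_p, +1)   # risk of positives labeled positive
    r_p_neg = loss(g_p, -1)   # risk of positives labeled negative
    r_u_neg = loss(g_u, -1)   # risk of unlabeled labeled negative
    # Estimated negative-class risk, clamped so it cannot go negative.
    r_n = torch.clamp(r_u_neg - pi_p * r_p_neg, min=0.0)
    return pi_p * r_p_pos + r_n
```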
no code implementations • 23 Jul 2024 • Baiyu Peng, Aude Billard
Planning for a wide range of real-world tasks necessitates knowing and specifying all the constraints.
no code implementations • 26 Aug 2021 • Baiyu Peng, Jingliang Duan, Jianyu Chen, Shengbo Eben Li, Genjin Xie, Congsheng Zhang, Yang Guan, Yao Mu, Enxin Sun
Based on this, the penalty method is formulated as a proportional controller, and the Lagrangian method is formulated as an integral controller.
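A minimal sketch of this correspondence, assuming a scalar violation signal e_k = J_c(pi_k) - d (constraint cost of the current policy minus its threshold); the gains and names are illustrative, not the paper's notation:

```python
def penalty_weight(e_k, kp=10.0):
    # Penalty method ~ proportional control: the weight reacts only to the
    # current violation, so a residual steady-state violation can remain.
    return max(0.0, kp * e_k)

def lagrangian_update(lmbda, e_k, ki=0.1):
    # Lagrangian dual ascent ~ integral control: the multiplier accumulates
    # past violations, driving the steady-state violation toward zero.
    return max(0.0, lmbda + ki * e_k)
```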
no code implementations • 17 Feb 2021 • Baiyu Peng, Yao Mu, Jingliang Duan, Yang Guan, Shengbo Eben Li, Jianyu Chen
Taking a control perspective, we first interpret the penalty method and the Lagrangian method as proportional feedback and integral feedback control, respectively.
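Under this interpretation, the proportional and integral terms sketched above combine naturally into a PI-style multiplier update; the sketch below is a generic combination under that reading, not necessarily the paper's exact rule:

```python
def pi_multiplier(integral, e_k, kp=10.0, ki=0.1):
    # I term: clamped running sum of constraint violations.
    integral = max(0.0, integral + ki * e_k)
    # P term adds a fast reaction to the current violation e_k.
    lmbda = max(0.0, kp * e_k + integral)
    return lmbda, integral
```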
no code implementations • 19 Dec 2020 • Baiyu Peng, Yao Mu, Yang Guan, Shengbo Eben Li, Yuming Yin, Jianyu Chen
Safety is essential for reinforcement learning (RL) applied in real-world situations.
no code implementations • 28 Feb 2020 • Yao Mu, Shengbo Eben Li, Chang Liu, Qi Sun, Bingbing Nie, Bo Cheng, Baiyu Peng
This paper presents a mixed reinforcement learning (mixed RL) algorithm that simultaneously uses dual representations of the environmental dynamics to search for the optimal policy, improving both learning accuracy and training speed.
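One plausible reading of "dual representations" is an analytical first-principles model fused with a learned data-driven correction; everything below (the linear nominal model, ResidualModel, and dimensions) is an assumption for illustration, not the paper's API:

```python
import torch
import torch.nn as nn

state_dim, act_dim = 4, 2
A = torch.eye(state_dim) * 0.99                      # nominal linear dynamics
B = torch.zeros(state_dim, act_dim)
B[0, 0], B[1, 1] = 0.1, 0.1

def f_nominal(s, a):
    # Analytical (first-principles) representation of the dynamics.
    return s @ A.T + a @ B.T

class ResidualModel(nn.Module):
    """Data-driven correction learned from transitions."""
    def __init__(self, state_dim, act_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim + act_dim, 64),
                                 nn.ReLU(), nn.Linear(64, state_dim))

    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1))

residual = ResidualModel(state_dim, act_dim)

def mixed_dynamics(s, a):
    # Mixed representation: analytical prior plus learned residual, usable
    # for model-based policy evaluation or rollout generation.
    return f_nominal(s, a) + residual(s, a)
```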