no code implementations • 22 Mar 2025 • Ziang Zheng, Guojian Zhan, Bin Shuai, Shengtao Qin, Jiangtao Li, Tao Zhang, Shengbo Eben Li
We validate our approach through extensive simulations and real-world experiments, demonstrating that the pretrained latent-to-latent locomotion policy effectively generalizes to new robot entities and tasks with improved efficiency.
no code implementations • 25 Jan 2025 • Tianqi Zhang, Puzhen Yuan, Guojian Zhan, Ziyu Lin, Yao Lyu, Zhenzhi Qin, Jingliang Duan, Liping Zhang, Shengbo Eben Li
We prove that the resulting optimal policy, obtained by alternating between MFOCP and MGPL, coincides with the solution of the primal constrained RL problem, thereby establishing our equivalence framework.
no code implementations • 26 Nov 2024 • Guojian Zhan, Qiang Ge, Haoyu Gao, Yuming Yin, Bin Zhao, Shengbo Eben Li
After validation, we conduct comprehensive simulations comparing our proposed model with both kinematic models and existing dynamic models discretized via the forward Euler method.
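As background for this comparison, forward-Euler discretization of a continuous-time vehicle model can be sketched as follows. The single-track kinematics and the wheelbase value below are illustrative placeholders, not the paper's proposed model:

```python
import numpy as np

def forward_euler_step(f, x, u, dt):
    """One forward-Euler step of the continuous dynamics x_dot = f(x, u)."""
    return x + dt * f(x, u)

# Illustrative single-track (bicycle) kinematics: state = [px, py, heading],
# input = [speed, steering angle]. Wheelbase is an assumed value.
L_WHEELBASE = 2.7  # meters (placeholder)

def bicycle_kinematics(x, u):
    px, py, th = x
    v, delta = u
    return np.array([v * np.cos(th),
                     v * np.sin(th),
                     v * np.tan(delta) / L_WHEELBASE])

# Integrate 10 steps at dt = 0.01 s with constant speed and steering.
x = np.array([0.0, 0.0, 0.0])
for _ in range(10):
    x = forward_euler_step(bicycle_kinematics, x, np.array([5.0, 0.1]), dt=0.01)
```

The forward Euler scheme is first-order accurate, which is one reason discretized dynamic models of this kind can diverge from the true continuous dynamics at larger step sizes.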
no code implementations • 21 Jul 2024 • YuXuan Jiang, Yujie Yang, Zhiqian Lan, Guojian Zhan, Shengbo Eben Li, Qi Sun, Jian Ma, Tianwen Yu, Changwu Zhang
Our approach, called Random Annealing Jump Start (RAJS), is tailored to real-world goal-oriented problems: it leverages prior feedback controllers as the guide policy to facilitate environment exploration and policy learning in RL.
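The jump-start idea can be sketched as follows: each episode begins under a prior feedback (guide) policy for a randomly drawn number of steps, whose upper bound is annealed toward zero as training progresses. The toy environment, policies, and schedule below are placeholders for illustration, not the paper's RAJS implementation:

```python
import random

class ToyEnv:
    """Minimal stand-in environment: the state is an integer counter."""
    def reset(self):
        self.s = 0
        return self.s
    def step(self, action):
        self.s += action
        return self.s, self.s >= 5  # done when the counter reaches 5

def rollout_with_jump_start(env, guide_policy, learner_policy, horizon, h_max):
    """Run one episode: the guide policy controls the first h steps
    (h drawn uniformly up to h_max, which the trainer anneals toward 0),
    then the learner takes over. Returns the (obs, action) trajectory."""
    h = random.randint(0, h_max)  # random jump-start length
    obs = env.reset()
    traj = []
    for t in range(horizon):
        policy = guide_policy if t < h else learner_policy
        action = policy(obs)
        traj.append((obs, action))
        obs, done = env.step(action)
        if done:
            break
    return traj

random.seed(0)
env = ToyEnv()
traj = rollout_with_jump_start(env, guide_policy=lambda o: 1,
                               learner_policy=lambda o: 1,
                               horizon=10, h_max=3)
```

In practice the trainer would decrease `h_max` over training so the learner gradually takes responsibility for the full episode from the initial state.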
no code implementations • 4 Mar 2024 • Guojian Zhan, Ziang Zheng, Shengbo Eben Li
This paper for the first time introduces the concept of canonical data form for the purpose of achieving more effective design of datatic controllers.
no code implementations • 11 Oct 2023 • Zeyang Li, Chuxiong Hu, Yunan Wang, Guojian Zhan, Jie Li, Shengbo Eben Li
We also show that a modified version of regularized policy iteration, i.e., with finite-step policy evaluation, is equivalent to the inexact Newton method, in which the Newton iteration formula is solved with truncated iterations.
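For background, the classical (unregularized) version of this correspondence is that policy iteration is Newton's method applied to the Bellman residual, and truncating the policy-evaluation solve yields an inexact Newton step. A sketch of the standard argument: define

\[
F(V) \;=\; \max_{\pi}\bigl(r_\pi + \gamma P_\pi V\bigr) - V, \qquad F(V^\star) = 0 .
\]

With \(\pi_k\) greedy with respect to \(V_k\), the (generalized) Jacobian is \(J_F(V_k) = \gamma P_{\pi_k} - I\), since \(F\) is piecewise linear. The Newton update then reads

\[
V_{k+1} \;=\; V_k - (\gamma P_{\pi_k} - I)^{-1} F(V_k) \;=\; (I - \gamma P_{\pi_k})^{-1} r_{\pi_k},
\]

which is exactly the policy-evaluation solution for \(\pi_k\). Solving this linear system only approximately, via finitely many evaluation sweeps, is precisely an inexact Newton iteration; the paper's contribution is extending this picture to the regularized setting.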
1 code implementation • 29 Jan 2022 • YuHeng Lei, Yao Lyu, Guojian Zhan, Tao Zhang, Jiangtao Li, Jianyu Chen, Shengbo Eben Li, Sifa Zheng
We propose to use step-wise exploration in parameter space and theoretically derive the zeroth-order policy gradient.
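The parameter-space flavor of a zeroth-order policy gradient can be illustrated with the standard antithetic two-point estimator; this is the generic estimator, not the step-wise derivation in the paper:

```python
import numpy as np

def zeroth_order_gradient(objective, theta, sigma=0.1, n_samples=64, rng=None):
    """Two-point zeroth-order estimate of grad objective(theta): perturb the
    parameters with Gaussian noise and average antithetic finite differences.
    No backpropagation through the objective is required."""
    rng = rng or np.random.default_rng(0)
    grad = np.zeros_like(theta)
    for _ in range(n_samples):
        eps = rng.standard_normal(theta.shape)
        grad += (objective(theta + sigma * eps)
                 - objective(theta - sigma * eps)) / (2 * sigma) * eps
    return grad / n_samples

# Sanity check on a quadratic: the true gradient of -||theta||^2 is -2*theta.
theta = np.array([1.0, -2.0])
g = zeroth_order_gradient(lambda p: -np.sum(p**2), theta, n_samples=2000)
```

In policy search, `objective` would be an episodic return evaluated by rolling out the policy with perturbed parameters, which is what makes the estimator applicable when analytic gradients are unavailable.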