no code implementations • 16 May 2022 • Dongjie Yu, Haitong Ma, Shengbo Eben Li, Jianyu Chen
We characterize the feasible set by the established self-consistency condition; a safety value function can then be learned and used as a constraint in CRL.
no code implementations • 6 Apr 2022 • Wenhan Cao, Jingliang Duan, Shengbo Eben Li, Chen Chen, Chang Liu, Yu Wang
Both the primal and dual estimators are learned from data using supervised learning techniques, and the explicit sample size is provided, which enables us to guarantee the quality of each learned estimator in terms of feasibility and optimality.
no code implementations • 29 Jan 2022 • YuHeng Lei, Jianyu Chen, Shengbo Eben Li, Sifa Zheng
Zeroth-order optimization methods and policy gradient based first-order methods are two promising alternatives to solve reinforcement learning (RL) problems with complementary advantages.
no code implementations • 25 Nov 2021 • Haitong Ma, Changliu Liu, Shengbo Eben Li, Sifa Zheng, Wenchao Sun, Jianyu Chen
Existing methods mostly use the posterior penalty for dangerous actions, which means that the agent is not penalized until experiencing danger.
no code implementations • 15 Nov 2021 • Haitong Ma, Changliu Liu, Shengbo Eben Li, Sifa Zheng, Jianyu Chen
This paper proposes a novel approach that simultaneously synthesizes the energy-function-based safety certificate and learns the safe control policy with CRL.
no code implementations • 24 Oct 2021 • Yangang Ren, Jianhua Jiang, Dongjie Yu, Shengbo Eben Li, Jingliang Duan, Chen Chen, Keqiang Li
This paper develops the dynamic permutation state representation in the framework of integrated decision and control (IDC) to handle signalized intersections with mixed traffic flows.
no code implementations • 30 Aug 2021 • Jianhua Jiang, Yangang Ren, Yang Guan, Shengbo Eben Li, Yuming Yin, Xiaoping Jin
Autonomous driving at intersections is one of the most complicated and accident-prone traffic scenarios, especially with mixed traffic participants such as vehicles, bicycles and pedestrians.
no code implementations • 26 Aug 2021 • Baiyu Peng, Jingliang Duan, Jianyu Chen, Shengbo Eben Li, Genjin Xie, Congsheng Zhang, Yang Guan, Yao Mu, Enxin Sun
Based on this, the penalty method is formulated as a proportional controller, and the Lagrangian method is formulated as an integral controller.
2 code implementations • 18 Mar 2021 • Yang Guan, Yangang Ren, Qi Sun, Shengbo Eben Li, Haitong Ma, Jingliang Duan, Yifan Dai, Bo Cheng
In this paper, we present an interpretable and computationally efficient framework called integrated decision and control (IDC) for automated vehicles, which decomposes the driving task into static path planning and dynamic optimal tracking that are structured hierarchically.
no code implementations • 9 Mar 2021 • Kaiming Tang, Shengbo Eben Li, Yuming Yin, Yang Guan, Jingliang Duan, Wenhan Cao, Jie Li
The equivalence holds given certain conditions about initial state distributions and policy formats, in which the system state is the estimation error, control input is the filter gain, and control objective function is the accumulated estimation error.
no code implementations • 8 Mar 2021 • Yiting Kong, Yang Guan, Jingliang Duan, Shengbo Eben Li, Qi Sun, Bingbing Nie
In this paper, we propose an RL-based end-to-end decision-making method under a framework of offline training and online correction, called the Shielded Distributional Soft Actor-critic (SDSAC).
1 code implementation • 2 Mar 2021 • Haitong Ma, Jianyu Chen, Shengbo Eben Li, Ziyu Lin, Yang Guan, Yangang Ren, Sifa Zheng
Model information can be used to predict future trajectories, so it has huge potential to avoid dangerous regions when implementing reinforcement learning (RL) on real-world tasks, like autonomous driving.
no code implementations • 23 Feb 2021 • Zhengyu Liu, Jingliang Duan, Wenxuan Wang, Shengbo Eben Li, Yuming Yin, Ziyu Lin, Qi Sun, Bo Cheng
This paper proposes an off-line algorithm, called Recurrent Model Predictive Control (RMPC), to solve general nonlinear finite-horizon optimal control problems.
2 code implementations • 23 Feb 2021 • Yang Guan, Jingliang Duan, Shengbo Eben Li, Jie Li, Jianyu Chen, Bo Cheng
MPG contains two types of PG: 1) data-driven PG, which is obtained by directly calculating the derivative of the learned Q-value function with respect to actions, and 2) model-driven PG, which is calculated using BPTT based on the model-predictive return.
no code implementations • 20 Feb 2021 • Zhengyu Liu, Jingliang Duan, Wenxuan Wang, Shengbo Eben Li, Yuming Yin, Ziyu Lin, Bo Cheng
This paper proposes an offline control algorithm, called Recurrent Model Predictive Control (RMPC), to solve large-scale nonlinear finite-horizon optimal control problems.
no code implementations • 17 Feb 2021 • Baiyu Peng, Yao Mu, Jingliang Duan, Yang Guan, Shengbo Eben Li, Jianyu Chen
Taking a control perspective, we first interpret the penalty method and the Lagrangian method as proportional feedback and integral feedback control, respectively.
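As a rough illustration of the control reading in the excerpt above, the sketch below treats the constraint violation as the error signal and the constraint weight as the control output. It is not the authors' implementation: the names `penalty_multiplier`, `LagrangianMultiplier`, `kp`, and `ki` are hypothetical, and the clipping at zero is an assumption that keeps the weight non-negative.

```python
def penalty_multiplier(violation, kp):
    """Proportional feedback: the weight depends only on the current violation."""
    return max(0.0, kp * violation)


class LagrangianMultiplier:
    """Integral feedback: the weight accumulates past violations (dual ascent)."""

    def __init__(self, ki):
        self.ki = ki        # integral gain, i.e. the dual step size (assumed name)
        self.value = 0.0    # running multiplier, kept non-negative

    def update(self, violation):
        # Add the scaled violation to the running sum, then project onto [0, inf).
        self.value = max(0.0, self.value + self.ki * violation)
        return self.value


# Hypothetical sequence of constraint violations (positive means violated).
violations = [0.5, 0.5, -0.2, 0.0]

lam = LagrangianMultiplier(ki=0.1)
integral_trace = [lam.update(v) for v in violations]
proportional_trace = [penalty_multiplier(v, kp=2.0) for v in violations]
```

In this toy run the proportional weight drops to zero as soon as the violation disappears, while the integral weight retains a memory of earlier violations.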
no code implementations • 16 Feb 2021 • Yuhang Zhang, Yao Mu, Yujie Yang, Yang Guan, Shengbo Eben Li, Qi Sun, Jianyu Chen
Reinforcement learning has shown great potential in developing high-level autonomous driving.
no code implementations • 1 Jan 2021 • Yao Mu, Yuzheng Zhuang, Bin Wang, Wulong Liu, Shengbo Eben Li, Jianye Hao
The latent dynamics model summarizes an agent’s high dimensional experiences in a compact way.
no code implementations • 19 Dec 2020 • Baiyu Peng, Yao Mu, Yang Guan, Shengbo Eben Li, Yuming Yin, Jianyu Chen
Safety is essential for reinforcement learning (RL) applied in real-world situations.
no code implementations • 14 Jul 2020 • Jie Li, Shengbo Eben Li, Yang Guan, Jingliang Duan, Wenyu Li, Yuming Yin
The simulation results show that the TPI algorithm can converge to the optimal solution for the linear plant, and has high resistance to disturbances for the nonlinear plant.
no code implementations • 3 Mar 2020 • Lu Wen, Jingliang Duan, Shengbo Eben Li, Shaobing Xu, Huei Peng
The simulations of two scenarios for autonomous vehicles confirm we can ensure safety while achieving fast learning.
no code implementations • 28 Feb 2020 • Yao Mu, Shengbo Eben Li, Chang Liu, Qi Sun, Bingbing Nie, Bo Cheng, Baiyu Peng
This paper presents a mixed reinforcement learning (mixed RL) algorithm that simultaneously uses dual representations of environmental dynamics to search for the optimal policy, with the purpose of improving both learning accuracy and training speed.
no code implementations • 13 Feb 2020 • Yangang Ren, Jingliang Duan, Shengbo Eben Li, Yang Guan, Qi Sun
In this paper, we introduce the minimax formulation and distributional framework to improve the generalization ability of RL algorithms and develop the Minimax Distributional Soft Actor-Critic (Minimax DSAC) algorithm.
2 code implementations • 23 Jan 2020 • Jianyu Chen, Shengbo Eben Li, Masayoshi Tomizuka
A sequential latent environment model is introduced and learned jointly with the reinforcement learning process.
2 code implementations • 9 Jan 2020 • Jingliang Duan, Yang Guan, Shengbo Eben Li, Yangang Ren, Bo Cheng
In reinforcement learning (RL), function approximation errors are known to easily lead to Q-value overestimation, thus greatly reducing policy performance.
no code implementations • 23 Dec 2019 • Yang Guan, Shengbo Eben Li, Jingliang Duan, Jie Li, Yangang Ren, Qi Sun, Bo Cheng
Reinforcement learning (RL) algorithms have been successfully applied to a range of challenging sequential decision making and control tasks.
no code implementations • 26 Nov 2019 • Jingliang Duan, Zhengyu Liu, Shengbo Eben Li, Qi Sun, Zhenzhong Jia, Bo Cheng
CADP linearizes the constrained optimization problem locally into a quadratically constrained linear programming problem, and then obtains the optimal update of the policy network by solving its dual problem.
no code implementations • 11 Sep 2019 • Jingliang Duan, Shengbo Eben Li, Zhengyu Liu, Monimoy Bujarbaruah, Bo Cheng
This paper proposes the Deep Generalized Policy Iteration (DGPI) algorithm to find the infinite horizon optimal control policy for general nonlinear continuous-time systems with known dynamics.
no code implementations • 6 Jun 2019 • Long Xin, Pin Wang, Ching-Yao Chan, Jianyu Chen, Shengbo Eben Li, Bo Cheng
As autonomous vehicles (AVs) need to interact with other road users, it is important to comprehensively understand the dynamic traffic environment, especially the possible future trajectories of surrounding vehicles.
no code implementations • 3 Mar 2017 • Yang Zheng, Shengbo Eben Li, Keqiang Li, Francesco Borrelli
This paper presents a distributed model predictive control (DMPC) algorithm for heterogeneous vehicle platoons with unidirectional topologies and a priori unknown desired set point.