1 code implementation • 26 Sep 2023 • Chenyang Miao, Yunduan Cui, Huiyun Li, Xinyu Wu
It alleviates the inconsistency among multiple agents' policy updates by introducing relative entropy regularization into the Centralized Training with Decentralized Execution (CTDE) framework with an Actor-Critic (AC) structure.
1 code implementation • 20 Sep 2023 • Wenjun Huang, Yunduan Cui, Huiyun Li, Xinyu Wu
Its loss function is designed to correct the fitting error of neural networks for more accurate prediction of probabilistic models.
no code implementations • 16 Oct 2020 • Cheng-Yu Kuo, Andreas Schaarschmidt, Yunduan Cui, Tamim Asfour, Takamitsu Matsubara
In typical model-based reinforcement learning (MBRL), we cannot expect the data-driven model to generate accurate and reliable policies for the intended robotic tasks during the learning process, because samples are scarce.
Model-based Reinforcement Learning, reinforcement-learning