Search Results for author: Haoyong Yu

Found 2 papers, 1 paper with code

Off-policy Maximum Entropy Reinforcement Learning: Soft Actor-Critic with Advantage Weighted Mixture Policy (SAC-AWMP)

no code implementations • 7 Feb 2020 • Zhimin Hou, Kuangen Zhang, Yi Wan, Dongyu Li, Chenglong Fu, Haoyong Yu

A common way to solve this problem, known as Mixture-of-Experts, is to represent the policy as the weighted sum of multiple components, where different components perform well on different parts of the state space.

Continuous Control

Teach Biped Robots to Walk via Gait Principles and Reinforcement Learning with Adversarial Critics

1 code implementation • 22 Oct 2019 • Kuangen Zhang, Zhimin Hou, Clarence W. de Silva, Haoyong Yu, Chenglong Fu

However, the local minima caused by unsuitable rewards and the overestimation of the cumulative reward impede the maximization of the cumulative reward.

Reinforcement Learning (RL)
