no code implementations • 10 Jun 2019 • Chen Qi, Yuxiu Hua, Rongpeng Li, Zhifeng Zhao, Honggang Zhang
Furthermore, as DPGD only works in a continuous action space, we embed a k-nearest neighbor algorithm into DQL to quickly find the valid action in the discrete space that is nearest to the DPGD output.
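The nearest-neighbor step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `nearest_valid_actions`, the Euclidean distance metric, and the example action grid are all assumptions made for the sketch.

```python
import numpy as np

def nearest_valid_actions(proto_action, valid_actions, k=1):
    """Return the k valid discrete actions closest to a continuous proto-action.

    Hypothetical helper: Euclidean distance is an assumption; the abstract
    only says the action "nearest" to the continuous output is chosen.
    """
    proto_action = np.asarray(proto_action, dtype=float)
    valid_actions = np.asarray(valid_actions, dtype=float)
    # Distance from the continuous output to every valid discrete action.
    dists = np.linalg.norm(valid_actions - proto_action, axis=1)
    # Indices of the k smallest distances, closest first.
    order = np.argsort(dists, kind="stable")[:k]
    return valid_actions[order], order

# Hypothetical example: resource shares for three slices on a coarse grid.
valid = np.array([
    [0.50, 0.25, 0.25],
    [0.25, 0.50, 0.25],
    [0.25, 0.25, 0.50],
])
proto = np.array([0.30, 0.45, 0.25])  # continuous output, not on the grid
actions, idx = nearest_valid_actions(proto, valid, k=1)
```

With k greater than 1, the candidates could then be ranked by a learned action-value estimate before one is executed, although whether the paper does this is not stated in the excerpt.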
no code implementations • 10 May 2019 • Yuxiu Hua, Rongpeng Li, Zhifeng Zhao, Xianfu Chen, Honggang Zhang
We further develop Dueling GAN-DDQN, which uses a specially designed dueling generator to learn the action-value distribution by estimating the state-value distribution and the action advantage function.
Distributional Reinforcement Learning • Generative Adversarial Network
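One way to picture the dueling decomposition mentioned above is to combine samples of the state-value distribution with per-action advantages. The sketch below assumes the mean-centered aggregation used in standard dueling DQN; the excerpt does not confirm that Dueling GAN-DDQN combines the two parts in exactly this way, and the name `dueling_action_value_samples` and the sample values are hypothetical.

```python
import numpy as np

def dueling_action_value_samples(value_samples, advantages):
    """Combine state-value samples with action advantages.

    value_samples: shape (n_samples,), samples of the state-value distribution.
    advantages:    shape (n_actions,), one advantage per action.
    Returns an array of shape (n_actions, n_samples).

    Assumption: advantages are mean-centered before being added, as in
    standard dueling DQN.
    """
    value_samples = np.asarray(value_samples, dtype=float)
    advantages = np.asarray(advantages, dtype=float)
    centered = advantages - advantages.mean()
    # Broadcast: each action shifts the whole value distribution by its advantage.
    return value_samples[None, :] + centered[:, None]

# Hypothetical numbers for illustration only.
v = np.array([1.0, 1.5, 0.5, 2.0])   # four samples of the state value
a = np.array([0.2, -0.1, 0.5])       # advantages for three actions
q = dueling_action_value_samples(v, a)
greedy_action = int(np.argmax(q.mean(axis=1)))  # act on the mean of each distribution
```

Because the centered advantages sum to zero, averaging the action-value samples over actions recovers the state-value samples, which keeps the two parts of the decomposition identifiable.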
no code implementations • 24 Oct 2018 • Yuxiu Hua, Zhifeng Zhao, Rongpeng Li, Xianfu Chen, Zhiming Liu, Honggang Zhang
Time series prediction can be generalized as a process that extracts useful information from historical records and then determines future values.
no code implementations • 8 Nov 2017 • Yuxiu Hua, Zhifeng Zhao, Rongpeng Li, Xianfu Chen, Zhiming Liu, Honggang Zhang
Thus the RCLSTM, with its intrinsic sparsity, lacks many of the neural connections present in a fully connected LSTM, which reduces both the number of parameters to be trained and the computational cost.
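The parameter reduction can be illustrated with a random binary mask over an LSTM weight matrix. This is a rough sketch under stated assumptions: each connection is kept independently with a fixed probability, and the helper `random_connectivity_mask`, the layer sizes, and the connectivity value are hypothetical rather than taken from the paper.

```python
import numpy as np

def random_connectivity_mask(shape, connectivity, seed=0):
    """Boolean mask that keeps each connection with probability `connectivity`.

    Assumption: connections are kept independently at random; the excerpt
    does not say how the paper generates the random connectivity.
    """
    rng = np.random.default_rng(seed)
    return rng.random(shape) < connectivity

hidden, inputs = 8, 4
# A standard LSTM layer stacks four gate matrices, each acting on the
# concatenation of the input and the previous hidden state.
dense = np.random.default_rng(1).standard_normal((4 * hidden, inputs + hidden))

mask = random_connectivity_mask(dense.shape, connectivity=0.3)
sparse = dense * mask          # absent connections contribute nothing
trainable = int(mask.sum())    # only the kept connections need training
print(f"{trainable} of {dense.size} weights kept")
```

In training, the same mask would also need to be applied to the gradients (or reapplied after each update) so that the absent connections stay absent.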