Reinforcement Learning with Dynamic Boltzmann Softmax Updates

14 Mar 2019 · Ling Pan, Qingpeng Cai, Qi Meng, Wei Chen, Longbo Huang, Tie-Yan Liu

Value function estimation, i.e., prediction, is an important task in reinforcement learning. The Boltzmann softmax operator is a natural value estimator and can provide several benefits...
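As an illustration of the operator the abstract refers to, here is a minimal sketch of the standard Boltzmann softmax operator: an average of the action values weighted by exp(beta * value). The function name `boltzmann_softmax` and the parameter name `beta` are assumptions made for this example and are not taken from the paper. The sketch does not reproduce the paper's dynamic schedule for the temperature.

```python
import math


def boltzmann_softmax(values, beta):
    """Boltzmann-weighted average of `values` with inverse temperature `beta`.

    Hypothetical helper for illustration only; not the paper's code.
    """
    # Subtract the maximum before exponentiating for numerical stability.
    # The shift cancels in the ratio, so the result is unchanged.
    m = max(values)
    weights = [math.exp(beta * (v - m)) for v in values]
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, values)) / total


if __name__ == "__main__":
    q_values = [1.0, 2.0, 3.0]
    for beta in (0.0, 1.0, 50.0):
        print(beta, boltzmann_softmax(q_values, beta))
```

With `beta = 0` the operator reduces to the plain mean of the values, and as `beta` grows it approaches the maximum, so it interpolates between averaging and the greedy max operator.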

Methods used in the Paper