An Information-Theoretic Optimality Principle for Deep Reinforcement Learning

We methodologically address the problem of Q-value overestimation in deep reinforcement learning to handle high-dimensional state spaces efficiently. By adapting concepts from information theory, we introduce an intrinsic penalty signal encouraging reduced Q-value estimates...
