Distributional Reinforcement Learning
17 papers with code • 0 benchmarks • 0 datasets
Value distribution is the distribution of the random return received by a reinforcement learning agent. it been used for a specific purpose such as implementing risk-aware behaviour.
We have random return Z whose expectation is the value Q. This random return is also described by a recursive equation, but one of a distributional nature
These leaderboards are used to track progress in Distributional Reinforcement Learning
In this work, we build on recent advances in distributional reinforcement learning to give a generally applicable, flexible, and state-of-the-art distributional variant of DQN.
In this paper, we build on recent work advocating a distributional approach to reinforcement learning in which the distribution over returns is modeled explicitly instead of only estimating the mean.
The key challenge in practical distributional RL algorithms lies in how to parameterize estimated distributions so as to better approximate the true continuous distribution.
In this paper, we propose the Quantile Option Architecture (QUOTA) for exploration based on recent advances in distributional reinforcement learning (RL).
To improve the sample efficiency of policy-gradient based reinforcement learning algorithms, we propose implicit distributional actor-critic (IDAC) that consists of a distributional critic, built on two deep generator networks (DGNs), and a semi-implicit actor (SIA), powered by a flexible policy distribution.
We consider the problem of learning a set of probability distributions from the empirical Bellman dynamics in distributional reinforcement learning (RL), a class of state-of-the-art methods that estimate the distribution, as opposed to only the expectation, of the total return.
Unifying Cardiovascular Modelling with Deep Reinforcement Learning for Uncertainty Aware Control of Sepsis Treatment
Sepsis is a potentially life threatening inflammatory response to infection or severe tissue damage.