Distributional Reinforcement Learning
26 papers with code • 0 benchmarks • 0 datasets
The value distribution is the distribution of the random return received by a reinforcement learning agent. It has been used for specific purposes such as implementing risk-aware behaviour.
We have a random return Z whose expectation is the value Q. This random return is also described by a recursive equation, but one of a distributional nature.
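The recursive equation is commonly written as Z(x, a) =_D R(x, a) + γ Z(X', A'), where =_D denotes equality in distribution. The following is a minimal sketch of that idea using samples rather than a learned model. The discount factor, the toy reward and next-return distributions, the independent pairing of samples, and the helper names `distributional_backup` and `cvar` are all assumptions made for illustration and do not come from any of the papers listed here.

```python
import random
import statistics

GAMMA = 0.9  # discount factor (assumed value for illustration)


def distributional_backup(reward_samples, next_return_samples, gamma=GAMMA):
    """Sample-based target: pair each reward r with a next return z'
    and form r + gamma * z'. Assumes the two sample lists are aligned."""
    return [r + gamma * z for r, z in zip(reward_samples, next_return_samples)]


def cvar(samples, alpha=0.1):
    """Mean of the worst alpha-fraction of returns (lower tail),
    a common risk-aware summary of a return distribution."""
    ordered = sorted(samples)
    k = max(1, int(alpha * len(ordered)))
    return statistics.fmean(ordered[:k])


rng = random.Random(0)
n = 10_000
# Toy distributions (hypothetical): Bernoulli rewards, Gaussian next returns.
rewards = [rng.choice([0.0, 1.0]) for _ in range(n)]
next_returns = [rng.gauss(5.0, 2.0) for _ in range(n)]

# Samples of Z(x, a) obtained from the distributional recursion.
targets = distributional_backup(rewards, next_returns)

# Taking the expectation of Z recovers the usual scalar value Q.
q_from_distribution = statistics.fmean(targets)
q_from_expectations = (
    statistics.fmean(rewards) + GAMMA * statistics.fmean(next_returns)
)

# A risk-aware agent could rank actions by the lower tail instead of the mean.
risk_aware_value = cvar(targets, alpha=0.1)
```

Here the mean of the sampled targets agrees with the classical expected backup, while the lower-tail average gives a more pessimistic score that a risk-averse agent might use in place of Q.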
Most implemented papers
Implicit Quantile Networks for Distributional Reinforcement Learning
In this work, we build on recent advances in distributional reinforcement learning to give a generally applicable, flexible, and state-of-the-art distributional variant of DQN.
Distributional Reinforcement Learning with Quantile Regression
In this paper, we build on recent work advocating a distributional approach to reinforcement learning in which the distribution over returns is modeled explicitly instead of only estimating the mean.
Fully Parameterized Quantile Function for Distributional Reinforcement Learning
The key challenge in practical distributional RL algorithms lies in how to parameterize estimated distributions so as to better approximate the true continuous distribution.
QUOTA: The Quantile Option Architecture for Reinforcement Learning
In this paper, we propose the Quantile Option Architecture (QUOTA) for exploration based on recent advances in distributional reinforcement learning (RL).
Implicit Distributional Reinforcement Learning
To improve the sample efficiency of policy-gradient based reinforcement learning algorithms, we propose implicit distributional actor-critic (IDAC) that consists of a distributional critic, built on two deep generator networks (DGNs), and a semi-implicit actor (SIA), powered by a flexible policy distribution.
Estimating Risk and Uncertainty in Deep Reinforcement Learning
Reinforcement learning agents are faced with two types of uncertainty.
GAN Q-learning
Distributional reinforcement learning (distributional RL) has seen empirical success in complex Markov Decision Processes (MDPs) in the setting of nonlinear function approximation.
Information-Directed Exploration for Deep Reinforcement Learning
Efficient exploration remains a major challenge for reinforcement learning.
Distributional Reinforcement Learning for Energy-Based Sequential Models
Global Autoregressive Models (GAMs) are a recent proposal [Parshakova et al., CoNLL 2019] for exploiting global properties of sequences for data-efficient learning of seq2seq models.
Distributional Reinforcement Learning via Moment Matching
We consider the problem of learning a set of probability distributions from the empirical Bellman dynamics in distributional reinforcement learning (RL), a class of state-of-the-art methods that estimate the distribution, as opposed to only the expectation, of the total return.