Distributional Reinforcement Learning

31 papers with code • 0 benchmarks • 0 datasets

The value distribution is the distribution of the random return received by a reinforcement learning agent. It has been used for specific purposes such as implementing risk-aware behaviour.

We consider a random return Z whose expectation is the value Q. This random return is also described by a recursive equation, but one of a distributional nature.
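In the notation of Bellemare et al. (2017), the recursion reads:

```latex
% Distributional Bellman equation: equality holds in distribution, with
% X' ~ P(.|x, a) the next state and A' ~ pi(.|X') the next action.
Z(x, a) \overset{D}{=} R(x, a) + \gamma\, Z(X', A'),
\qquad Q(x, a) = \mathbb{E}\big[Z(x, a)\big]
```

Taking expectations on both sides recovers the usual Bellman equation for Q.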

Latest papers with no code

Beyond Average Return in Markov Decision Processes

no code yet • NeurIPS 2023

What are the functionals of the reward that can be computed and optimized exactly in Markov Decision Processes? In the finite-horizon, undiscounted setting, Dynamic Programming (DP) can only handle these operations efficiently for certain classes of statistics.

Pitfall of Optimism: Distributional Reinforcement Learning by Randomizing Risk Criterion

no code yet • NeurIPS 2023

Distributional reinforcement learning algorithms have attempted to utilize estimated uncertainty for exploration, such as optimism in the face of uncertainty.

Distributional Reinforcement Learning with Online Risk-awareness Adaption

no code yet • 8 Oct 2023

The use of reinforcement learning (RL) in practical applications requires considering sub-optimal outcomes, which depend on the agent's familiarity with the uncertain environment.

Learning Risk-Aware Quadrupedal Locomotion using Distributional Reinforcement Learning

no code yet • 25 Sep 2023

Instead of relying on a value expectation, we estimate the complete value distribution to account for uncertainty in the robot's interaction with the environment.
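To illustrate what estimating the complete value distribution can look like in practice, here is a minimal PyTorch sketch of a quantile-based critic (QR-DQN style) with a CVaR readout for risk-aware control. The names and architecture below are hypothetical, not the paper's actual design.

```python
import torch
import torch.nn as nn

class QuantileCritic(nn.Module):
    """Predicts n_quantiles quantiles of the return distribution Z(s, a).

    Illustrative sketch of a quantile-based distributional critic;
    the paper's actual parameterization may differ.
    """
    def __init__(self, obs_dim: int, act_dim: int, n_quantiles: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, 256), nn.ReLU(),
            nn.Linear(256, n_quantiles),  # one output per quantile level
        )

    def forward(self, obs: torch.Tensor, act: torch.Tensor) -> torch.Tensor:
        # Returns a (batch, n_quantiles) tensor of return quantiles.
        return self.net(torch.cat([obs, act], dim=-1))

def cvar_from_quantiles(quantiles: torch.Tensor, alpha: float = 0.1) -> torch.Tensor:
    """Risk-aware value: mean of the worst alpha-fraction of quantiles (CVaR)."""
    sorted_q, _ = torch.sort(quantiles, dim=-1)
    k = max(1, int(alpha * quantiles.shape[-1]))
    return sorted_q[..., :k].mean(dim=-1)
```

A risk-averse policy then maximizes the CVaR readout instead of the mean of the quantiles, which is the general mechanism behind risk-aware behaviour in distributional agents.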

Deep Reinforcement Learning for Artificial Upwelling Energy Management

no code yet • 20 Aug 2023

The potential of artificial upwelling (AU) as a means of lifting nutrient-rich bottom water to the surface, stimulating seaweed growth, and consequently enhancing ocean carbon sequestration, has been gaining increasing attention in recent years.

Value-Distributional Model-Based Reinforcement Learning

no code yet • 12 Aug 2023

We study the problem from a model-based Bayesian reinforcement learning perspective, where the goal is to learn the posterior distribution over value functions induced by parameter (epistemic) uncertainty of the Markov decision process.
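A minimal tabular illustration of this idea, not the paper's algorithm: draw MDPs from a Dirichlet posterior over transition probabilities, evaluate a fixed policy on each draw, and read the epistemic value distribution off the spread of the resulting value functions. All names and the setup below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tabular setup: counts of observed transitions (s, a) -> s'.
n_states, n_actions, gamma = 5, 2, 0.9
counts = rng.integers(1, 10, size=(n_states, n_actions, n_states))
rewards = rng.normal(size=(n_states, n_actions))
policy = np.full((n_states, n_actions), 1.0 / n_actions)  # uniform policy

def sample_value_function():
    """Draw one MDP from a Dirichlet posterior over transitions and
    evaluate the policy on it, yielding one sample of V."""
    P = np.stack([[rng.dirichlet(counts[s, a]) for a in range(n_actions)]
                  for s in range(n_states)])              # (S, A, S)
    P_pi = np.einsum('sa,sat->st', policy, P)             # policy-averaged transitions
    r_pi = (policy * rewards).sum(axis=1)                 # policy-averaged rewards
    # Exact policy evaluation: V = (I - gamma * P_pi)^{-1} r_pi.
    return np.linalg.solve(np.eye(n_states) - gamma * P_pi, r_pi)

# Epistemic value distribution: spread of V across sampled models.
values = np.stack([sample_value_function() for _ in range(200)])
print(values.mean(axis=0), values.std(axis=0))
```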

Cramer Type Distances for Learning Gaussian Mixture Models by Gradient Descent

no code yet • 13 Jul 2023

Even fewer algorithms are compatible with gradient descent, the common learning process for neural networks.
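For context, the squared Cramér distance between two one-dimensional distributions with CDFs F and G is the integral of (F(t) - G(t))^2 over t. The sketch below computes it for empirical samples in NumPy; the paper itself concerns Gaussian mixtures, so this only fixes the definition, with hypothetical names.

```python
import numpy as np

def cramer_distance(x: np.ndarray, y: np.ndarray) -> float:
    """Squared Cramér distance between two empirical 1-D distributions,
    integrating (F(t) - G(t))^2 over a merged grid of sample points."""
    grid = np.sort(np.concatenate([x, y]))
    F = np.searchsorted(np.sort(x), grid, side='right') / len(x)
    G = np.searchsorted(np.sort(y), grid, side='right') / len(y)
    # Empirical CDFs are piecewise constant between grid points, so the
    # integral is a sum of (CDF gap)^2 times segment width.
    widths = np.diff(grid)
    return float(np.sum((F[:-1] - G[:-1]) ** 2 * widths))

samples_p = np.random.default_rng(1).normal(0.0, 1.0, 1000)
samples_q = np.random.default_rng(2).normal(0.5, 1.0, 1000)
print(cramer_distance(samples_p, samples_q))
```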

Is Risk-Sensitive Reinforcement Learning Properly Resolved?

no code yet • 2 Jul 2023

Due to the role of risk management in learning deployable policies, risk-sensitive reinforcement learning (RSRL) has been recognized as an important direction.

Diverse Projection Ensembles for Distributional Reinforcement Learning

no code yet • 12 Jun 2023

In contrast to classical reinforcement learning, distributional reinforcement learning algorithms aim to learn the distribution of returns rather than their expected value.
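One concrete way this is done is the categorical (C51) projection: the return distribution is represented by probabilities over a fixed grid of atoms, and the Bellman-shifted target is projected back onto that grid. A minimal NumPy sketch, with hypothetical names and shapes:

```python
import numpy as np

def categorical_projection(next_probs, rewards, dones, gamma, z):
    """Project the Bellman-updated atoms r + gamma * z, clipped to
    [z[0], z[-1]], back onto the fixed support z (C51-style)."""
    n_atoms = z.shape[0]
    dz = z[1] - z[0]
    batch = next_probs.shape[0]
    tz = np.clip(rewards[:, None] + gamma * (1.0 - dones[:, None]) * z[None, :],
                 z[0], z[-1])                    # shifted, clipped atoms
    b = (tz - z[0]) / dz                         # fractional index into z
    lo = np.floor(b).astype(int)
    hi = np.minimum(lo + 1, n_atoms - 1)
    w_hi = b - lo                                # linear interpolation weights;
    w_lo = 1.0 - w_hi                            # exact hits put all mass on lo
    out = np.zeros_like(next_probs)
    rows = np.arange(batch)[:, None].repeat(n_atoms, axis=1)
    np.add.at(out, (rows, lo), next_probs * w_lo)
    np.add.at(out, (rows, hi), next_probs * w_hi)
    return out

# Example: 51 atoms on [-10, 10], a batch of two transitions.
z = np.linspace(-10.0, 10.0, 51)
probs = np.full((2, 51), 1.0 / 51)
target = categorical_projection(probs, rewards=np.array([1.0, -1.0]),
                                dones=np.array([0.0, 1.0]), gamma=0.99, z=z)
```

The projected target is then used in a cross-entropy loss against the predicted atom probabilities, in place of the scalar TD target of classical Q-learning.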

PACER: A Fully Push-forward-based Distributional Reinforcement Learning Algorithm

no code yet • 11 Jun 2023

In this paper, we propose the first fully push-forward-based Distributional Reinforcement Learning algorithm, called Push-forward-based Actor-Critic EncourageR (PACER).
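PACER's exact architecture is not reproduced here, but the underlying push-forward idea is to represent the return distribution implicitly: base noise samples are pushed through a network that outputs return samples, so no density is ever parameterized. A hedged PyTorch sketch with hypothetical names:

```python
import torch
import torch.nn as nn

class PushForwardCritic(nn.Module):
    """Implicit return distribution: pushes uniform noise through a network
    to generate return samples. A sketch of the general push-forward idea,
    not PACER itself."""
    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs, act, n_samples: int = 32):
        batch = obs.shape[0]
        noise = torch.rand(batch, n_samples, 1, device=obs.device)  # base samples
        sa = torch.cat([obs, act], dim=-1).unsqueeze(1).expand(-1, n_samples, -1)
        # (batch, n_samples) return samples from the implicit distribution.
        return self.net(torch.cat([sa, noise], dim=-1)).squeeze(-1)
```

Because only samples of Z are ever materialized, such a critic pairs naturally with sample-based distributional losses, which is what a fully push-forward pipeline relies on.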