Search Results

Distributed Distributional Deterministic Policy Gradients

2 code implementations ICLR 2018

This work adopts the very successful distributional perspective on reinforcement learning and adapts it to the continuous control setting.

Continuous Control

Continuous control with deep reinforcement learning

132 code implementations 9 Sep 2015

We adapt the ideas underlying the success of Deep Q-Learning to the continuous action domain.

Continuous Control Q-Learning

Memory Augmented Policy Optimization for Program Synthesis and Semantic Parsing

4 code implementations NeurIPS 2018

We present Memory Augmented Policy Optimization (MAPO), a simple and novel way to leverage a memory buffer of promising trajectories to reduce the variance of policy gradient estimate.

Combinatorial Optimization Program Synthesis +2

Action Branching Architectures for Deep Reinforcement Learning

5 code implementations 24 Nov 2017

This approach achieves a linear increase of the number of network outputs with the number of degrees of freedom by allowing a level of independence for each individual action dimension.

Continuous Control General Reinforcement Learning

Shapley Q-value: A Local Reward Approach to Solve Global Reward Games

1 code implementation 11 Jul 2019

To deal with this problem, we i) introduce a cooperative-game theoretical framework called extended convex game (ECG) that is a superset of global reward game, and ii) propose a local reward approach called Shapley Q-value.

Multi-agent Reinforcement Learning Policy Gradient Methods

Randomized Value Functions via Multiplicative Normalizing Flows

1 code implementation 6 Jun 2018

In particular, we augment DQN and DDPG with multiplicative normalizing flows in order to track a rich approximate posterior distribution over the parameters of the value function.

Efficient Exploration

Maximum Entropy-Regularized Multi-Goal Reinforcement Learning

2 code implementations 21 May 2019

This objective encourages the agent to maximize the expected return, as well as to achieve more diverse goals.

Multi-Goal Reinforcement Learning OpenAI Gym

Power Allocation in Multi-User Cellular Networks: Deep Reinforcement Learning Approaches

1 code implementation 22 Jan 2019

The model-based power allocation algorithm has been investigated for decades, but it requires the mathematical models to be analytically tractable and it usually has high computational complexity.

Information Theory

Bayesian Policy Gradients via Alpha Divergence Dropout Inference

1 code implementation 6 Dec 2017

Policy gradient methods have had great success in solving continuous control tasks, yet the stochastic nature of such problems makes deterministic value estimation difficult.

Continuous Control Policy Gradient Methods

Deep Actor-Critic Learning for Distributed Power Control in Wireless Mobile Networks

1 code implementation 14 Sep 2020

Deep reinforcement learning offers a model-free alternative to supervised deep learning and classical optimization for solving the transmit power control problem in wireless networks.