Continuous Control

44 papers with code · Playing Games

State-of-the-art leaderboards

No evaluation results yet. Help compare methods by submitting evaluation metrics.

Greatest papers with code

Near-Optimal Representation Learning for Hierarchical Reinforcement Learning

ICLR 2019 tensorflow/models

We study the problem of representation learning in goal-conditioned hierarchical reinforcement learning. In such hierarchical structures, a higher-level controller solves tasks by iteratively communicating goals which a lower-level policy is trained to reach.

CONTINUOUS CONTROL HIERARCHICAL REINFORCEMENT LEARNING REPRESENTATION LEARNING
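
The two-level loop described above can be sketched in a few lines of Python (a hedged sketch; `high_level`, `low_level`, `phi`, and `goal_horizon` are illustrative interfaces, not the paper's code):

```python
import numpy as np

def hrl_episode(env, high_level, low_level, goal_horizon=10, max_steps=500):
    # Every `goal_horizon` steps the high-level controller emits a goal
    # in representation space; in between, the low-level policy is
    # rewarded for moving the learned representation toward that goal.
    state = env.reset()
    episode_return = 0.0
    for t in range(max_steps):
        if t % goal_horizon == 0:
            goal = high_level.propose_goal(state)   # slow timescale
        action = low_level.act(state, goal)         # goal-conditioned
        next_state, reward, done, _ = env.step(action)
        # Intrinsic low-level reward: negative distance to the goal
        # under the learned representation phi (hypothetical interface).
        low_level.observe(
            state, goal, action,
            -np.linalg.norm(low_level.phi(next_state) - goal))
        state = next_state
        episode_return += reward
        if done:
            break
    return episode_return
```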

Parameter Space Noise for Exploration

ICLR 2018 tensorflow/models

Deep reinforcement learning (RL) methods generally engage in exploratory behavior through noise injection in the action space. Combining parameter space noise with traditional RL methods offers the best of both worlds.

CONTINUOUS CONTROL
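
A minimal NumPy sketch of the contrast between action space noise and parameter space noise (the linear policy and `sigma` are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def act_with_action_noise(weights, state, sigma=0.1):
    # Conventional exploration: perturb the action the policy outputs.
    action = weights @ state              # linear policy, for illustration
    return action + sigma * rng.standard_normal(action.shape)

def act_with_parameter_noise(weights, state, sigma=0.1):
    # Parameter space noise: perturb the weights and act deterministically
    # with the perturbed policy. Held fixed across an episode, the same
    # perturbation yields temporally consistent exploration.
    noisy_weights = weights + sigma * rng.standard_normal(weights.shape)
    return noisy_weights @ state
```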

Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research

26 Feb 2018 openai/gym

The purpose of this technical report is two-fold. First of all, it introduces a suite of challenging continuous control tasks (integrated with OpenAI Gym) based on currently existing robotics hardware.

CONTINUOUS CONTROL MULTI-GOAL REINFORCEMENT LEARNING
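
These tasks follow Gym's goal-based interface, which can be exercised in a few lines (a sketch assuming `gym` with the robotics environments and MuJoCo installed; `FetchReach-v1` is one of the suite's task IDs):

```python
import gym

# Goal-based robotics task: observations are dicts holding the raw
# observation, the achieved goal, and the desired goal.
env = gym.make("FetchReach-v1")
obs = env.reset()
print(obs["observation"].shape, obs["desired_goal"])

for _ in range(50):
    obs, reward, done, info = env.step(env.action_space.sample())
    # The reward can be recomputed for any substituted goal, which is
    # what makes hindsight-style goal relabelling possible.
    reward = env.compute_reward(obs["achieved_goal"], obs["desired_goal"], info)
```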

Benchmarking Deep Reinforcement Learning for Continuous Control

22 Apr 2016 openai/rllab

Recently, researchers have made significant progress combining the advances in deep learning for learning feature representations with reinforcement learning. Some notable examples include training agents to play Atari games based on raw pixel data and to acquire advanced manipulation skills using raw sensory inputs.

ATARI GAMES CONTINUOUS CONTROL

Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

ICML 2018 facebookresearch/Horizon

Model-free deep reinforcement learning (RL) algorithms have been demonstrated on a range of challenging decision making and control tasks. In this paper, we propose soft actor-critic, an off-policy actor-critic deep RL algorithm based on the maximum entropy reinforcement learning framework.

CONTINUOUS CONTROL DECISION MAKING Q-LEARNING
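
The maximum entropy objective translates into a short actor loss; a hedged PyTorch sketch (not the authors' implementation; `policy` is assumed to return a reparameterized action sample and its log-probability, and `alpha` is the entropy temperature):

```python
import torch

def sac_actor_loss(policy, q1, q2, states, alpha=0.2):
    # `policy(states)` is assumed to return a reparameterized action
    # sample and its log-probability; q1/q2 are twin critics.
    actions, log_pi = policy(states)
    q = torch.min(q1(states, actions), q2(states, actions))
    # Minimizing E[alpha * log pi(a|s) - Q(s, a)] maximizes expected
    # return plus alpha times the policy's entropy.
    return (alpha * log_pi - q).mean()
```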

Continuous control with deep reinforcement learning

9 Sep 2015 facebookresearch/Horizon

We adapt the ideas underlying the success of Deep Q-Learning to the continuous action domain. We present an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces.

CONTINUOUS CONTROL Q-LEARNING
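
A hedged PyTorch sketch of the deterministic actor-critic update the abstract describes (DDPG-style; the batch layout and hyperparameters are illustrative):

```python
import torch
import torch.nn.functional as F

def ddpg_update(actor, critic, target_actor, target_critic,
                actor_opt, critic_opt, batch, gamma=0.99):
    # `batch` is assumed to hold float tensors (s, a, r, s2, done).
    s, a, r, s2, done = batch

    # Critic: regress Q(s, a) toward a one-step bootstrapped target
    # computed with the slowly-updated target networks.
    with torch.no_grad():
        y = r + gamma * (1.0 - done) * target_critic(s2, target_actor(s2))
    critic_loss = F.mse_loss(critic(s, a), y)
    critic_opt.zero_grad()
    critic_loss.backward()
    critic_opt.step()

    # Actor: deterministic policy gradient -- ascend the critic's value
    # of the action the actor itself proposes.
    actor_loss = -critic(s, actor(s)).mean()
    actor_opt.zero_grad()
    actor_loss.backward()
    actor_opt.step()
```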

DeepMind Control Suite

2 Jan 2018 deepmind/dm_control

The DeepMind Control Suite is a set of continuous control tasks with a standardised structure and interpretable rewards, intended to serve as performance benchmarks for reinforcement learning agents. The tasks are written in Python and powered by the MuJoCo physics engine, making them easy to use and modify.

CONTINUOUS CONTROL
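
Loading and stepping one of the tasks takes a few lines (a sketch assuming `dm_control` and MuJoCo are installed):

```python
import numpy as np
from dm_control import suite

# Load a benchmark task by (domain, task) name.
env = suite.load(domain_name="cartpole", task_name="swingup")
spec = env.action_spec()

time_step = env.reset()
for _ in range(20):
    action = np.random.uniform(spec.minimum, spec.maximum, size=spec.shape)
    time_step = env.step(action)

# Each TimeStep carries the reward and a dict of named observations.
print(time_step.reward, sorted(time_step.observation))
```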

Simple random search provides a competitive approach to reinforcement learning

19 Mar 2018 modestyachts/ARS

A common belief in model-free reinforcement learning is that methods based on random search in the parameter space of policies exhibit significantly worse sample complexity than those that explore the space of actions. We dispel such beliefs by introducing a random search method for training static, linear policies for continuous control problems, matching state-of-the-art sample efficiency on the benchmark MuJoCo locomotion tasks.

CONTINUOUS CONTROL
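
The basic random search update is short; a simplified NumPy sketch (the full method in the paper adds state normalization and keeps only the top-performing directions):

```python
import numpy as np

def ars_step(theta, rollout_return, n_dirs=8, nu=0.05, lr=0.02, rng=None):
    # `rollout_return(theta)` is assumed to run one episode with the
    # linear policy a = theta @ s and return its total reward.
    rng = rng if rng is not None else np.random.default_rng()
    update = np.zeros_like(theta)
    for _ in range(n_dirs):
        delta = rng.standard_normal(theta.shape)
        # Evaluate symmetric perturbations of the current weights.
        r_plus = rollout_return(theta + nu * delta)
        r_minus = rollout_return(theta - nu * delta)
        update += (r_plus - r_minus) * delta
    # Move along the average return difference over sampled directions.
    return theta + lr / n_dirs * update
```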

VIME: Variational Information Maximizing Exploration

NeurIPS 2016 openai/vime

Scalable and effective exploration remains a key challenge in reinforcement learning (RL). While there are methods with optimality guarantees in the setting of discrete state and action spaces, these methods cannot be applied in high-dimensional deep RL scenarios.

CONTINUOUS CONTROL

High-Dimensional Continuous Control Using Generalized Advantage Estimation

8 Jun 2015 pat-coady/trpo

Policy gradient methods are an appealing approach in reinforcement learning because they directly optimize the cumulative reward and can straightforwardly be used with nonlinear function approximators such as neural networks. The two main challenges are the large number of samples typically required, and the difficulty of obtaining stable and steady improvement despite the nonstationarity of the incoming data.

CONTINUOUS CONTROL POLICY GRADIENT METHODS
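
The estimator itself is a short backward pass over a trajectory; a NumPy sketch (`gamma` is the discount factor and `lam` the GAE parameter):

```python
import numpy as np

def gae(rewards, values, gamma=0.99, lam=0.95):
    # `values` has one more entry than `rewards`: V(s_0)..V(s_T), where
    # V(s_T) bootstraps the final state. The estimator is
    #   A_t = sum_l (gamma*lam)^l * delta_{t+l},
    # with delta_t = r_t + gamma*V(s_{t+1}) - V(s_t).
    T = len(rewards)
    advantages = np.zeros(T)
    running = 0.0
    for t in reversed(range(T)):
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages
```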