Continuous Control
424 papers with code • 0 benchmarks • 0 datasets
Libraries
Use these libraries to find continuous-control models and implementations.

Most implemented papers
Continuous control with deep reinforcement learning
We adapt the ideas underlying the success of Deep Q-Learning to the continuous action domain.
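The core idea (DDPG) can be sketched in a few lines: because the actor is deterministic, the max over actions in the Q-learning target is replaced by the actor's own output. The linear actor and critic below are hypothetical toy stand-ins, not the paper's networks:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear actor and critic (toy stand-ins for the paper's networks).
W_actor = rng.normal(size=(2, 3))     # maps 3-dim state -> 2-dim action
w_critic = rng.normal(size=3 + 2)     # linear Q over concatenated [state, action]

def actor(s):
    # Deterministic policy: a = tanh(W s), a bounded continuous action.
    return np.tanh(W_actor @ s)

def critic(s, a):
    return w_critic @ np.concatenate([s, a])

# DDPG-style TD target for a transition (s, a, r, s'): the intractable
# max over continuous actions is replaced by the actor's action at s'.
s, a = rng.normal(size=3), rng.normal(size=2)
r, s_next, gamma = 1.0, rng.normal(size=3), 0.99
target = r + gamma * critic(s_next, actor(s_next))
td_error = target - critic(s, a)
```

In the full algorithm, the critic is regressed toward this target and the actor ascends the critic's gradient with respect to the action.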
Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research
The purpose of this technical report is two-fold.
Simple random search provides a competitive approach to reinforcement learning
A common belief in model-free reinforcement learning is that methods based on random search in the parameter space of policies exhibit significantly worse sample complexity than those that explore the space of actions.
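A minimal sketch of the contrarian point: even pure random search over policy parameters makes steady progress on a smooth return surface. The quadratic `reward` here is a hypothetical stand-in for an episodic return; the paper's ARS method refines this loop with antithetic perturbations and update normalization:

```python
import numpy as np

rng = np.random.default_rng(1)

def reward(theta):
    # Hypothetical stand-in for an episodic return; peak at theta = [1, -2].
    return -np.sum((theta - np.array([1.0, -2.0])) ** 2)

# Random search in policy-parameter space: sample a perturbation of the
# parameters and keep it only if the episodic return improves.
theta = np.zeros(2)
best = reward(theta)
for _ in range(300):
    candidate = theta + rng.normal(scale=0.2, size=2)
    r = reward(candidate)
    if r > best:
        theta, best = candidate, r
```

No action-space exploration, value function, or gradient is involved; the only signal is the scalar return of each perturbed policy.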
Conservative Q-Learning for Offline Reinforcement Learning
We theoretically show that CQL produces a lower bound on the value of the current policy and that it can be incorporated into a policy learning procedure with theoretical improvement guarantees.
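The conservative regularizer can be illustrated on a toy tabular Q-function (a hypothetical setup, not the paper's neural implementation): CQL minimizes a soft maximum of Q over all actions while pushing Q back up on actions actually present in the offline dataset, which is what yields the lower bound:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy sketch of the CQL penalty for one state with 5 discrete actions.
q = rng.normal(size=5)          # Q(s, a) for each action
data_action = 2                 # the action observed in the offline dataset
alpha, lr = 1.0, 0.1

for _ in range(200):
    # Gradient of alpha * (logsumexp(Q) - Q[data_action]): logsumexp pushes
    # every action's value down in proportion to softmax(Q), while the
    # dataset action is pushed back up.
    soft = np.exp(q - q.max()); soft /= soft.sum()
    grad = alpha * soft
    grad[data_action] -= alpha
    q -= lr * grad
```

After training, out-of-distribution actions are suppressed relative to the in-dataset action, so the learned values underestimate rather than overestimate unseen actions.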
High-Dimensional Continuous Control Using Generalized Advantage Estimation
Policy gradient methods are an appealing approach in reinforcement learning because they directly optimize the cumulative reward and can straightforwardly be used with nonlinear function approximators such as neural networks.
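GAE itself is a short backward recursion over TD residuals, A_t = Σ_l (γλ)^l δ_{t+l} with δ_t = r_t + γV(s_{t+1}) − V(s_t). A self-contained sketch:

```python
import numpy as np

def gae(rewards, values, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation: an exponentially weighted sum of
    TD residuals. `values` has one extra entry for the bootstrap V(s_T)."""
    T = len(rewards)
    adv = np.zeros(T)
    last = 0.0
    for t in reversed(range(T)):
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        last = delta + gamma * lam * last
        adv[t] = last
    return adv

r = np.array([1.0, 1.0, 1.0])
v = np.array([0.5, 0.5, 0.5, 0.0])
# lam=0 reduces to one-step TD residuals (low variance, biased):
a_td = gae(r, v, gamma=1.0, lam=0.0)      # -> [1.0, 1.0, 0.5]
# lam=1 reduces to Monte Carlo return minus baseline (unbiased, high variance):
a_mc = gae(r, v, gamma=1.0, lam=1.0)      # -> [2.5, 1.5, 0.5]
```

The λ parameter interpolates between the two extremes, which is the bias–variance trade-off the paper exploits.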
Benchmarking Deep Reinforcement Learning for Continuous Control
Recently, researchers have made significant progress combining the advances in deep learning for learning feature representations with reinforcement learning.
Parameter Space Noise for Exploration
Combining parameter noise with traditional RL methods makes it possible to get the best of both worlds.
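The distinction is easy to demonstrate with a hypothetical linear policy: action-space noise gives uncorrelated actions at the same state, whereas perturbing the weights once per episode yields exploration that is consistent and state-dependent for the whole rollout:

```python
import numpy as np

rng = np.random.default_rng(3)

W = rng.normal(size=(2, 4))   # hypothetical linear policy weights

def act(weights, s):
    return np.tanh(weights @ s)

# Parameter-space noise: perturb the weights once, then act greedily with
# the perturbed policy for the entire episode.
W_noisy = W + rng.normal(scale=0.1, size=W.shape)
s = rng.normal(size=4)
a1, a2 = act(W_noisy, s), act(W_noisy, s)
# The same state maps to the same exploratory action within the episode,
# unlike i.i.d. action-space noise added at every step.
```

In practice the perturbation scale is adapted so the perturbed policy's actions stay within a target distance of the unperturbed ones.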
Off-Policy Deep Reinforcement Learning without Exploration
Many practical applications of reinforcement learning constrain agents to learn from a fixed batch of data which has already been gathered, without offering further possibility for data collection.
Controlling Overestimation Bias with Truncated Mixture of Continuous Distributional Quantile Critics
The overestimation bias is one of the major impediments to accurate off-policy learning.
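The truncation step that gives the method its name can be sketched directly (a hypothetical toy setup with hand-picked quantile values): pool the quantile estimates from all critics, drop the largest ones, and use the mean of the rest as the target:

```python
import numpy as np

# Quantile estimates of the return from two hypothetical distributional critics.
quantiles = np.array([[0.1, 0.5, 2.0],    # critic 1 (one high outlier)
                      [0.2, 0.4, 0.6]])   # critic 2
pooled = np.sort(quantiles.ravel())
k_drop = 2                                 # number of top quantiles to truncate
target = pooled[:-k_drop].mean()           # 0.3, vs. plain mean of ~0.633
```

Because overestimation errors concentrate in the upper quantiles, dropping them biases the target downward in a controlled, tunable way (via `k_drop`), rather than taking a hard minimum over critics.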
Learning Latent Dynamics for Planning from Pixels
Planning has been very successful for control tasks with known environment dynamics.