OpenAI Gym
175 papers with code • 17 benchmarks • 3 datasets
An open-source toolkit from OpenAI that implements several reinforcement learning benchmarks, including classic control, Atari, robotics, and MuJoCo tasks.
(Description by Evolutionary learning of interpretable decision trees)
(Image Credit: OpenAI Gym)
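The benchmarks in the toolkit share a common interaction pattern: an agent resets an environment, then repeatedly sends an action and receives an observation and a reward until the episode ends. The sketch below illustrates that loop with a toy stand-in written in plain Python. `CoinFlipEnv` and `run_episode` are hypothetical names, not part of Gym, and the four-value `step` return follows the classic Gym convention; newer releases are understood to return additional values, so treat the exact signature as an assumption.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment mimicking the classic reset/step interface."""

    def __init__(self, horizon=10, seed=0):
        self.horizon = horizon
        self.rng = random.Random(seed)
        self.t = 0
        self.coin = 0

    def reset(self):
        # Start a new episode and return the first observation.
        self.t = 0
        self.coin = self.rng.randint(0, 1)
        return self.coin

    def step(self, action):
        # Reward 1 when the action matches the coin that was just observed.
        reward = 1.0 if action == self.coin else 0.0
        self.t += 1
        self.coin = self.rng.randint(0, 1)
        done = self.t >= self.horizon
        return self.coin, reward, done, {}


def run_episode(env, policy):
    """Roll out one episode and return the summed reward."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, done, info = env.step(policy(obs))
        total += reward
    return total


# A policy that simply echoes the observation matches the coin every step.
episode_return = run_episode(CoinFlipEnv(horizon=10), policy=lambda obs: obs)
```

Because the environment is hidden behind `reset` and `step`, the same `run_episode` loop could in principle drive any task that exposes this interface.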
Libraries
Use these libraries to find OpenAI Gym models and implementations
Most implemented papers
Proximal Policy Optimization Algorithms
We propose a new family of policy gradient methods for reinforcement learning, which alternate between sampling data through interaction with the environment, and optimizing a "surrogate" objective function using stochastic gradient ascent.
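The abstract above does not spell out the surrogate objective. One form described in the PPO paper is the clipped objective, which takes the minimum of the unclipped and clipped probability-ratio terms. The following is a minimal sketch of that form only; the function name and the clip range of 0.2 are illustrative assumptions, and in practice the value would be computed on batches of tensors rather than Python lists.

```python
def clipped_surrogate(ratios, advantages, eps=0.2):
    """Mean of min(r * A, clip(r, 1 - eps, 1 + eps) * A) over samples.

    ratios:     pi_new(a|s) / pi_old(a|s) for each sampled action
    advantages: advantage estimate for each sampled action
    eps:        clip range (illustrative default, an assumption here)
    """
    terms = []
    for r, a in zip(ratios, advantages):
        clipped_r = max(1.0 - eps, min(r, 1.0 + eps))
        terms.append(min(r * a, clipped_r * a))
    return sum(terms) / len(terms)


# A large ratio with positive advantage is capped at 1 + eps;
# a small ratio with negative advantage is capped at 1 - eps.
objective = clipped_surrogate([1.5, 0.5], [1.0, -1.0])
```

This quantity is maximized by stochastic gradient ascent on the policy parameters, alternating with fresh data collection as the abstract describes.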
Continuous control with deep reinforcement learning
We adapt the ideas underlying the success of Deep Q-Learning to the continuous action domain.
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
A platform for Applied Reinforcement Learning (Applied RL)
Addressing Function Approximation Error in Actor-Critic Methods
In value-based reinforcement learning methods such as deep Q-learning, function approximation errors are known to lead to overestimated value estimates and suboptimal policies.
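The overestimation effect mentioned above can be illustrated with a small simulation: when every action has the same true value but the estimates are noisy, taking the maximum over the estimates is biased upward. This is a sketch of the general maximization-bias phenomenon under assumed Gaussian noise, not a reproduction of the paper's experiments; the function name and all parameters are hypothetical.

```python
import random


def mean_max_estimate(n_actions=5, noise_std=1.0, trials=2000, seed=0):
    """Average of max over noisy value estimates whose true values are all 0."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Each action's true value is 0; the estimate adds independent noise.
        estimates = [rng.gauss(0.0, noise_std) for _ in range(n_actions)]
        total += max(estimates)
    return total / trials


# The true maximum is 0, yet the average of the noisy maximum is clearly positive.
bias = mean_max_estimate()
```

With a single action there is nothing to maximize over, so the same function returns a value close to zero, which suggests the bias comes from the max operator rather than from the noise alone.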
Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research
The purpose of this technical report is two-fold.
Decision Transformer: Reinforcement Learning via Sequence Modeling
In particular, we present Decision Transformer, an architecture that casts the problem of RL as conditional sequence modeling.
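To make "conditional sequence modeling" concrete: in the Decision Transformer paper, each trajectory is turned into a sequence of return-to-go, state, and action tokens, where the return-to-go at a step is the sum of rewards from that step onward. The sketch below shows only this data preparation, assuming undiscounted returns; the helper names and the tuple-based token format are hypothetical, and the transformer model itself is omitted.

```python
def returns_to_go(rewards):
    """Suffix sums of the reward sequence (undiscounted)."""
    rtg, running = [], 0.0
    for r in reversed(rewards):
        running += r
        rtg.append(running)
    rtg.reverse()
    return rtg


def interleave(rtg, states, actions):
    """Build the (return-to-go, state, action) token sequence for one trajectory."""
    tokens = []
    for g, s, a in zip(rtg, states, actions):
        tokens.extend([("return_to_go", g), ("state", s), ("action", a)])
    return tokens


rtg = returns_to_go([1.0, 0.0, 2.0])
tokens = interleave(rtg, states=["s0", "s1", "s2"], actions=["a0", "a1", "a2"])
```

A sequence model trained to predict the action tokens can then be conditioned on a desired return by setting the first return-to-go token at test time.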
Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning
In this paper, we aim to develop a simple and scalable reinforcement learning algorithm that uses standard supervised learning methods as subroutines.
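The "supervised learning as a subroutine" idea can be sketched as a weighted regression: each sampled action is weighted by the exponentiated advantage, and the policy is fit by weighted maximum likelihood. The code below is a minimal sketch under stated assumptions; the temperature `beta`, the weight cap `max_weight`, and both function names are illustrative choices rather than values confirmed by the text above.

```python
import math


def awr_weights(returns, values, beta=1.0, max_weight=20.0):
    """Per-sample weights exp((return - value) / beta), capped for stability."""
    return [
        min(math.exp((ret - v) / beta), max_weight)
        for ret, v in zip(returns, values)
    ]


def weighted_nll(log_probs, weights):
    """Weighted negative log-likelihood; minimizing it fits the policy."""
    return -sum(w * lp for w, lp in zip(weights, log_probs)) / len(weights)


# Zero advantage gives weight 1; a very large advantage hits the cap.
weights = awr_weights(returns=[1.0, 101.0], values=[1.0, 1.0])
loss = weighted_nll(log_probs=[-1.0, -2.0], weights=[1.0, 0.5])
```

Actions that did better than the value baseline receive weights above 1, so the regression pulls the policy toward them without requiring a policy-gradient estimator.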
Deep Recurrent Q-Learning for Partially Observable MDPs
Deep Reinforcement Learning has yielded proficient controllers for complex tasks.
Deep Reinforcement Learning for Playing 2.5D Fighting Games
Deep reinforcement learning has shown its success in game playing.
Comparing the Efficacy of Fine-Tuning and Meta-Learning for Few-Shot Policy Imitation
Despite its simplicity, this baseline is competitive with meta-learning methods under a variety of conditions and is able to imitate target policies trained on unseen variations of the original environment.