OpenAI Gym

135 papers with code • 9 benchmarks • 2 datasets

An open-source toolkit from OpenAI that implements several reinforcement learning benchmarks, including classic control, Atari, robotics, and MuJoCo tasks.

(Description by Evolutionary learning of interpretable decision trees)
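
To make the toolkit's role concrete, the sketch below mimics the classic Gym interaction loop with a toy environment written from scratch, so it runs without Gym installed. In that classic loop, `reset` returns an observation and `step` returns an observation, reward, done flag, and info dict. `CoinFlipEnv` and `run_episode` are hypothetical names used only for illustration, not part of Gym. Newer Gym and Gymnasium releases use a slightly different `reset`/`step` signature.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment following the classic Gym reset/step convention."""

    def __init__(self, horizon=10, seed=0):
        self.horizon = horizon
        self.rng = random.Random(seed)
        self.t = 0
        self.coin = 0

    def reset(self):
        # Start a new episode and return the first observation.
        self.t = 0
        self.coin = self.rng.randint(0, 1)
        return self.coin

    def step(self, action):
        # Reward 1.0 when the action matches the current coin, else 0.0.
        reward = 1.0 if action == self.coin else 0.0
        self.t += 1
        self.coin = self.rng.randint(0, 1)
        done = self.t >= self.horizon
        return self.coin, reward, done, {}


def run_episode(env, policy):
    """Roll out one episode and return the total reward."""
    obs = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, done, _info = env.step(policy(obs))
        total += reward
    return total


# A policy that copies the observed coin earns the maximum return.
episode_return = run_episode(CoinFlipEnv(horizon=10), policy=lambda obs: obs)
```

Any agent written against this `reset`/`step` contract can, in principle, be pointed at a different benchmark by swapping the environment object.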

Most implemented papers

Addressing Function Approximation Error in Actor-Critic Methods

sfujim/TD3 ICML 2018

In value-based reinforcement learning methods such as deep Q-learning, function approximation errors are known to lead to overestimated value estimates and suboptimal policies.
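
The overestimation effect can be illustrated numerically. In the minimal sketch below, every true action value is zero, so any positive average target is pure bias from taking a maximum over noisy estimates. The second target takes the minimum of two independent critics at the chosen action. This is a discrete-action analogue of the clipped double Q-learning idea associated with TD3, not the paper's implementation. The function name, noise model, and parameter values are assumptions for illustration.

```python
import random


def overestimation_demo(n_actions=4, noise_std=1.0, trials=10000, seed=0):
    """Compare a single-critic max target with a clipped two-critic target.

    All true action values are 0, so any positive average is pure bias.
    """
    rng = random.Random(seed)
    single_sum, clipped_sum = 0.0, 0.0
    for _ in range(trials):
        # Two independent noisy estimates of the same (zero) action values.
        q1 = [rng.gauss(0.0, noise_std) for _ in range(n_actions)]
        q2 = [rng.gauss(0.0, noise_std) for _ in range(n_actions)]
        best = max(range(n_actions), key=lambda a: q1[a])
        single_sum += q1[best]                  # max over one noisy critic
        clipped_sum += min(q1[best], q2[best])  # minimum of the two critics
    return single_sum / trials, clipped_sum / trials


single_bias, clipped_bias = overestimation_demo()
```

The single-critic average comes out positive even though the true values are zero, while the clipped target can never exceed it, because the minimum is taken per sample.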

Decision Transformer: Reinforcement Learning via Sequence Modeling

kzl/decision-transformer NeurIPS 2021

In particular, we present Decision Transformer, an architecture that casts the problem of RL as conditional sequence modeling.
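
One way to picture the sequence-modeling view is to flatten a trajectory into tokens that a sequence model could consume. The sketch below computes undiscounted returns-to-go and interleaves them with states and actions. The helper names and the tuple-based token format are hypothetical; this is not the repository's data pipeline.

```python
def returns_to_go(rewards):
    """Return the undiscounted sum of future rewards at every timestep."""
    rtg, running = [], 0.0
    for r in reversed(rewards):
        running += r
        rtg.append(running)
    return rtg[::-1]


def to_token_sequence(states, actions, rewards):
    """Interleave (return-to-go, state, action) triples into one flat sequence."""
    sequence = []
    for g, s, a in zip(returns_to_go(rewards), states, actions):
        sequence.extend([("rtg", g), ("state", s), ("action", a)])
    return sequence


tokens = to_token_sequence(
    states=["s0", "s1", "s2"],
    actions=[1, 0, 1],
    rewards=[1.0, 0.0, 2.0],
)
```

Conditioning on the return-to-go token is what lets a trained model be prompted with a desired return at evaluation time.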

Deep Recurrent Q-Learning for Partially Observable MDPs

marload/DeepRL-TensorFlow2 23 Jul 2015

Deep Reinforcement Learning has yielded proficient controllers for complex tasks.

Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning

google/trax 1 Oct 2019

In this paper, we aim to develop a simple and scalable reinforcement learning algorithm that uses standard supervised learning methods as subroutines.
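
The "supervised learning as a subroutine" idea can be sketched as a weighted regression: each sampled action gets a weight that grows exponentially with its advantage, and the policy is fit by maximizing the weighted log-likelihood. The function names, the temperature `beta`, and the clipping threshold `max_weight` below are illustrative assumptions, not values or code taken from the listed repository.

```python
import math


def awr_weights(returns, values, beta=1.0, max_weight=20.0):
    """Exponentiated-advantage weights for a weighted supervised policy update."""
    weights = []
    for ret, val in zip(returns, values):
        advantage = ret - val
        # Clip so a few large advantages do not dominate the regression.
        weights.append(min(math.exp(advantage / beta), max_weight))
    return weights


def weighted_log_likelihood(log_probs, weights):
    """Objective to maximize: batch average of weight * log pi(a|s)."""
    return sum(w * lp for w, lp in zip(weights, log_probs)) / len(weights)


weights = awr_weights(returns=[1.0, 0.0, 5.0], values=[0.0, 0.0, 0.0])
```

Actions with zero advantage keep weight 1, better-than-expected actions are up-weighted, and the clip bounds the influence of outliers.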

Deep Reinforcement Learning for Playing 2.5D Fighting Games

elvisyjlin/lf2gym 5 May 2018

Deep reinforcement learning has shown its success in game playing.

Maximum Entropy-Regularized Multi-Goal Reinforcement Learning

ruizhaogit/mep 21 May 2019

This objective encourages the agent to maximize the expected return, as well as to achieve more diverse goals.

TorchBeast: A PyTorch Platform for Distributed RL

heiner/scalable_agent 8 Oct 2019

TorchBeast is a platform for reinforcement learning (RL) research in PyTorch.

Implicit Distributional Reinforcement Learning

zhougroup/IDAC NeurIPS 2020

To improve the sample efficiency of policy-gradient based reinforcement learning algorithms, we propose implicit distributional actor-critic (IDAC) that consists of a distributional critic, built on two deep generator networks (DGNs), and a semi-implicit actor (SIA), powered by a flexible policy distribution.

A Benchmark Environment Motivated by Industrial Control Problems

siemens/industrialbenchmark 27 Sep 2017

On one hand, these benchmarks are designed to provide interpretable RL training scenarios and detailed insight into the learning process of the method on hand.