OpenAI Gym

160 papers with code • 9 benchmarks • 3 datasets

OpenAI Gym is an open-source toolkit from OpenAI that implements several reinforcement learning benchmarks, including classic control, Atari, robotics, and MuJoCo tasks.

(Description by Evolutionary learning of interpretable decision trees)
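The benchmarks listed on this page share one agent-environment interface, which is what makes results comparable across tasks. The following is a minimal sketch of that interaction loop, assuming the classic `reset`/`step` convention in which `step` returns an observation, reward, done flag, and info dictionary (newer Gym releases and Gymnasium split the done flag into terminated and truncated). `ToyCorridorEnv` and `run_episode` are hypothetical names written for illustration; they are not part of Gym, and the sketch deliberately avoids importing the library so that it runs with the standard library alone.

```python
import random


class ToyCorridorEnv:
    """Hypothetical 1-D corridor (not a Gym environment).

    The agent starts in cell 0 and the episode ends when it reaches the
    last cell or when max_steps is exhausted.
    """

    def __init__(self, length=5, max_steps=50):
        self.length = length
        self.max_steps = max_steps
        self.position = 0
        self.steps = 0

    def reset(self):
        # Return the initial observation, as in the classic Gym convention.
        self.position = 0
        self.steps = 0
        return self.position

    def step(self, action):
        # Action 1 moves right, any other action moves left; walls clamp.
        move = 1 if action == 1 else -1
        self.position = min(max(self.position + move, 0), self.length - 1)
        self.steps += 1
        reached_goal = self.position == self.length - 1
        done = reached_goal or self.steps >= self.max_steps
        reward = 1.0 if reached_goal else 0.0
        return self.position, reward, done, {}


def run_episode(env, policy):
    """Run one episode and return the undiscounted sum of rewards."""
    obs = env.reset()
    total_reward, done = 0.0, False
    while not done:
        obs, reward, done, _info = env.step(policy(obs))
        total_reward += reward
    return total_reward


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so the random policy is reproducible
    print(run_episode(ToyCorridorEnv(), lambda obs: rng.choice([0, 1])))
```

With the real library, the hand-written class would typically be replaced by an environment obtained from `gym.make` with an ID such as `CartPole-v1`, while the loop in `run_episode` keeps the same shape.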

(Image Credit: OpenAI Gym)

Benchmarks

These leaderboards track progress on OpenAI Gym.

Dataset	Best Model
LunarLander-v2	Oblique decision tree
CartPole-v1	Orthogonal decision tree
Mountain Car	Orthogonal decision tree
Cart Pole (OpenAI Gym)	Oblique decision tree
Ant-v2	AWR
HalfCheetah-v2	AWR
Hopper-v2	AWR
Humanoid-v2	AWR
Walker2d-v2	AWR

Libraries

Use these libraries to find OpenAI Gym models and implementations.

toni-sm/skrl

2 papers

405

Datasets

Subtasks

Acrobot

Latest papers with no code

Bridging Dimensions: Confident Reachability for High-Dimensional Controllers

no code yet • 8 Nov 2023

Autonomous systems are increasingly implemented using end-to-end learning-based controllers.

Neural architecture impact on identifying temporally extended Reinforcement Learning tasks

no code yet • 4 Oct 2023

In addition, motivated by recent developments in attention-based video-classification models using the Vision Transformer, we propose an architecture based on the Vision Transformer for the image-based RL domain as well.

Optimizing with Low Budgets: a Comparison on the Black-box Optimization Benchmarking Suite and OpenAI Gym

no code yet • 29 Sep 2023

BO-based algorithms are popular in the ML community, as they are used for hyperparameter optimization and more generally for algorithm configuration.

Implicit Sensing in Traffic Optimization: Advanced Deep Reinforcement Learning Techniques

no code yet • 25 Sep 2023

The results unequivocally demonstrate that the DQN agent trained using the $\epsilon$-greedy policy significantly outperforms the one trained with the Boltzmann policy.

gym-saturation: Gymnasium environments for saturation provers (System description)

no code yet • 16 Sep 2023

This work describes a new version of a previously published Python package - gym-saturation: a collection of OpenAI Gym environments for guiding saturation-style provers based on the given clause algorithm with reinforcement learning.

Attention Loss Adjusted Prioritized Experience Replay

no code yet • 13 Sep 2023

Prioritized Experience Replay (PER) is a deep reinforcement learning technique that selects the more informative experience samples to improve the training rate of the neural network.

Distributionally Robust Statistical Verification with Imprecise Neural Networks

no code yet • 28 Aug 2023

A particularly challenging problem in AI safety is providing guarantees on the behavior of high-dimensional autonomous systems.

Statistically Efficient Variance Reduction with Double Policy Estimation for Off-Policy Evaluation in Sequence-Modeled Reinforcement Learning

no code yet • 28 Aug 2023

Offline reinforcement learning aims to utilize datasets of previously gathered environment-action interaction records to learn a policy without access to the real environment.

On Combining Expert Demonstrations in Imitation Learning via Optimal Transport

no code yet • 20 Jul 2023

One of the key approaches to IL is to define a distance between agent and expert and to find an agent policy that minimizes that distance.

Scaling Distributed Multi-task Reinforcement Learning with Experience Sharing

no code yet • 11 Jul 2023

Our research demonstrates that to achieve $\epsilon$-optimal policies for all $M$ tasks, a single agent using DistMT-LSVI needs to run a total number of episodes that is at most $\tilde{\mathcal{O}}({d^3H^6(\epsilon^{-2}+c_{\rm sep}^{-2})}\cdot M/N)$, where $c_{\rm sep}>0$ is a constant representing task separability, $H$ is the horizon of each episode, and $d$ is the feature dimension of the dynamics and rewards.
