Obstacle Tower is a high-fidelity, 3D, third-person, procedurally generated environment for reinforcement learning. An agent playing Obstacle Tower must learn to solve both low-level control and high-level planning problems in tandem while learning from pixels and a sparse reward signal. Unlike other benchmarks such as the Arcade Learning Environment, evaluation of agent performance in Obstacle Tower is based on an agent’s ability to perform well on unseen instances of the environment.
19 PAPERS • 6 BENCHMARKS
POPGym is designed to benchmark memory in deep reinforcement learning. It contains a set of environments and a collection of memory model baselines. The environments are all Partially Observable Markov Decision Process (POMDP) environments following the OpenAI Gym interface. Our environments follow a few basic tenets:
2 PAPERS • 1 BENCHMARK
The bipedal skills benchmark is a suite of reinforcement learning environments implemented for the MuJoCo physics simulator. It aims to provide a set of tasks that demand a variety of motor skills beyond locomotion, and is intended for evaluating skill discovery and hierarchical learning methods. The majority of tasks exhibit a sparse reward structure.
2 PAPERS • NO BENCHMARKS YET