TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
OpenAI Gym	Cart Pole (OpenAI Gym)	Orthogonal decision tree	Average Return	500	# 1
OpenAI Gym	Cart Pole (OpenAI Gym)	Oblique decision tree	Average Return	500	# 1
OpenAI Gym	CartPole-v1	Orthogonal decision tree	Average Return	500	# 1
OpenAI Gym	CartPole-v1	Oblique decision tree	Average Return	500	# 1
OpenAI Gym	LunarLander-v2	Oblique decision tree	Average Return	272.14	# 1
OpenAI Gym	Mountain Car	Oblique decision tree	Average Return	-106.02	# 1
OpenAI Gym	Mountain Car	Orthogonal decision tree	Average Return	-101.72	# 1


Evolutionary learning of interpretable decision trees

14 Dec 2020 · Leonardo Lucio Custode, Giovanni Iacca

Reinforcement learning techniques have achieved human-level performance in several tasks over the last decade. In recent years, however, the need for interpretability has emerged: we want to be able to understand how a system works and the reasons behind its decisions. Not only do we need interpretability to assess the safety of the produced systems, but we also need it to extract knowledge about unknown problems. While some techniques that optimize decision trees for reinforcement learning do exist, they usually employ greedy algorithms or do not exploit the rewards given by the environment, so they may easily get stuck in local optima. In this work, we propose a novel approach to interpretable reinforcement learning that uses decision trees. We present a two-level optimization scheme that combines the advantages of evolutionary algorithms with those of Q-learning. In this way, we decompose the problem into two sub-problems: finding a meaningful and useful decomposition of the state space, and associating an action with each state. We test the proposed method on three well-known reinforcement learning benchmarks, on which it is competitive with the state of the art in both performance and interpretability. Finally, we perform an ablation study, which confirms that the two-level optimization scheme improves performance in non-trivial environments with respect to a one-layer optimization technique.
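The two-level idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy 1-D environment, the single-split tree, the function names, and the hyperparameters are all assumptions made for illustration, and the outer level uses a simple (1 + lambda) evolution strategy over one threshold in place of the method's own evolutionary search. The outer loop searches for a decomposition of the state space, while the inner loop uses Q-learning to associate an action with each leaf.

```python
import random

N_ACTIONS = 2          # 0 = move left, 1 = move right
STEP, HORIZON = 0.1, 20


def env_step(x, action):
    """Toy 1-D task (an assumption, not from the paper): stay close to x = 0."""
    x = max(-1.0, min(1.0, x + (STEP if action == 1 else -STEP)))
    return x, -abs(x)


def leaf_of(x, threshold):
    """Outer-level structure: one orthogonal split of the state space."""
    return 0 if x < threshold else 1


def train_leaves(threshold, rng, episodes=200, alpha=0.1, gamma=0.9, eps=0.1):
    """Inner level: tabular Q-learning where each leaf plays the role of a state."""
    q = [[0.0] * N_ACTIONS for _ in range(2)]
    for _ in range(episodes):
        x = rng.uniform(-1.0, 1.0)
        for _ in range(HORIZON):
            leaf = leaf_of(x, threshold)
            if rng.random() < eps:                      # explore
                action = rng.randrange(N_ACTIONS)
            else:                                       # exploit
                action = max(range(N_ACTIONS), key=lambda a: q[leaf][a])
            x, reward = env_step(x, action)
            target = reward + gamma * max(q[leaf_of(x, threshold)])
            q[leaf][action] += alpha * (target - q[leaf][action])
    return q


def fitness(threshold, q):
    """Average return of the greedy tree policy over a fixed grid of start states."""
    starts = [i / 10 for i in range(-10, 11)]
    total = 0.0
    for x in starts:
        for _ in range(HORIZON):
            leaf = leaf_of(x, threshold)
            action = max(range(N_ACTIONS), key=lambda a: q[leaf][a])
            x, reward = env_step(x, action)
            total += reward
    return total / len(starts)


def evolve(generations=15, offspring=4, sigma=0.3, seed=0):
    """Outer level: a simple (1 + lambda) evolution strategy over the threshold."""
    rng = random.Random(seed)
    best_t = rng.uniform(-1.0, 1.0)
    best_q = train_leaves(best_t, rng)
    best_fit = fitness(best_t, best_q)
    history = [best_fit]
    for _ in range(generations):
        for _ in range(offspring):
            # Mutate the split, then let Q-learning fill in the leaf actions.
            t = max(-1.0, min(1.0, best_t + rng.gauss(0.0, sigma)))
            q = train_leaves(t, rng)
            fit = fitness(t, q)
            if fit >= best_fit:
                best_t, best_q, best_fit = t, q, fit
        history.append(best_fit)
    return best_t, best_q, best_fit, history
```

Calling `evolve()` returns the best threshold found, the Q-values of its two leaves, its fitness, and the per-generation fitness history. Because the parent is replaced only when an offspring is at least as good, the history never decreases; in this toy task a threshold near 0 would be expected to score well, since the two leaves can then learn opposite actions.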


Code


leocus/ge_q_dts official

Tasks


Evolutionary Algorithms

OpenAI Gym

reinforcement-learning

Reinforcement Learning (RL)

Datasets

OpenAI Gym

Results from the Paper


Ranked #1 on OpenAI Gym on Cart Pole (OpenAI Gym)


Methods


Grammatical evolution + Q-learning
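To make the grammatical evolution half of the method concrete, here is a minimal sketch of the standard genotype-to-phenotype mapping: a list of integer codons is decoded into a tree by leftmost derivation, picking each production as `codon % number_of_productions`. The grammar, the function name `decode`, and the `max_expansions` guard are hypothetical and chosen for illustration; they are not taken from the paper or its repository. Leaves carry no action here, on the assumption that actions are assigned separately by Q-learning.

```python
# Hypothetical grammar for small decision trees over a 4-feature state vector.
GRAMMAR = {
    "<tree>": [["if ", "<cond>", " then ", "<node>", " else ", "<node>"]],
    "<node>": [["leaf"], ["(", "<tree>", ")"]],
    "<cond>": [["x[", "<idx>", "] < ", "<const>"]],
    "<idx>": [["0"], ["1"], ["2"], ["3"]],
    "<const>": [["-0.5"], ["0.0"], ["0.5"]],
}


def decode(genotype, start="<tree>", max_expansions=100):
    """Map a non-empty list of integer codons to a string by leftmost derivation.

    Returns None when the derivation does not terminate within max_expansions.
    """
    symbols = [start]      # pending symbols; non-terminals are keys of GRAMMAR
    output = []
    used = 0               # number of codons consumed so far
    expansions = 0
    while symbols:
        symbol = symbols.pop(0)
        if symbol not in GRAMMAR:          # terminal: emit as-is
            output.append(symbol)
            continue
        expansions += 1
        if expansions > max_expansions:
            return None
        productions = GRAMMAR[symbol]
        if len(productions) == 1:          # no choice to make, no codon consumed
            choice = productions[0]
        else:
            codon = genotype[used % len(genotype)]   # wrap around the genotype
            choice = productions[codon % len(productions)]
            used += 1
        symbols = list(choice) + symbols
    return "".join(output)


print(decode([2, 1, 0, 4]))  # if x[2] < 0.0 then leaf else leaf
```

A genotype such as `[1]` always selects the recursive `<node>` production, so the derivation never terminates and `decode` returns `None`; an evolutionary loop would typically treat such individuals as invalid.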
