Interpretable Reinforcement Learning With Neural Symbolic Logic

1 Jan 2021  ·  Zhihao Ma, Yuzheng Zhuang, Paul Weng, Dong Li, Kun Shao, Wulong Liu, Hankz Hankui Zhuo, Jianye Hao

Recent progress in deep reinforcement learning (DRL) can be largely attributed to the use of neural networks. However, this black-box approach fails to explain the learned policy in a human-understandable way. To address this challenge and improve transparency, we introduce symbolic logic into DRL and propose a Neural Symbolic Reinforcement Learning framework, in which states and actions are represented in an interpretable way using first-order logic. The framework features a relational reasoning module that operates at the task level in hierarchical reinforcement learning, enabling end-to-end learning with prior symbolic knowledge. Interpretability is achieved by extracting the logical rules learned by the reasoning module from its symbolic rule space, providing task-level explanations. Experimental results demonstrate better interpretability of subtasks, along with performance competitive with existing approaches.
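To make the idea concrete, the following is a minimal, hypothetical sketch (not the paper's actual implementation) of the two ingredients the abstract describes: states represented as sets of first-order ground atoms, and a reasoning module whose learned clause weights can be read off as human-readable rules that select subtasks. The predicates, subtask names, and weights are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Atom:
    """A first-order ground atom, e.g. holding(key)."""
    predicate: str
    args: tuple

    def __str__(self):
        return f"{self.predicate}({', '.join(self.args)})"

# A symbolic state: the agent holds a key and stands at the door.
state = {
    Atom("holding", ("key",)),
    Atom("at", ("agent", "door")),
}

# Toy rule base: each clause maps a body (a set of atoms) to a subtask,
# with a weight. In the framework, such weights would be learned
# end-to-end by the relational reasoning module; here they are fixed.
rules = [
    ({Atom("holding", ("key",)), Atom("at", ("agent", "door"))}, "open_door", 0.92),
    ({Atom("at", ("agent", "key"))}, "pick_up_key", 0.85),
]

def select_subtask(state):
    """Fire the highest-weighted rule whose body is satisfied by the state."""
    fired = [(weight, head) for body, head, weight in rules if body <= state]
    return max(fired)[1] if fired else None

def extract_rule(subtask):
    """Render the clause for a subtask as a readable logic rule."""
    for body, head, _ in rules:
        if head == subtask:
            return f"{head} :- " + ", ".join(sorted(str(a) for a in body))
    return None

print(select_subtask(state))        # → open_door
print(extract_rule("open_door"))    # → open_door :- at(agent, door), holding(key)
```

The extracted clause is the kind of task-level explanation the abstract refers to: it states, in logic, under which symbolic conditions a subtask is chosen.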

