However, current research has been limited to interactions that offer actionable advice only for the agent's current state.
eXplainable RL (XRL) is a relatively recent field of research that aims to develop techniques for extracting concepts from an agent's perception of the environment; its intrinsic/extrinsic motivations and beliefs; and its Q-values, goals, and objectives.
Explainable reinforcement learning allows artificial agents to explain their behavior in a human-like manner, aimed at non-expert end-users.
Over the last few years, there has been rapid growth in research on eXplainable Artificial Intelligence (XAI) and the closely aligned field of Interpretable Machine Learning (IML).
17 Mar 2021 • Conor F. Hayes, Roxana Rădulescu, Eugenio Bargiacchi, Johan Källström, Matthew Macfarlane, Mathieu Reymond, Timothy Verstraeten, Luisa M. Zintgraf, Richard Dazeley, Fredrik Heintz, Enda Howley, Athirai A. Irissappane, Patrick Mannion, Ann Nowé, Gabriel Ramos, Marcello Restelli, Peter Vamplew, Diederik M. Roijers
Real-world decision-making tasks are generally complex, requiring trade-offs between multiple, often conflicting, objectives.
Interactive reinforcement learning speeds up the learning process in autonomous agents by including a human trainer who provides extra information to the agent in real time.
To address this issue, interactive reinforcement learning proposes the use of externally sourced information to speed up the learning process.
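A minimal sketch of how such externally sourced advice can enter the learning loop, assuming a simple tabular agent and a hypothetical `trainer.advise()` interface (neither is taken from the papers above): with some probability, the trainer's action overrides the agent's own epsilon-greedy choice.

```python
import random

def select_action(q_values, state, trainer=None,
                  advice_prob=0.3, epsilon=0.1):
    # Hypothetical interactive-RL action selection: an external
    # trainer occasionally supplies the action directly.
    if trainer is not None and random.random() < advice_prob:
        return trainer.advise(state)  # externally sourced advice (assumed API)
    if random.random() < epsilon:
        return random.randrange(len(q_values[state]))  # explore
    return max(range(len(q_values[state])),
               key=lambda a: q_values[state][a])       # exploit own Q-values
```

Even sparse advice of this form can prune large portions of the exploration space, which is the kind of speed-up referred to above.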
We compare three learning methods using a simulated robotic arm on the task of organizing different objects: (i) deep reinforcement learning (DeepRL); (ii) interactive deep reinforcement learning using a previously trained artificial agent as an advisor (agent-IDeepRL); and (iii) interactive deep reinforcement learning using a human advisor (human-IDeepRL).
In this work, while reviewing externally influenced methods, we propose a conceptual framework and taxonomy for assisted reinforcement learning, aimed at fostering collaboration by classifying and comparing the various methods that use external information in the learning process.
As a way to explain the goal-driven robot's actions, we use the probability of success computed by three different proposed approaches: memory-based, learning-based, and introspection-based.
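As a rough illustration of the memory-based variant only (the learning- and introspection-based approaches estimate the same quantity with a learned model or the agent's own Q-values), the sketch below keeps empirical per-state success frequencies; the class and method names are hypothetical.

```python
from collections import defaultdict

class SuccessMemory:
    # Illustrative memory-based estimate of P(success | state): record,
    # for every visited state, how often episodes passing through it
    # ended in task success, then report the empirical frequency.
    def __init__(self):
        self.visits = defaultdict(int)
        self.successes = defaultdict(int)

    def record_episode(self, states, succeeded):
        for s in set(states):          # count each state once per episode
            self.visits[s] += 1
            if succeeded:
                self.successes[s] += 1

    def probability_of_success(self, state):
        if self.visits[state] == 0:
            return 0.0                 # unseen state: no estimate yet
        return self.successes[state] / self.visits[state]
```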
From each discrete state, it then selects an input value and the action with the highest numerical preference as an input/target pair.
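Read as a dataset-construction step, this can be sketched as follows for a tabular Q-function; `q_table` and `state_features` are illustrative names, and the argmax over Q-values stands in for "highest numerical preference".

```python
import numpy as np

def build_dataset(q_table, state_features):
    # For each discrete state, pair the state's feature vector (input)
    # with its greedy action (target), yielding input/target pairs
    # suitable for supervised training of a simpler model.
    inputs, targets = [], []
    for state, q_values in enumerate(q_table):   # q_table: rows of Q-values
        inputs.append(state_features[state])
        targets.append(int(np.argmax(q_values))) # preferred action
    return np.array(inputs), np.array(targets)
```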
We report a previously unidentified issue with model-free, value-based approaches to multi-objective reinforcement learning in environments with stochastic state transitions.
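One way to see why stochastic transitions are the delicate case (standard multi-objective RL notation, not quoted from the paper): for a nonlinear utility u over the vector-valued return, the expected scalarized return and the scalarized expected return no longer coincide, and value-based methods implicitly optimize the latter.

```latex
% With nonlinear utility u and stochastic dynamics, E[u(return)]
% generally differs from u(E[return]):
\mathbb{E}\!\left[\, u\!\left(\textstyle\sum_{t} \gamma^{t}\,\vec{r}_{t}\right) \right]
\;\neq\;
u\!\left( \mathbb{E}\!\left[\textstyle\sum_{t} \gamma^{t}\,\vec{r}_{t}\right] \right)
```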
This paper introduces a new scalable multi-objective deep reinforcement learning (MODRL) framework based on deep Q-networks.
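A sketch of what "based on deep Q-networks" can look like in the multi-objective setting, assuming vector-valued Q-outputs and linear scalarization by a preference-weight vector; both are assumptions for illustration, not the framework's actual design.

```python
import torch
import torch.nn as nn

class MultiObjectiveQNet(nn.Module):
    # Illustrative multi-objective Q-network: one Q-value per
    # (action, objective) pair, reshaped to (batch, actions, objectives).
    def __init__(self, obs_dim, n_actions, n_objectives, hidden=128):
        super().__init__()
        self.n_actions = n_actions
        self.n_objectives = n_objectives
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions * n_objectives),
        )

    def forward(self, obs):
        q = self.net(obs)                                    # (batch, A*O)
        return q.view(-1, self.n_actions, self.n_objectives)

def greedy_action(q_net, obs, weights):
    # Scalarize the vector Q-values with the preference weights,
    # then act greedily on the resulting scalar Q-values.
    with torch.no_grad():
        q = q_net(obs.unsqueeze(0))            # (1, A, O)
        scalar_q = (q * weights).sum(dim=-1)   # (1, A)
        return int(scalar_q.argmax(dim=-1).item())
```

In this sketch the head outputs all action-objective pairs at once, so adding objectives widens one layer rather than adding networks.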