no code implementations • 2 Jul 2023 • Rémy Chaput, Olivier Boissier, Mathieu Guillermin
In this paper, we present two algorithms, named QSOM and QDSOM, that can adapt to changes in the environment, and especially to changes in the reward function, which represents the ethical considerations with which we want these systems to be aligned.