no code implementations • NeurIPS 2021 • Giorgia Ramponi, Alberto Maria Metelli, Alessandro Concetti, Marcello Restelli
This presupposes that the two actors share the same reward function.