2 code implementations • 8 Nov 2021 • Kajetan Schweighofer, Markus Hofmarcher, Marius-Constantin Dinu, Philipp Renz, Angela Bitto-Nemling, Vihang Patil, Sepp Hochreiter
Algorithms that constrain the learned policy towards the given dataset perform well for datasets with high trajectory quality (TQ) or high state-action coverage (SACo).
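
As a rough illustration of the two dataset measures named above, the sketch below computes them for a small tabular dataset. It assumes that TQ is the average trajectory return normalized between a minimum and a maximum reference return, and that SACo is the number of unique state-action pairs relative to a reference dataset. The function names, the `(state, action, reward)` tuple layout, and the exact normalization are assumptions made for illustration, not the authors' implementation.

```python
from typing import Hashable, Sequence, Tuple

# Hypothetical layout: a transition is (state, action, reward),
# a trajectory is a sequence of transitions, a dataset is a sequence of trajectories.
Transition = Tuple[Hashable, Hashable, float]
Trajectory = Sequence[Transition]


def average_return(dataset: Sequence[Trajectory]) -> float:
    """Mean undiscounted return over all trajectories in the dataset."""
    returns = [sum(reward for _, _, reward in traj) for traj in dataset]
    return sum(returns) / len(returns)


def trajectory_quality(dataset: Sequence[Trajectory],
                       min_return: float,
                       max_return: float) -> float:
    """Average return rescaled by assumed reference returns (min -> 0, max -> 1)."""
    return (average_return(dataset) - min_return) / (max_return - min_return)


def unique_state_action_pairs(dataset: Sequence[Trajectory]) -> int:
    """Count distinct (state, action) pairs; states and actions must be hashable."""
    return len({(state, action) for traj in dataset for state, action, _ in traj})


def state_action_coverage(dataset: Sequence[Trajectory],
                          reference: Sequence[Trajectory]) -> float:
    """Unique pairs in the dataset relative to an assumed reference dataset."""
    return unique_state_action_pairs(dataset) / unique_state_action_pairs(reference)
```

For continuous state or action spaces, counting unique pairs would first require some discretization, which this sketch does not attempt.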