no code implementations • 17 May 2022 • Andrew Patterson, Victor Liao, Martha White
We start from a formalization of robust losses, then derive sound gradient-based approaches to minimize these losses in both the online off-policy prediction and control settings.
1 code implementation • ICLR 2022 • Claas Voelcker, Victor Liao, Animesh Garg, Amir-Massoud Farahmand
However, they tend to be inferior in practice to commonly used maximum likelihood (MLE) based approaches.
Model-based Reinforcement Learning • reinforcement-learning