no code implementations • 25 Mar 2024 • Titouan Renard, Andreas Schlaginhaufen, Tingting Ni, Maryam Kamgarpour
Furthermore, we prove that with $\mathcal{O}(1/\varepsilon^{4})$ samples, the optimal policy corresponding to the recovered reward is $\varepsilon$-close to the expert policy in total variation distance.
1 code implementation • 1 Jun 2023 • Andreas Schlaginhaufen, Maryam Kamgarpour
Two main challenges in Reinforcement Learning (RL) are designing appropriate reward functions and ensuring the safety of the learned policy.
1 code implementation • NeurIPS 2021 • Andreas Schlaginhaufen, Philippe Wenk, Andreas Krause, Florian Dörfler
To this end, neural ODEs regularized with neural Lyapunov functions are a promising approach when states are fully observed.