no code implementations • ICML 2020 • Elad Sarafian, Mor Sinay, Yoram Louzoun, Noa Agmon, Sarit Kraus
We prove the convergence of EGL to a stationary point and its robustness in the optimization of integrable functions.
2 code implementations • 29 Apr 2023 • Gideon Freund, Elad Sarafian, Sarit Kraus
In reinforcement learning and imitation learning, an object of central importance is the state distribution induced by the policy.
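As an illustration of what "the state distribution induced by the policy" means, here is a minimal sketch that computes the discounted state distribution of a tiny Markov chain. It is not taken from the paper: the two-state transition matrix `P`, the start distribution `mu0`, the discount `gamma`, and the function name are all hypothetical, and the policy is assumed to be already folded into `P`.

```python
def discounted_state_distribution(P, mu0, gamma, horizon=1000):
    """Approximate d(s) = (1 - gamma) * sum_t gamma^t * Pr(s_t = s).

    P[s][t] is the probability of moving from state s to state t under a
    fixed policy (hypothetical toy values below); mu0 is the start distribution.
    """
    n = len(mu0)
    d = [0.0] * n
    mu = list(mu0)            # state distribution at the current time step
    weight = 1.0 - gamma      # (1 - gamma) * gamma^t, starting at t = 0
    for _ in range(horizon):
        for s in range(n):
            d[s] += weight * mu[s]
        # Propagate one step: mu_{t+1}(t) = sum_s mu_t(s) * P[s][t]
        mu = [sum(mu[s] * P[s][t] for s in range(n)) for t in range(n)]
        weight *= gamma
    return d


# Toy example (assumed values): the policy mostly stays in state 0.
P = [[0.9, 0.1],
     [0.5, 0.5]]
mu0 = [1.0, 0.0]
d = discounted_state_distribution(P, mu0, gamma=0.9)
print(d)
```

Changing the policy changes `P`, and therefore `d`, which is why this distribution is tied to the policy rather than to the environment alone.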
1 code implementation • 12 Jun 2021 • Shai Keynan, Elad Sarafian, Sarit Kraus
In particular, the input of the Q-function is both the state and the action, and in multi-task problems (Meta-RL) the policy can take a state and a context.
no code implementations • 9 Jun 2020 • Mor Sinay, Elad Sarafian, Yoram Louzoun, Noa Agmon, Sarit Kraus
Instead of fitting the function, EGL trains a NN to estimate the objective gradient directly.
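The idea of estimating the gradient directly, rather than fitting the function, can be sketched in a heavily simplified form. The sketch below is an assumption-laden toy, not the EGL algorithm: the neural network is replaced by a single constant vector `g`, the probes are axis-aligned unit directions, and the names `fit_gradient`, `eps`, `steps`, and `lr` are hypothetical. It trains `g` by gradient descent so that `g · u` matches the observed change `(f(x + eps*u) - f(x)) / eps` for every probe direction `u`.

```python
def fit_gradient(f, x, eps=0.1, steps=200, lr=0.5):
    """Fit a vector g so that g . u ~ (f(x + eps*u) - f(x)) / eps.

    Toy stand-in for a learned gradient model: g is a constant vector,
    trained by full-batch gradient descent on a squared error.
    """
    n = len(x)
    # Probe directions: +/- each coordinate axis (an assumption of this sketch).
    probes = []
    for i in range(n):
        for sign in (1.0, -1.0):
            u = [0.0] * n
            u[i] = sign
            probes.append(u)
    fx = f(x)
    targets = [
        (f([xi + eps * ui for xi, ui in zip(x, u)]) - fx) / eps
        for u in probes
    ]
    g = [0.0] * n
    for _ in range(steps):
        grad = [0.0] * n
        for u, t in zip(probes, targets):
            residual = sum(gi * ui for gi, ui in zip(g, u)) - t
            for i in range(n):
                grad[i] += 2.0 * residual * u[i] / len(probes)
        g = [gi - lr * gr for gi, gr in zip(g, grad)]
    return g


# Toy objective (assumed): f(x) = x0^2 + 3*x1, true gradient at (1, 2) is (2, 3).
f = lambda x: x[0] ** 2 + 3.0 * x[1]
g = fit_gradient(f, [1.0, 2.0])
print(g)
```

Only function values are used during fitting, so the same loop applies to black-box objectives where no analytic gradient is available.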
no code implementations • 27 Sep 2018 • Elad Sarafian, Aviv Tamar, Sarit Kraus
The primary advantages of our approach, termed Rerouted Behavior Improvement (RBI), over other safe learning methods are its stability in the presence of value estimation errors and the elimination of a policy search process.
1 code implementation • 20 May 2018 • Elad Sarafian, Aviv Tamar, Sarit Kraus
To minimize the improvement penalty, the RBI idea is to attenuate rapid policy changes for low-probability actions, which were sampled less frequently.
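One way to picture this attenuation is a multiplicative bound on how far the new policy may move from the behavior policy. The sketch below is an assumption, not a statement of the paper's exact procedure: the function name `reroute`, the bounds `c_min` and `c_max`, and the toy numbers are hypothetical. Each action's new probability is kept within `[c_min * beta, c_max * beta]`, so a rarely sampled action can only change by a small absolute amount, and the remaining probability mass is handed out greedily to the actions with the highest estimated advantage.

```python
def reroute(beta, advantages, c_min=0.5, c_max=1.5):
    """Improve on behavior policy beta under c_min*beta <= pi <= c_max*beta.

    Assumes c_min <= 1 <= c_max so that a valid distribution exists.
    """
    # Start every action at its lower bound.
    pi = [c_min * b for b in beta]
    remaining = 1.0 - sum(pi)
    # Hand out the leftover mass, best estimated advantage first.
    order = sorted(range(len(beta)), key=lambda a: advantages[a], reverse=True)
    for a in order:
        add = min(remaining, (c_max - c_min) * beta[a])
        pi[a] += add
        remaining -= add
    return pi


# Toy example (assumed values): action 2 looks best but was rarely sampled.
beta = [0.7, 0.2, 0.1]
advantages = [0.0, 0.1, 1.0]
pi = reroute(beta, advantages)
print(pi)
```

In this toy case the rarely sampled action 2 rises only from 0.1 to about 0.15 despite its large estimated advantage, which limits the damage if that estimate is wrong.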