no code implementations • 3 Nov 2022 • Gal Leibovich, Guy Jacob, Or Avner, Gal Novik, Aviv Tamar
The key challenge is a $\textit{distribution shift}$ between the desired outputs and the outputs of an initial random guess, and we prove that iterative inversion can steer the learning correctly, under rather strict conditions on the function.
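The iterative-inversion idea above can be illustrated on a toy problem. Below is a hypothetical 1-D sketch (not the paper's implementation): to invert a forward map $f$, we repeatedly feed the desired outputs through the current inverse estimate, observe what $f$ actually produces, and refit the inverse on those observed pairs. The observed outputs start far from the desired ones (the distribution shift) and are steered toward them over iterations; the linear $f$ here is an assumption chosen so least squares suffices.

```python
import numpy as np

# Toy illustration of iterative inversion (hypothetical, not the paper's setup):
# invert f by alternating (1) propose inputs x = g(z_desired) with the current
# inverse estimate g, (2) observe y = f(x), (3) refit g on the pairs (y, x).

rng = np.random.default_rng(0)

def f(x):
    return 2.0 * x + 1.0                     # unknown forward map to invert

z_desired = rng.uniform(-1.0, 1.0, size=256)  # outputs we want to achieve

# Parameterize the inverse estimate as g(z) = a * z + b.
a, b = rng.normal(), rng.normal()             # random initial guess

for _ in range(20):
    x = a * z_desired + b                     # inputs proposed by current g
    y = f(x)                                  # achieved outputs (shifted from z_desired)
    # Refit g by least squares on (y, x): the next inverse maps
    # achieved outputs back to the inputs that produced them.
    A = np.stack([y, np.ones_like(y)], axis=1)
    (a, b), *_ = np.linalg.lstsq(A, x, rcond=None)

# g now approximates the true inverse f^{-1}(z) = (z - 1) / 2,
# i.e. a ≈ 0.5 and b ≈ -0.5.
print(round(a, 3), round(b, 3))
```

Because $f$ is linear here, a single refit already recovers the exact inverse; the repeated loop matters in the general nonlinear case, where the paper's conditions on the function govern whether the iteration converges.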
no code implementations • ICML 2020 • Tom Jurgenson, Or Avner, Edward Groshev, Aviv Tamar
Reinforcement learning (RL), building on Bellman's optimality equation, naturally optimizes for a single goal, yet can be made multi-goal by augmenting the state with the goal.
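The goal-augmentation trick mentioned above can be sketched concretely. The following is a minimal tabular example (my own illustration, not the paper's method): a Q-function on a 1-D chain is indexed by the pair (state, goal), so the standard Bellman backup trains one policy that reaches whichever goal is supplied. The chain size, reward scheme, and exploration rate are all assumptions chosen for brevity.

```python
import numpy as np

# Goal-conditioned tabular Q-learning on a 1-D chain (illustrative sketch).
# The "state" seen by the Q-function is the pair (position, goal): this is
# the augmentation that turns single-goal RL into multi-goal RL.

N = 5                               # chain positions 0..N-1
Q = np.zeros((N, N, 2))             # Q[position, goal, action]; actions: 0=left, 1=right
rng = np.random.default_rng(0)
gamma, alpha, eps = 0.9, 0.5, 0.2

for _ in range(5000):
    g = rng.integers(N)             # sample a goal for this episode
    s = rng.integers(N)             # sample a start position
    for _ in range(2 * N):
        # Epsilon-greedy action selection on the goal-augmented Q-table.
        a = rng.integers(2) if rng.random() < eps else int(np.argmax(Q[s, g]))
        s2 = max(0, min(N - 1, s + (1 if a == 1 else -1)))
        r, done = (1.0, True) if s2 == g else (0.0, False)
        # Standard Bellman backup; the goal g is just part of the state index.
        target = r if done else r + gamma * np.max(Q[s2, g])
        Q[s, g, a] += alpha * (target - Q[s, g, a])
        s = s2
        if done:
            break

# The learned policy moves right when the goal lies to the right of the
# current position, and left otherwise.
print(int(np.argmax(Q[0, 4])), int(np.argmax(Q[4, 0])))
```

The design point is that nothing in the update rule changes: augmenting the state index with the goal is enough for one Q-table (or network) to represent a family of single-goal policies.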