no code implementations • ICML 2020 • Tom Jurgenson, Or Avner, Edward Groshev, Aviv Tamar
Reinforcement learning (RL), building on Bellman's optimality equation, naturally optimizes for a single goal, yet can be made multi-goal by augmenting the state with the goal.
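As an illustration of the goal-augmentation idea mentioned in this abstract, the following is a minimal sketch and not the method proposed in the paper. It assumes a hypothetical deterministic 1-D chain of `N` cells, a cost of 1 per step, and plain tabular value iteration; all names (`N`, `ACTIONS`, `step`, `goal_conditioned_value_iteration`, `greedy_action`) are invented for the example. The value table is indexed by the pair (state, goal), so a single Bellman backup covers every goal at once.

```python
# Hypothetical toy setup: a 1-D chain with N cells, not taken from the paper.
N = 5
ACTIONS = (-1, +1)  # move left or move right


def step(s, a):
    """Deterministic transition, clipped to the ends of the chain."""
    return min(max(s + a, 0), N - 1)


def goal_conditioned_value_iteration(n_sweeps=50):
    """Value iteration over the augmented state (s, g).

    Each step costs 1 until the goal is reached, so the optimal
    value is minus the shortest-path length from s to g.
    """
    V = {(s, g): 0.0 for s in range(N) for g in range(N)}
    for _ in range(n_sweeps):
        for g in range(N):
            for s in range(N):
                if s == g:
                    V[(s, g)] = 0.0  # goal reached, no further cost
                else:
                    # Bellman optimality backup on the augmented state
                    V[(s, g)] = max(-1.0 + V[(step(s, a), g)] for a in ACTIONS)
    return V


def greedy_action(V, s, g):
    """Pick the action whose successor has the highest value for goal g."""
    return max(ACTIONS, key=lambda a: V[(step(s, a), g)])


V = goal_conditioned_value_iteration()
```

The same table then serves any goal: `greedy_action(V, 0, 3)` moves right toward cell 3, while `greedy_action(V, 4, 1)` moves left toward cell 1, without re-solving the problem per goal.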
no code implementations • 12 Jun 2019 • Tom Jurgenson, Edward Groshev, Aviv Tamar
In such problems, the way we choose to represent a trajectory underlies algorithms for trajectory prediction and optimization.
no code implementations • 24 Aug 2017 • Edward Groshev, Maxwell Goldstein, Aviv Tamar, Siddharth Srivastava, Pieter Abbeel
We show that a deep neural network can be used to learn and represent a generalized reactive policy (GRP) that maps a problem instance and a state to an action, and that the learned GRPs efficiently solve large classes of challenging problem instances.