1 code implementation • 12 Jun 2021 • Shai Keynan, Elad Sarafian, Sarit Kraus
In particular, the Q-function takes both the state and the action as input, and in multi-task problems (Meta-RL) the policy can take both a state and a context as input.
Reinforcement Learning (RL)
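To make the two input signatures in the excerpt concrete, here is a minimal sketch of a Q-function that takes a state and an action, and a policy that takes a state and a context. Everything in it is an illustrative assumption: the dimensions, the random linear weights, the helper names (`random_matrix`, `matvec`, `q_value`, `policy`), and in particular the choice to combine the two inputs by plain concatenation. It is not the authors' implementation, only one simple way such functions can be wired.

```python
import math
import random

random.seed(0)

# Hypothetical sizes, chosen only for illustration.
STATE_DIM, ACTION_DIM, CONTEXT_DIM = 4, 2, 3


def random_matrix(rows, cols):
    """Return a rows x cols matrix of Gaussian weights as nested lists."""
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a nested-list matrix by a list vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


# Assumed linear models over the concatenated inputs.
W_Q = random_matrix(1, STATE_DIM + ACTION_DIM)
W_PI = random_matrix(ACTION_DIM, STATE_DIM + CONTEXT_DIM)


def q_value(state, action):
    """Q(s, a): scalar value computed from the state and the action."""
    return matvec(W_Q, state + action)[0]  # list + list concatenates


def policy(state, context):
    """pi(s, c): action computed from the state and a task context."""
    return [math.tanh(z) for z in matvec(W_PI, state + context)]


state = [random.gauss(0.0, 1.0) for _ in range(STATE_DIM)]
context = [random.gauss(0.0, 1.0) for _ in range(CONTEXT_DIM)]

action = policy(state, context)  # the context selects task-specific behavior
value = q_value(state, action)   # the critic scores the (state, action) pair
```

In a Meta-RL setting the `context` vector would typically encode which task the agent is facing, so the same `policy` function can produce different actions for the same state.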