no code implementations • 18 Jun 2016 • Naoto Yoshida
In this paper, we generalize formulations from previous research on agent survival and pose the survival problem as maximizing the multi-step survival probability over future time steps.
no code implementations • 4 Dec 2015 • Naoto Yoshida
We propose an effective neural network architecture for approximating an action-value function over binary vector actions.