no code implementations • 6 Mar 2020 • P Sharoff, Nishant A. Mehta, Ravi Ganti
We consider a sequential decision-making problem where an agent can take one action at a time and each action has a stochastic temporal extent, i. e., a new action cannot be taken until the previous one is finished.