1 code implementation • 14 May 2023 • Marcus Hoerger, Hanna Kurniawati, Dirk Kroese, Nan Ye
At each planning step, our method uses a novel lazy Cross-Entropy method to search the space of policy trees, which provide a simple policy representation.
1 code implementation • 13 May 2023 • Lachlan Gibson, Marcus Hoerger, Dirk Kroese
Solving decision problems in complex, stochastic environments is often achieved by estimating the expected outcome of decisions via Monte Carlo sampling.
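As a minimal sketch of the idea in this excerpt (not the paper's method), the expected outcome of a decision can be estimated by averaging many simulated outcomes. The names `estimate_expected_outcome` and `noisy_payoff`, and the Gaussian payoff model, are hypothetical and chosen only for illustration.

```python
import random


def estimate_expected_outcome(simulate, decision, n_samples=10_000, seed=0):
    """Plain Monte Carlo estimate: the mean of n_samples simulated outcomes."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    total = 0.0
    for _ in range(n_samples):
        total += simulate(decision, rng)
    return total / n_samples


def noisy_payoff(decision, rng):
    """Hypothetical stochastic environment: a mean payoff plus Gaussian noise."""
    means = {"a": 1.0, "b": 1.5}  # assumed values, for illustration only
    return rng.gauss(means[decision], 1.0)


# Choose the decision with the highest estimated expected outcome.
best = max(["a", "b"], key=lambda d: estimate_expected_outcome(noisy_payoff, d))
```

The standard error of such an estimate shrinks in proportion to 1/sqrt(n_samples), which is why a large sample budget is often needed when decisions have similar expected outcomes.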
1 code implementation • 21 Feb 2023 • Marcus Hoerger, Hanna Kurniawati, Dirk Kroese, Nan Ye
ADVT uses the estimated diameters of the cells to form an upper-confidence bound on the action value function within the cell, guiding the Monte Carlo Tree Search expansion and further discretization of the action space.
1 code implementation • 13 Sep 2022 • Marcus Hoerger, Hanna Kurniawati, Dirk Kroese, Nan Ye
A Voronoi tree is a Binary Space Partitioning (BSP) that implicitly maintains the partition of a cell as the Voronoi diagram of two points sampled from the cell.
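The structure described in this excerpt can be sketched as follows, under stated assumptions: each node stores one representative point, a split attaches two child points, and a query descends to whichever child's point is nearer, so cell boundaries are never stored explicitly. The class `VoronoiNode` and its methods are hypothetical names, and the caller supplies the two points here, whereas the paper samples them from the cell.

```python
import math


class VoronoiNode:
    """A cell of a binary space partition, identified by a representative point."""

    def __init__(self, representative):
        self.representative = representative
        self.children = None  # None for a leaf, otherwise a (left, right) pair

    def split(self, p, q):
        """Partition this cell into the two-point Voronoi diagram of p and q."""
        self.children = (VoronoiNode(p), VoronoiNode(q))
        return self.children

    def locate(self, x):
        """Return the leaf cell containing x by following the nearer child."""
        node = self
        while node.children is not None:
            left, right = node.children
            d_left = math.dist(x, left.representative)
            d_right = math.dist(x, right.representative)
            node = left if d_left <= d_right else right
        return node


# Usage: split the unit square twice and look up the leaf cell of a query point.
root = VoronoiNode((0.5, 0.5))
left, right = root.split((0.2, 0.5), (0.8, 0.5))
left.split((0.2, 0.2), (0.2, 0.8))
leaf = root.locate((0.1, 0.9))
```

Because membership is decided by distance comparisons alone, the partition works in any dimension without computing cell geometry.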
no code implementations • 4 Nov 2020 • Marcus Hoerger, Hanna Kurniawati
Most on-line solvers rely on discretising the observation space or artificially limiting the number of observations that are considered during planning to compute tractable policies.