no code implementations • 25 Jun 2024 • Mahdi Al-Husseini, Kyle Wray, Mykel Kochenderfer
Initial attack surveillance and suppression models have linked action spaces and objectives, making their optimization computationally challenging.
1 code implementation • 14 Feb 2024 • Harrison Delecki, Marcell Vazquez-Chanlatte, Esen Yel, Kyle Wray, Tomer Arnon, Stefan Witwicki, Mykel J. Kochenderfer
However, model-based planners may be brittle under these types of uncertainty because they rely on an exact model and tend to commit to a single optimal behavior.
1 code implementation • 6 Jan 2024 • Ava Pettet, Yunuo Zhang, Baiting Luo, Kyle Wray, Hendrik Baier, Aron Laszka, Abhishek Dubey, Ayan Mukhopadhyay
In this paper, we introduce Policy-Augmented Monte Carlo tree search (PA-MCTS), which combines action-value estimates from an out-of-date policy with an online search using an up-to-date model of the environment.
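As a rough illustration of what combining the two sources of action-value estimates could look like, the sketch below blends stale policy values with fresh search values and picks the best action. The convex weighting by `alpha`, the function names, and the dictionary representation are assumptions made for illustration; the snippet above only states that the two estimates are combined, not how.

```python
from typing import Dict, Hashable


def combine_action_values(
    stale_q: Dict[Hashable, float],
    search_q: Dict[Hashable, float],
    alpha: float = 0.5,
) -> Dict[Hashable, float]:
    """Blend two action-value estimates for a single state.

    stale_q:  values from the out-of-date policy (hypothetical input format).
    search_q: values from the online search with the current model.
    alpha:    assumed mixing weight; 1.0 trusts only the stale policy,
              0.0 trusts only the search.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    # Only actions scored by both sources can be blended.
    shared = stale_q.keys() & search_q.keys()
    if not shared:
        raise ValueError("no actions are scored by both estimates")
    return {a: alpha * stale_q[a] + (1.0 - alpha) * search_q[a] for a in shared}


def select_action(
    stale_q: Dict[Hashable, float],
    search_q: Dict[Hashable, float],
    alpha: float = 0.5,
) -> Hashable:
    """Return the action with the highest blended value."""
    combined = combine_action_values(stale_q, search_q, alpha)
    return max(combined, key=combined.get)
```

With `alpha` near 1 the choice follows the stale policy, and with `alpha` near 0 it follows the online search, so the weight acts as a dial for how much the environment is believed to have changed.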
no code implementations • 23 Oct 2023 • Rachel Freedman, Justin Svegliato, Kyle Wray, Stuart Russell
The HUB framework and ATS algorithm demonstrate the importance of leveraging differences between teachers to learn accurate reward models, facilitating future research on active teacher selection for robust reward modeling.