1 code implementation • 30 Sep 2022 • Siddhartha Banerjee, Sean R. Sinclair, Milind Tambe, Lily Xu, Christina Lee Yu
How best to incorporate historical data to "warm start" bandit algorithms is an open question: naively initializing reward estimates using all historical samples can suffer from spurious data and imbalanced data coverage, leading to computational and storage issues that are particularly salient in continuous action spaces.
1 code implementation • 13 Jul 2022 • Sean R. Sinclair, Felipe Frujeri, Ching-An Cheng, Luke Marshall, Hugo Barbalho, Jingling Li, Jennifer Neville, Ishai Menache, Adith Swaminathan
Many resource management problems require sequential decision-making under uncertainty, where the only uncertainty affecting the decision outcomes comes from exogenous variables outside the control of the decision-maker.
no code implementations • 29 Oct 2021 • Sean R. Sinclair, Siddhartha Banerjee, Christina Lee Yu
In this paper we provide a unified theoretical analysis of tree-based hierarchical partitioning methods for online reinforcement learning, and present both model-free and model-based algorithms.
1 code implementation • NeurIPS 2020 • Sean R. Sinclair, Tianyu Wang, Gauri Jain, Siddhartha Banerjee, Christina Lee Yu
We introduce the technique of adaptive discretization to design an efficient model-based episodic reinforcement learning algorithm in large (potentially continuous) state-action spaces.
1 code implementation • 17 Oct 2019 • Sean R. Sinclair, Siddhartha Banerjee, Christina Lee Yu
We present an efficient algorithm for model-free episodic reinforcement learning on large (potentially continuous) state-action spaces.