no code implementations • 16 Sep 2022 • Diana M. Negoescu, Pasha Khosravi, Shadow Zhao, Nanyu Chen, Parvez Ahammad, Humberto Gonzalez
This opens questions regarding not only which decision-making policies would perform best in practice, but also regarding the impact of different data collection protocols on the performance of various policies trained on the data, or the robustness of policy performance with respect to changes in problem characteristics such as action- or reward- specific delays in observing outcomes.