1 code implementation • 5 Jun 2023 • Sam Lobel, Akhil Bagaria, George Konidaris
We propose a new method for count-based exploration in high-dimensional state spaces.
no code implementations • 23 Oct 2021 • Omer Gottesman, Kavosh Asadi, Cameron Allen, Sam Lobel, George Konidaris, Michael Littman
We propose a new coarse-grained smoothness definition that generalizes the notion of Lipschitz continuity, is more widely applicable, and allows us to compute significantly tighter bounds on Q-functions, leading to improved learning.
1 code implementation • 10 Jun 2019 • Sam Lobel, Chunyuan Li, Jianfeng Gao, Lawrence Carin
In this paper we investigate new methods for training collaborative filtering models based on actor-critic reinforcement learning, to directly optimize the non-differentiable quality metrics of interest.
Ranked #4 on Recommendation Systems on Million Song Dataset