1 code implementation • ICLR 2020 • Sam Lobel*, Chunyuan Li*, Jianfeng Gao, Lawrence Carin
We investigate new methods for training collaborative filtering models based on actor-critic reinforcement learning, to more directly maximize ranking-based objective functions.