Search Results for author: Dong Zheng

Found 12 papers, 7 papers with code

KuaiSim: A Comprehensive Simulator for Recommender Systems

1 code implementation • NeurIPS 2023 • Kesen Zhao, Shuchang Liu, Qingpeng Cai, Xiangyu Zhao, Ziru Liu, Dong Zheng, Peng Jiang, Kun Gai

For each task, KuaiSim also provides evaluation protocols and baseline recommendation algorithms that further serve as benchmarks for future research.

Reinforcement Learning (RL) • Sequential Recommendation

State Regularized Policy Optimization on Data with Dynamics Shift

no code implementations • NeurIPS 2023 • Zhenghai Xue, Qingpeng Cai, Shuchang Liu, Dong Zheng, Peng Jiang, Kun Gai, Bo An

Data with dynamics shift are separated according to their environment parameters to train the corresponding policy.

Offline RL • Reinforcement Learning (RL)

Generative Flow Network for Listwise Recommendation

1 code implementation • 4 Jun 2023 • Shuchang Liu, Qingpeng Cai, Zhankui He, Bowen Sun, Julian McAuley, Dong Zheng, Peng Jiang, Kun Gai

In this work, we aim to learn a policy that can generate sufficiently diverse item lists for users while maintaining high recommendation quality.

Recommendation Systems

Multi-Task Recommendations with Reinforcement Learning

1 code implementation • 7 Feb 2023 • Ziru Liu, Jiejie Tian, Qingpeng Cai, Xiangyu Zhao, Jingtong Gao, Shuchang Liu, Dayou Chen, Tonghao He, Dong Zheng, Peng Jiang, Kun Gai

To be specific, the RMTL structure can address the two aforementioned issues by (i) constructing an MTL environment from session-wise interactions, (ii) training a multi-task actor-critic network structure, which is compatible with most existing MTL-based recommendation models, and (iii) optimizing and fine-tuning the MTL loss function using the weights generated by critic networks.

Multi-Task Learning • Recommendation Systems • +2

Exploration and Regularization of the Latent Action Space in Recommendation

1 code implementation • 7 Feb 2023 • Shuchang Liu, Qingpeng Cai, Bowen Sun, Yuhao Wang, Ji Jiang, Dong Zheng, Kun Gai, Peng Jiang, Xiangyu Zhao, Yongfeng Zhang

To overcome this challenge, we propose a hyper-actor and critic learning framework where the policy decomposes the item list generation process into a hyper-action inference step and an effect-action selection step.

Recommendation Systems

Reinforcing User Retention in a Billion Scale Short Video Recommender System

no code implementations • 3 Feb 2023 • Qingpeng Cai, Shuchang Liu, Xueliang Wang, Tianyou Zuo, Wentao Xie, Bin Yang, Dong Zheng, Peng Jiang, Kun Gai

In this paper, we choose reinforcement learning methods to optimize the retention as they are designed to maximize the long-term performance.

Recommendation Systems • reinforcement-learning • +1

Two-Stage Constrained Actor-Critic for Short Video Recommendation

1 code implementation • 3 Feb 2023 • Qingpeng Cai, Zhenghai Xue, Chi Zhang, Wanqi Xue, Shuchang Liu, Ruohan Zhan, Xueliang Wang, Tianyou Zuo, Wentao Xie, Dong Zheng, Peng Jiang, Kun Gai

On the one hand, the platforms aim to optimize the users' cumulative watch time (main goal) in the long term, which can be done effectively with reinforcement learning.

Recommendation Systems • reinforcement-learning • +2

PrefRec: Recommender Systems with Human Preferences for Reinforcing Long-term User Engagement

1 code implementation • 6 Dec 2022 • Wanqi Xue, Qingpeng Cai, Zhenghai Xue, Shuo Sun, Shuchang Liu, Dong Zheng, Peng Jiang, Kun Gai, Bo An

Though promising, the application of RL relies heavily on well-designed rewards, and designing rewards related to long-term user engagement is quite difficult.

Recommendation Systems • Reinforcement Learning (RL)

Deconfounding Duration Bias in Watch-time Prediction for Video Recommendation

no code implementations • 13 Jun 2022 • Ruohan Zhan, Changhua Pei, Qiang Su, Jianfeng Wen, Xueliang Wang, Guanyu Mu, Dong Zheng, Peng Jiang

We employ a causal graph illuminating that duration is a confounding factor that concurrently affects video exposure and watch-time prediction -- the first effect on video causes the bias issue and should be eliminated, while the second effect on watch time originates from video intrinsic characteristics and should be preserved.

ResAct: Reinforcing Long-term Engagement in Sequential Recommendation with Residual Actor

1 code implementation • 1 Jun 2022 • Wanqi Xue, Qingpeng Cai, Ruohan Zhan, Dong Zheng, Peng Jiang, Kun Gai, Bo An

Meanwhile, reinforcement learning (RL) is widely regarded as a promising framework for optimizing long-term engagement in sequential recommendation.

Reinforcement Learning (RL) • Sequential Recommendation

Constrained Reinforcement Learning for Short Video Recommendation

no code implementations • 26 May 2022 • Qingpeng Cai, Ruohan Zhan, Chi Zhang, Jie Zheng, Guangwei Ding, Pinghua Gong, Dong Zheng, Peng Jiang

In this paper, we formulate the problem of short video recommendation as a constrained Markov Decision Process (MDP), where platforms want to optimize the main goal of user watch time in the long term, with the constraint of accommodating the auxiliary responses of user interactions such as sharing/downloading videos.

Recommendation Systems • reinforcement-learning • +1

PASTO: Strategic Parameter Optimization in Recommendation Systems -- Probabilistic is Better than Deterministic

no code implementations • 20 Aug 2021 • Weicong Ding, Hanlin Tang, Jingshuo Feng, Lei Yuan, Sen Yang, Guangxu Yang, Jie Zheng, Jing Wang, Qiang Su, Dong Zheng, Xuezhong Qiu, Yongqi Liu, Yuxuan Chen, Yang Liu, Chao Song, Dongying Kong, Kai Ren, Peng Jiang, Qiao Lian, Ji Liu

In this setting with multiple and constrained goals, this paper discovers that a probabilistic strategic parameter regime can achieve better value compared to the standard regime of finding a single deterministic parameter.

Recommendation Systems
