Search Results for author: Zhenghai Xue

Found 9 papers, 4 papers with code

AgentStudio: A Toolkit for Building General Virtual Agents

no code implementations • 26 Mar 2024 • Longtao Zheng, Zhiyuan Huang, Zhenghai Xue, Xinrun Wang, Bo An, Shuicheng Yan

We have open-sourced the environments, datasets, benchmarks, and interfaces to promote research towards developing general virtual agents for the future.

Visual Grounding

AdaRec: Adaptive Sequential Recommendation for Reinforcing Long-term User Engagement

no code implementations • 6 Oct 2023 • Zhenghai Xue, Qingpeng Cai, Tianyou Zuo, Bin Yang, Lantao Hu, Peng Jiang, Kun Gai, Bo An

One challenge in large-scale online recommendation systems is the constant and complicated changes in users' behavior patterns, such as interaction rates and retention tendencies.

Reinforcement Learning (RL) Sequential Recommendation

A Large Language Model Enhanced Conversational Recommender System

no code implementations • 11 Aug 2023 • Yue Feng, Shuchang Liu, Zhenghai Xue, Qingpeng Cai, Lantao Hu, Peng Jiang, Kun Gai, Fei Sun

For response generation, we utilize the generation ability of LLM as a language interface to better interact with users.

Language Modelling Large Language Model +2

State Regularized Policy Optimization on Data with Dynamics Shift

no code implementations • NeurIPS 2023 • Zhenghai Xue, Qingpeng Cai, Shuchang Liu, Dong Zheng, Peng Jiang, Kun Gai, Bo An

Data with dynamics shift are separated according to their environment parameters to train the corresponding policy.

Offline RL Reinforcement Learning (RL)

Guarded Policy Optimization with Imperfect Online Demonstrations

no code implementations • 3 Mar 2023 • Zhenghai Xue, Zhenghao Peng, Quanyi Li, Zhihan Liu, Bolei Zhou

Assumed to be optimal, the teacher policy has the perfect timing and capability to intervene in the learning process of the student agent, providing a safety guarantee and exploration guidance.

Continuous Control Efficient Exploration +2

Two-Stage Constrained Actor-Critic for Short Video Recommendation

1 code implementation • 3 Feb 2023 • Qingpeng Cai, Zhenghai Xue, Chi Zhang, Wanqi Xue, Shuchang Liu, Ruohan Zhan, Xueliang Wang, Tianyou Zuo, Wentao Xie, Dong Zheng, Peng Jiang, Kun Gai

On the one hand, the platform aims to optimize users' cumulative watch time (the main goal) in the long term, which can be done effectively with Reinforcement Learning.

Recommendation Systems reinforcement-learning +2

PrefRec: Recommender Systems with Human Preferences for Reinforcing Long-term User Engagement

1 code implementation • 6 Dec 2022 • Wanqi Xue, Qingpeng Cai, Zhenghai Xue, Shuo Sun, Shuchang Liu, Dong Zheng, Peng Jiang, Kun Gai, Bo An

Though promising, the application of RL relies heavily on well-designed rewards, and designing rewards related to long-term user engagement is quite difficult.

Recommendation Systems Reinforcement Learning (RL)

MetaDrive: Composing Diverse Driving Scenarios for Generalizable Reinforcement Learning

2 code implementations • 26 Sep 2021 • Quanyi Li, Zhenghao Peng, Lan Feng, Qihang Zhang, Zhenghai Xue, Bolei Zhou

Based on MetaDrive, we construct a variety of RL tasks and baselines in both single-agent and multi-agent settings, including benchmarking generalizability across unseen scenes, safe exploration, and learning multi-agent traffic.

Benchmarking Decision Making +5

Regret Minimization Experience Replay in Off-Policy Reinforcement Learning

1 code implementation • NeurIPS 2021 • Xu-Hui Liu, Zhenghai Xue, Jing-Cheng Pang, Shengyi Jiang, Feng Xu, Yang Yu

In reinforcement learning, experience replay stores past samples for further reuse.

reinforcement-learning Reinforcement Learning (RL)

