Search Results for author: Xiong-Hui Chen

Found 11 papers, 4 papers with code

Knowledgeable Agents by Offline Reinforcement Learning from Large Language Model Rollouts

no code implementations • 14 Apr 2024 • Jing-Cheng Pang, Si-Hang Yang, Kaiyuan Li, Jiaji Zhang, Xiong-Hui Chen, Nan Tang, Yang Yu

Furthermore, KALM effectively enables the LLM to comprehend environmental dynamics, resulting in the generation of meaningful imaginary rollouts that reflect novel skills and demonstrate the seamless integration of large language models and reinforcement learning.

Language Modelling Large Language Model +2

Imitator Learning: Achieve Out-of-the-Box Imitation Ability in Variable Environments

no code implementations • 9 Oct 2023 • Xiong-Hui Chen, Junyin Ye, Hang Zhao, Yi-Chen Li, Haoran Shi, Yu-Yan Xu, Zhihao Ye, Si-Hang Yang, Anqi Huang, Kai Xu, Zongzhang Zhang, Yang Yu

In this work, we focus on imitator learning based on only one expert demonstration.

Imitation Learning

Language Model Self-improvement by Reinforcement Learning Contemplation

no code implementations • 23 May 2023 • Jing-Cheng Pang, Pengyuan Wang, Kaiyuan Li, Xiong-Hui Chen, Jiacheng Xu, Zongzhang Zhang, Yang Yu

We demonstrate that SIRLC can be applied to various NLP tasks, such as reasoning problems, text generation, and machine translation.

Language Modelling Machine Translation +3

Sim2Rec: A Simulator-based Decision-making Approach to Optimize Real-World Long-term User Engagement in Sequential Recommender Systems

1 code implementation • 3 May 2023 • Xiong-Hui Chen, Bowei He, Yang Yu, Qingyang Li, Zhiwei Qin, Wenjie Shang, Jieping Ye, Chen Ma

However, building a user simulator with no reality gap, i.e., one that can predict users' feedback exactly, is unrealistic because users' reaction patterns are complex and historical logs for each user are limited, which might mislead the simulator-based recommendation policy.

Decision Making Recommendation Systems +1

A Survey on Model-based Reinforcement Learning

no code implementations • 19 Jun 2022 • Fan-Ming Luo, Tian Xu, Hang Lai, Xiong-Hui Chen, Weinan Zhang, Yang Yu

In this survey, we review MBRL with a focus on recent progress in deep RL.

Decision Making Model-based Reinforcement Learning +3

Offline Reinforcement Learning with Causal Structured World Models

no code implementations • 3 Jun 2022 • Zheng-Mao Zhu, Xiong-Hui Chen, Hong-Long Tian, Kun Zhang, Yang Yu

Model-based methods have recently shown promise for offline reinforcement learning (RL), which aims to learn good policies from historical data without interacting with the environment.

Model-based Reinforcement Learning Offline RL +2

Offline Model-based Adaptable Policy Learning

1 code implementation • NeurIPS 2021 • Xiong-Hui Chen, Yang Yu, Qingyang Li, Fan-Ming Luo, Zhiwei Qin, Wenjie Shang, Jieping Ye

Current offline reinforcement learning methods commonly learn in the policy space constrained to in-support regions by the offline dataset, in order to ensure the robustness of the outcome policies.

Decision Making reinforcement-learning +1

Cross-modal Domain Adaptation for Cost-Efficient Visual Reinforcement Learning

1 code implementation • NeurIPS 2021 • Xiong-Hui Chen, Shengyi Jiang, Feng Xu, Zongzhang Zhang, Yang Yu

Experiments on MuJoCo and Hand Manipulation Suite tasks show that agents deployed with our method achieve performance similar to that in the source domain, while those deployed with previous methods designed for same-modal domain adaptation suffer a larger performance gap.

Domain Adaptation reinforcement-learning +1

Offline Adaptive Policy Leaning in Real-World Sequential Recommendation Systems

no code implementations • 1 Jan 2021 • Xiong-Hui Chen, Yang Yu, Qingyang Li, Zhiwei Tony Qin, Wenjie Shang, Yiping Meng, Jieping Ye

Instead of increasing the fidelity of models for policy learning, we handle the distortion issue via learning to adapt to diverse simulators generated by the offline dataset.

Sequential Recommendation

Cross-Modal Domain Adaptation for Reinforcement Learning

1 code implementation • 1 Jan 2021 • Xiong-Hui Chen, Shengyi Jiang, Feng Xu, Yang Yu

Domain adaptation is a promising direction for deploying RL agents in real-world applications, where vision-based robotics tasks constitute an important part.

Domain Adaptation reinforcement-learning +1

Deep exploration by novelty-pursuit with maximum state entropy

no code implementations • 25 Sep 2019 • Zi-Niu Li, Xiong-Hui Chen, Yang Yu

Efficient exploration is essential to reinforcement learning in huge state spaces.

Efficient Exploration
