Search Results for author: Xiong-Hui Chen

Found 10 papers, 4 papers with code

Language Model Self-improvement by Reinforcement Learning Contemplation

no code implementations 23 May 2023 Jing-Cheng Pang, Pengyuan Wang, Kaiyuan Li, Xiong-Hui Chen, Jiacheng Xu, Zongzhang Zhang, Yang Yu

We demonstrate that SIRLC can be applied to various NLP tasks, such as reasoning problems, text generation, and machine translation.

Language Modelling, Machine Translation, +3

Sim2Rec: A Simulator-based Decision-making Approach to Optimize Real-World Long-term User Engagement in Sequential Recommender Systems

1 code implementation 3 May 2023 Xiong-Hui Chen, Bowei He, Yang Yu, Qingyang Li, Zhiwei Qin, Wenjie Shang, Jieping Ye, Chen Ma

However, building a user simulator with no reality gap, i.e., one that can predict users' feedback exactly, is unrealistic because users' reaction patterns are complex and historical logs for each user are limited, which might mislead the simulator-based recommendation policy.

Decision Making, Recommendation Systems, +1

Offline Reinforcement Learning with Causal Structured World Models

no code implementations 3 Jun 2022 Zheng-Mao Zhu, Xiong-Hui Chen, Hong-Long Tian, Kun Zhang, Yang Yu

Model-based methods have recently shown promise for offline reinforcement learning (RL), which aims to learn good policies from historical data without interacting with the environment.

Model-based Reinforcement Learning, Offline RL, +2

Cross-modal Domain Adaptation for Cost-Efficient Visual Reinforcement Learning

1 code implementation NeurIPS 2021 Xiong-Hui Chen, Shengyi Jiang, Feng Xu, Zongzhang Zhang, Yang Yu

Experiments on MuJoCo and Hand Manipulation Suite tasks show that agents deployed with our method achieve performance similar to that in the source domain, while those deployed with previous methods designed for same-modal domain adaptation suffer a larger performance gap.

Domain Adaptation, reinforcement-learning, +1

Offline Model-based Adaptable Policy Learning

1 code implementation NeurIPS 2021 Xiong-Hui Chen, Yang Yu, Qingyang Li, Fan-Ming Luo, Zhiwei Qin, Wenjie Shang, Jieping Ye

Current offline reinforcement learning methods commonly learn in a policy space constrained to the in-support regions of the offline dataset, in order to ensure the robustness of the resulting policies.

Decision Making, reinforcement-learning, +1

Offline Adaptive Policy Leaning in Real-World Sequential Recommendation Systems

no code implementations 1 Jan 2021 Xiong-Hui Chen, Yang Yu, Qingyang Li, Zhiwei Tony Qin, Wenjie Shang, Yiping Meng, Jieping Ye

Instead of increasing the fidelity of models for policy learning, we handle the distortion issue by learning to adapt to diverse simulators generated from the offline dataset.

Sequential Recommendation

Cross-Modal Domain Adaptation for Reinforcement Learning

1 code implementation 1 Jan 2021 Xiong-Hui Chen, Shengyi Jiang, Feng Xu, Yang Yu

Domain adaptation is a promising direction for deploying RL agents in real-world applications, where vision-based robotics tasks constitute an important part.

Domain Adaptation, reinforcement-learning, +1

Deep exploration by novelty-pursuit with maximum state entropy

no code implementations 25 Sep 2019 Zi-Niu Li, Xiong-Hui Chen, Yang Yu

Efficient exploration is essential to reinforcement learning in huge state spaces.

Efficient Exploration
