Search Results for author: Jiaji Zhang

Found 4 papers, 3 papers with code

Knowledgeable Agents by Offline Reinforcement Learning from Large Language Model Rollouts

no code implementations • 14 Apr 2024 • Jing-Cheng Pang, Si-Hang Yang, Kaiyuan Li, Jiaji Zhang, Xiong-Hui Chen, Nan Tang, Yang Yu

Furthermore, KALM effectively enables the LLM to comprehend environmental dynamics, resulting in the generation of meaningful imaginary rollouts that reflect novel skills and demonstrate the seamless integration of large language models and reinforcement learning.

Language Modelling, Large Language Model, +2 more

Episodic Return Decomposition by Difference of Implicitly Assigned Sub-Trajectory Reward

1 code implementation • 17 Dec 2023 • Haoxin Lin, Hongqiu Wu, Jiaji Zhang, Yihao Sun, Junyin Ye, Yang Yu

Real-world decision-making problems are usually accompanied by delayed rewards, which affect the sample efficiency of Reinforcement Learning, especially in the extremely delayed case where the only feedback is the episodic reward obtained at the end of an episode.

Decision Making

Model-Bellman Inconsistency for Model-based Offline Reinforcement Learning

2 code implementations • PMLR 2023 • Yihao Sun, Jiaji Zhang, Chengxing Jia, Haoxin Lin, Junyin Ye, Yang Yu

MOBILE conducts uncertainty quantification through the inconsistency of Bellman estimations under an ensemble of learned dynamics models, which can be a better approximator to the true Bellman error, and penalizes the Bellman estimation based on this uncertainty.

D4RL, Offline RL, +3 more
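
The uncertainty penalty described in the MOBILE summary above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `penalized_bellman_target`, the parameters `gamma` and `beta`, and the use of a population standard deviation as the inconsistency measure are all assumptions made for illustration. Each ensemble member is assumed to supply its own estimate of the next-state value.

```python
from statistics import fmean, pstdev

def penalized_bellman_target(reward, next_q_per_model, gamma=0.99, beta=1.0):
    """Hypothetical helper: penalize a Bellman target by ensemble disagreement.

    reward           -- scalar reward for one transition
    next_q_per_model -- next-state value estimates, one per learned dynamics model
    gamma            -- discount factor (assumed value)
    beta             -- penalty coefficient (assumed value)
    """
    # One Bellman estimate per ensemble member.
    estimates = [reward + gamma * q for q in next_q_per_model]
    # Disagreement across the ensemble is used as the uncertainty estimate.
    uncertainty = pstdev(estimates)
    # Subtract the scaled uncertainty from the mean Bellman estimate.
    return fmean(estimates) - beta * uncertainty, uncertainty

target, uncertainty = penalized_bellman_target(1.0, [1.0, 3.0], gamma=0.5, beta=1.0)
```

When every ensemble member agrees, the uncertainty is zero and the target reduces to the ordinary Bellman estimate. The more the models disagree, the more the target is lowered.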

Model-based Reinforcement Learning with Multi-step Plan Value Estimation

1 code implementation • 12 Sep 2022 • Haoxin Lin, Yihao Sun, Jiaji Zhang, Yang Yu

The new model-based reinforcement learning algorithm MPPVE (Model-based Planning Policy Learning with Multi-step Plan Value Estimation) makes better use of the learned model and achieves better sample efficiency than state-of-the-art model-based RL approaches.

Model-based Reinforcement Learning, reinforcement-learning, +1 more
