1 code implementation • 6 Feb 2024 • Jiafei Lyu, Xiaoteng Ma, Le Wan, Runze Liu, Xiu Li, Zongqing Lu
Offline reinforcement learning (RL) has attracted much attention due to its ability to learn from static offline datasets, eliminating the need to interact with the environment.
no code implementations • 5 Feb 2024 • Jiafei Lyu, Le Wan, Xiu Li, Zongqing Lu
Recently, there have been many efforts to learn useful policies for continuous control in visual reinforcement learning (RL).
1 code implementation • 18 Jan 2024 • Kai Yang, Jian Tao, Jiafei Lyu, Xiu Li
To address this issue, we introduce Distributional RND (DRND), a derivative of Random Network Distillation (RND).
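For context, DRND builds on standard Random Network Distillation, where a trained predictor's error against a frozen, randomly initialized target network serves as an exploration bonus. A minimal sketch of that base RND bonus (the network sizes and `obs` shape are illustrative assumptions, not details from this paper):

```python
import torch
import torch.nn as nn

# Standard RND: a fixed, randomly initialized target network and a trained
# predictor; the predictor's error on a state serves as an exploration bonus.
obs_dim, feat_dim = 8, 64  # illustrative sizes, not from the paper

target = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(), nn.Linear(128, feat_dim))
predictor = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(), nn.Linear(128, feat_dim))
for p in target.parameters():
    p.requires_grad_(False)  # the target network stays fixed

def rnd_bonus(obs: torch.Tensor) -> torch.Tensor:
    """Per-state intrinsic reward: predictor error against the frozen target."""
    with torch.no_grad():
        y = target(obs)
    return (predictor(obs) - y).pow(2).mean(dim=-1)
```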
1 code implementation • 22 Nov 2023 • Kai Yang, Jian Tao, Jiafei Lyu, Chunjiang Ge, Jiaxin Chen, Qimai Li, Weihan Shen, Xiaolong Zhu, Xiu Li
The direct preference optimization (DPO) method, effective in fine-tuning large language models, eliminates the need for a reward model.
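For reference, the published DPO objective amounts to a logistic loss on policy/reference log-ratios, which is why no separate reward model is needed. A minimal sketch on precomputed log-probabilities (tensor names and the `beta` default are illustrative):

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Standard DPO loss (Rafailov et al.): the policy/reference log-ratio on
    preferred vs. rejected responses acts as an implicit reward, so no
    explicit reward model is trained."""
    chosen_ratio = policy_chosen_logps - ref_chosen_logps
    rejected_ratio = policy_rejected_logps - ref_rejected_logps
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()
```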
no code implementations • 23 Oct 2023 • Zhongjian Qiao, Jiafei Lyu, Xiu Li
The primacy bias in deep reinforcement learning (DRL), which refers to the agent's tendency to overfit early data and lose the ability to learn from new data, can significantly decrease the performance of DRL algorithms.
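A widely used mitigation for the primacy bias is periodically resetting the later layers of the agent's networks (Nikishin et al., 2022). A minimal sketch of that standard remedy, not necessarily this paper's method:

```python
import torch.nn as nn

def reset_last_layers(net: nn.Sequential, n_last: int = 2) -> None:
    """Re-initialize the last n_last Linear layers so the agent can re-fit
    to fresh data instead of staying stuck on early experience."""
    linear_layers = [m for m in net if isinstance(m, nn.Linear)]
    for layer in linear_layers[-n_last:]:
        layer.reset_parameters()
```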
no code implementations • 6 Jun 2023 • Runze Liu, Yali Du, Fengshuo Bai, Jiafei Lyu, Xiu Li
In this paper, we propose a novel zero-shot preference-based RL algorithm that leverages labeled preference data from source tasks to infer labels for target tasks, eliminating the requirement for human queries.
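Preference-based RL commonly models labels with a Bradley-Terry reward model, under which a (soft) preference label for a pair of trajectory segments is the sigmoid of their predicted return difference. A minimal sketch under that assumption (all names illustrative):

```python
import torch

def preference_prob(reward_model, seg_a, seg_b):
    """Bradley-Terry style probability that segment A is preferred over B,
    using the summed predicted rewards of each segment."""
    r_a = reward_model(seg_a).sum()
    r_b = reward_model(seg_b).sum()
    return torch.sigmoid(r_a - r_b)  # soft label in [0, 1]
```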
no code implementations • 1 Jun 2023 • Lu Li, Jiafei Lyu, Guozheng Ma, Zilin Wang, Zhenjie Yang, Xiu Li, Zhiheng Li
Though normalization techniques have achieved great success in supervised and unsupervised learning, their application in visual RL remains scarce.
no code implementations • 29 May 2023 • Jiafei Lyu, Le Wan, Zongqing Lu, Xiu Li
Empirical results show that SMR significantly boosts the sample efficiency of the base methods across most of the evaluated tasks without any hyperparameter tuning or additional tricks.
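A minimal sketch, assuming SMR stands for sample multiple reuse, i.e. performing several gradient updates on each sampled mini-batch rather than one; the `agent`/`replay_buffer` interfaces and the reuse count are illustrative:

```python
def smr_update(agent, replay_buffer, batch_size=256, reuse=5):
    """Reuse one sampled mini-batch for several consecutive updates
    instead of sampling a fresh batch for every gradient step."""
    batch = replay_buffer.sample(batch_size)
    for _ in range(reuse):  # M reuses of the same batch
        agent.update(batch)
```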
1 code implementation • 10 Apr 2023 • Junjie Zhang, Jiafei Lyu, Xiaoteng Ma, Jiangpeng Yan, Jun Yang, Le Wan, Xiu Li
To empirically show the advantages of TATU, we first combine it with two classical model-based offline RL algorithms, MOPO and COMBO.
no code implementations • 9 Oct 2022 • Jiafei Lyu, Aicheng Gong, Le Wan, Zongqing Lu, Xiu Li
We present state advantage weighting for offline reinforcement learning (RL).
1 code implementation • 16 Jun 2022 • Jiafei Lyu, Xiu Li, Zongqing Lu
Model-based RL methods offer a richer dataset and benefit generalization by generating imaginary trajectories with either a trained forward or a reverse dynamics model.
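A minimal sketch of generating such imaginary transitions with a learned forward dynamics model (a reverse model would analogously roll backward from a state); the interfaces here are illustrative assumptions:

```python
def imagine_rollout(policy, dynamics_model, start_states, horizon=5):
    """Generate synthetic transitions by rolling a learned forward model,
    augmenting the real dataset with imaginary trajectories."""
    s, transitions = start_states, []
    for _ in range(horizon):
        a = policy(s)
        next_s, r = dynamics_model(s, a)  # learned model predicts s', r
        transitions.append((s, a, r, next_s))
        s = next_s
    return transitions
```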
3 code implementations • 9 Jun 2022 • Jiafei Lyu, Xiaoteng Ma, Xiu Li, Zongqing Lu
The distribution shift between the learned policy and the behavior policy makes it necessary for the value function to stay conservative such that out-of-distribution (OOD) actions will not be severely overestimated.
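One common way to keep the value function conservative on OOD actions is to penalize the bootstrap target by the disagreement of a critic ensemble; this generic pattern, sketched below, is an illustration rather than this paper's specific mechanism:

```python
import torch

def conservative_target(critics, next_s, next_a, reward, gamma=0.99, beta=1.0):
    """Penalize the bootstrap target by the std. across a critic ensemble,
    so poorly supported (OOD) actions receive pessimistic values."""
    qs = torch.stack([q(next_s, next_a) for q in critics])  # [ensemble, batch]
    return reward + gamma * (qs.mean(0) - beta * qs.std(0))
```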
1 code implementation • 21 Dec 2021 • Jiafei Lyu, Yu Yang, Jiangpeng Yan, Xiu Li
It is vital to accurately estimate the value function in Deep Reinforcement Learning (DRL) so that the agent can execute proper actions instead of suboptimal ones.
1 code implementation • 6 Jun 2021 • Jiafei Lyu, Xiaoteng Ma, Jiangpeng Yan, Xiu Li
First, we uncover and demonstrate the bias alleviation property of double actors by building them upon a single critic and double critics to handle the overestimation bias in DDPG and the underestimation bias in TD3, respectively.
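For reference, the biases in question stem from how the bootstrap target is formed: DDPG's single-critic target tends to overestimate, while TD3's clipped double-Q target (the min of two critics) underestimates. A minimal sketch of the two targets, with illustrative names:

```python
import torch

def ddpg_target(q, actor_target, next_s, reward, gamma=0.99):
    """Single-critic target: max-like bootstrapping tends to overestimate."""
    return reward + gamma * q(next_s, actor_target(next_s))

def td3_target(q1, q2, actor_target, next_s, reward, gamma=0.99):
    """Clipped double-Q target: taking the min of two critics curbs
    overestimation but can underestimate instead."""
    a = actor_target(next_s)
    return reward + gamma * torch.min(q1(next_s, a), q2(next_s, a))
```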
no code implementations • 25 Feb 2021 • Rui Yang, Jiafei Lyu, Yu Yang, Jiangpeng Yan, Feng Luo, Dijun Luo, Lanqing Li, Xiu Li
Two main challenges in multi-goal reinforcement learning are sparse rewards and sample inefficiency.
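One standard remedy for sparse rewards in multi-goal RL is hindsight experience replay (HER), which relabels an episode's goal with a state the agent actually achieved; a minimal sketch of that swapped-in technique, not necessarily this paper's mechanism:

```python
def her_relabel(episode, compute_reward):
    """Hindsight relabeling: treat the final achieved state as the goal,
    turning failed episodes into successful, densely rewarded ones."""
    achieved_goal = episode[-1]["achieved_goal"]
    relabeled = []
    for step in episode:
        r = compute_reward(step["achieved_goal"], achieved_goal)
        relabeled.append({**step, "goal": achieved_goal, "reward": r})
    return relabeled
```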