no code implementations • 9 Apr 2024 • Xudong Yu, Chenjia Bai, Hongyi Guo, Changhong Wang, Zhen Wang
Offline Reinforcement Learning (RL) faces distributional shift and unreliable value estimation, especially for out-of-distribution (OOD) actions.
no code implementations • 7 Apr 2024 • Xudong Yu, Chenjia Bai, Haoran He, Changhong Wang, Xuelong Li
Sequential decision-making agents are expected to align with human intents and exhibit versatility across various tasks.
no code implementations • 22 Feb 2024 • Haoran He, Chenjia Bai, Ling Pan, Weinan Zhang, Bin Zhao, Xuelong Li
In the fine-tuning stage, we harness the imagined future videos to guide low-level action learning on a limited set of robot data.
no code implementations • 19 Dec 2023 • Jinyi Liu, Zhi Wang, Yan Zheng, Jianye Hao, Chenjia Bai, Junjie Ye, Zhen Wang, Haiyin Piao, Yang Sun
In reinforcement learning, the optimism in the face of uncertainty (OFU) is a mainstream principle for directing exploration towards less explored areas, characterized by higher uncertainty.
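The OFU principle mentioned above is easiest to see in the bandit setting: act as if each action is as good as its confidence interval allows, so poorly explored actions get an uncertainty bonus. A minimal UCB1-style sketch (generic textbook illustration, not this paper's algorithm; the payoff probabilities are made up):

```python
import math
import random

def ucb1_select(counts, means, t, c=2.0):
    """OFU in action: choose the arm maximizing empirical mean plus an
    uncertainty bonus that shrinks as the arm is pulled more often."""
    scores = [
        float("inf") if n == 0 else m + math.sqrt(c * math.log(t) / n)
        for n, m in zip(counts, means)
    ]
    return max(range(len(scores)), key=scores.__getitem__)

# Three-armed bandit with hidden payoff probabilities (illustrative values).
probs = [0.2, 0.5, 0.8]
counts = [0, 0, 0]
means = [0.0, 0.0, 0.0]
random.seed(0)
for t in range(1, 2001):
    a = ucb1_select(counts, means, t)
    r = 1.0 if random.random() < probs[a] else 0.0
    counts[a] += 1
    means[a] += (r - means[a]) / counts[a]  # incremental running average
```

After enough pulls, the bonus concentrates play on the truly best arm while still occasionally revisiting under-explored ones.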
no code implementations • 29 Sep 2023 • Xiaoyu Wen, Xudong Yu, Rui Yang, Chenjia Bai, Zhen Wang
Experimental results illustrate the superiority of RO2O in facilitating stable offline-to-online learning and achieving significant improvement with limited online interactions.
1 code implementation • 29 May 2023 • Haoran He, Chenjia Bai, Hang Lai, Lingxiao Wang, Weinan Zhang
In this paper, we propose a novel single-stage privileged knowledge distillation method called the Historical Information Bottleneck (HIB) to narrow the sim-to-real gap.
1 code implementation • NeurIPS 2023 • Haoran He, Chenjia Bai, Kang Xu, Zhuoran Yang, Weinan Zhang, Dong Wang, Bin Zhao, Xuelong Li
Specifically, we propose Multi-Task Diffusion Model (MTDiff), a diffusion-based method that incorporates Transformer backbones and prompt learning for generative planning and data synthesis in multi-task offline settings.
no code implementations • 28 May 2023 • Kang Xu, Chenjia Bai, Shuang Qiu, Haoran He, Bin Zhao, Zhen Wang, Wei Li, Xuelong Li
Leveraging learned strategies in unfamiliar scenarios is fundamental to human intelligence.
1 code implementation • 8 May 2023 • Rushuai Yang, Chenjia Bai, Hongyi Guo, Siyuan Li, Bin Zhao, Zhen Wang, Peng Liu, Xuelong Li
Under mild assumptions, our objective maximizes the MI between different behaviors based on the same skill, which serves as an upper bound of the previous MI objective.
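The quantity being maximized here is a mutual information (MI) between skills and the behaviors they induce. A small sketch of MI itself, computed exactly on a toy joint distribution (a generic illustration of the quantity skill-discovery objectives target, not this paper's exact objective):

```python
import math

def mutual_information(joint):
    """I(S;Z) = sum_{s,z} p(s,z) * log[ p(s,z) / (p(s) p(z)) ].
    `joint` maps (behavior, skill) pairs to probabilities."""
    ps, pz = {}, {}
    for (s, z), p in joint.items():
        ps[s] = ps.get(s, 0.0) + p
        pz[z] = pz.get(z, 0.0) + p
    return sum(
        p * math.log(p / (ps[s] * pz[z]))
        for (s, z), p in joint.items() if p > 0
    )

# Each skill yields a distinct behavior -> MI is maximal (log 2 here).
diverse = {("left", 0): 0.5, ("right", 1): 0.5}
# Skills induce the same behavior distribution -> MI is zero.
collapsed = {("left", 0): 0.25, ("right", 0): 0.25,
             ("left", 1): 0.25, ("right", 1): 0.25}
```

High MI means observing a behavior identifies the skill that produced it, which is why maximizing it encourages diverse, distinguishable skills.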
1 code implementation • 29 Jul 2022 • Shuang Qiu, Lingxiao Wang, Chenjia Bai, Zhuoran Yang, Zhaoran Wang
Moreover, under the online setting, we propose novel upper confidence bound (UCB)-type algorithms that incorporate such a contrastive loss with online RL algorithms for MDPs or MGs.
1 code implementation • 6 Jun 2022 • Rui Yang, Chenjia Bai, Xiaoteng Ma, Zhaoran Wang, Chongjie Zhang, Lei Han
Offline reinforcement learning (RL) provides a promising direction to exploit massive amounts of offline data for complex decision-making tasks.
1 code implementation • ICLR 2022 • Chenjia Bai, Lingxiao Wang, Zhuoran Yang, Zhihong Deng, Animesh Garg, Peng Liu, Zhaoran Wang
We show that such OOD sampling and pessimistic bootstrapping yield a provable uncertainty quantifier in linear MDPs, thus providing the theoretical underpinning for PBRL.
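The pessimism principle behind this line of work can be sketched with an ensemble of Q-functions: penalize the mean estimate by the ensemble's disagreement, which is large exactly where the data gives little support (a minimal sketch of the general idea, not the paper's full PBRL method; the Q-values below are made up):

```python
import numpy as np

def pessimistic_q(q_ensemble, beta=1.0):
    """Lower-confidence-bound value: mean Q minus beta times the ensemble
    standard deviation, a proxy for epistemic uncertainty."""
    q = np.asarray(q_ensemble)            # shape: (num_members, batch)
    return q.mean(axis=0) - beta * q.std(axis=0)

# In-distribution actions: ensemble members agree -> small penalty.
q_in = [[1.0, 2.0], [1.1, 2.1], [0.9, 1.9]]
# OOD actions: ensemble members disagree -> large penalty.
q_ood = [[1.0, 2.0], [3.0, 0.0], [-1.0, 4.0]]
```

With the same mean value, the OOD estimate is driven far below the in-distribution one, discouraging the policy from exploiting unsupported actions.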
1 code implementation • 24 Oct 2021 • Zhihong Deng, Zuyue Fu, Lingxiao Wang, Zhuoran Yang, Chenjia Bai, Tianyi Zhou, Zhaoran Wang, Jing Jiang
Offline reinforcement learning (RL) harnesses the power of massive datasets for solving sequential decision-making problems.
1 code implementation • NeurIPS 2021 • Chenjia Bai, Lingxiao Wang, Lei Han, Animesh Garg, Jianye Hao, Peng Liu, Zhaoran Wang
Exploration methods based on pseudo-counts of transitions or curiosity about dynamics have achieved promising results in solving reinforcement learning with sparse rewards.
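Count-based exploration of the kind mentioned here grants an intrinsic bonus that decays with how often a state has been visited, e.g. r_int(s) = beta / sqrt(N(s)). A minimal sketch assuming a discretizable state space (generic illustration, not this paper's method):

```python
import math
from collections import Counter

class CountBonus:
    """Intrinsic reward r_int(s) = beta / sqrt(N(s)): novel states earn
    a large bonus, frequently visited states a vanishing one."""
    def __init__(self, beta=0.1):
        self.beta = beta
        self.counts = Counter()

    def bonus(self, state):
        self.counts[state] += 1
        return self.beta / math.sqrt(self.counts[state])

b = CountBonus(beta=0.1)
first = b.bonus((0, 0))        # first visit: full bonus
for _ in range(99):
    b.bonus((0, 0))
later = b.bonus((0, 0))        # 101st visit: much smaller bonus
```

Pseudo-count methods generalize this idea to large or continuous spaces by deriving N(s) from a learned density model instead of a table.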
no code implementations • 29 Sep 2021 • Jinyi Liu, Zhi Wang, Yan Zheng, Jianye Hao, Junjie Ye, Chenjia Bai, Pengyi Li
Many exploration strategies are built upon the optimism in the face of uncertainty (OFU) principle for reinforcement learning.
no code implementations • 14 Sep 2021 • Jianye Hao, Tianpei Yang, Hongyao Tang, Chenjia Bai, Jinyi Liu, Zhaopeng Meng, Peng Liu, Zhen Wang
In addition to algorithmic analysis, we provide a comprehensive and unified empirical comparison of different exploration methods for DRL on a set of commonly used benchmarks.
1 code implementation • 13 May 2021 • Chenjia Bai, Lingxiao Wang, Lei Han, Jianye Hao, Animesh Garg, Peng Liu, Zhaoran Wang
In this paper, we propose a principled exploration method for DRL through Optimistic Bootstrapping and Backward Induction (OB2I).
no code implementations • 1 Jan 2021 • Chenjia Bai, Lingxiao Wang, Peng Liu, Zhaoran Wang, Jianye Hao, Yingnan Zhao
However, such an approach is challenging to turn into practical exploration algorithms for Deep Reinforcement Learning (DRL).
no code implementations • 17 Oct 2020 • Chenjia Bai, Peng Liu, Kaiyu Liu, Lingxiao Wang, Yingnan Zhao, Lei Han
Efficient exploration remains a challenging problem in reinforcement learning, especially for tasks where extrinsic rewards from environments are sparse or even entirely absent.