Search Results for author: Shaohui Peng

Found 14 papers, 5 papers with code

Prompt-based Visual Alignment for Zero-shot Policy Transfer

no code implementations 5 Jun 2024 Haihan Gao, Rui Zhang, Qi Yi, Hantao Yao, Haochen Li, Jiaming Guo, Shaohui Peng, Yunkai Gao, Qicheng Wang, Xing Hu, Yuanbo Wen, Zihao Zhang, Zidong Du, Ling Li, Qi Guo, Yunji Chen

With explicit constraints on semantic information, PVA can learn a unified cross-domain representation under limited access to cross-domain data and achieves strong zero-shot generalization in unseen domains.

Autonomous Driving, Language Modelling, +2 more

Online Prototype Alignment for Few-shot Policy Transfer

1 code implementation 12 Jun 2023 Qi Yi, Rui Zhang, Shaohui Peng, Jiaming Guo, Yunkai Gao, Kaizhao Yuan, Ruizhi Chen, Siming Lan, Xing Hu, Zidong Du, Xishan Zhang, Qi Guo, Yunji Chen

Domain adaptation in reinforcement learning (RL) mainly deals with changes in observations when a policy is transferred to a new environment.

Domain Adaptation, Reinforcement Learning (RL)

ANPL: Towards Natural Programming with Interactive Decomposition

1 code implementation NeurIPS 2023 Di Huang, Ziyuan Nan, Xing Hu, Pengwei Jin, Shaohui Peng, Yuanbo Wen, Rui Zhang, Zidong Du, Qi Guo, Yewen Pu, Yunji Chen

We deploy ANPL on the Abstraction and Reasoning Corpus (ARC), a set of unique tasks that are challenging for state-of-the-art AI systems, showing that it outperforms baseline programming systems that (a) lack the ability to decompose tasks interactively and (b) lack the guarantee that the modules can be correctly composed together.

ARC, Code Generation, +2 more

Conceptual Reinforcement Learning for Language-Conditioned Tasks

no code implementations 9 Mar 2023 Shaohui Peng, Xing Hu, Rui Zhang, Jiaming Guo, Qi Yi, Ruizhi Chen, Zidong Du, Ling Li, Qi Guo, Yunji Chen

Language-conditioned policies have recently been proposed to facilitate policy transfer by learning a joint representation of observation and text that captures compact, invariant information across environments.

Deep Reinforcement Learning, reinforcement-learning, +1 more

Object-Category Aware Reinforcement Learning

no code implementations 13 Oct 2022 Qi Yi, Rui Zhang, Shaohui Peng, Jiaming Guo, Xing Hu, Zidong Du, Xishan Zhang, Qi Guo, Yunji Chen

Object-oriented reinforcement learning (OORL) is a promising way to improve sample efficiency and generalization over standard RL.

Feature Engineering, Object, +4 more

Causality-driven Hierarchical Structure Discovery for Reinforcement Learning

no code implementations 13 Oct 2022 Shaohui Peng, Xing Hu, Rui Zhang, Ke Tang, Jiaming Guo, Qi Yi, Ruizhi Chen, Xishan Zhang, Zidong Du, Ling Li, Qi Guo, Yunji Chen

To address this issue, we propose CDHRL, a causality-driven hierarchical reinforcement learning framework that leverages causality-driven discovery instead of randomness-driven exploration to effectively build high-quality hierarchical structures in complicated environments.

Hierarchical Reinforcement Learning, Minecraft, +3 more

Hindsight Value Function for Variance Reduction in Stochastic Dynamic Environment

1 code implementation 26 Jul 2021 Jiaming Guo, Rui Zhang, Xishan Zhang, Shaohui Peng, Qi Yi, Zidong Du, Xing Hu, Qi Guo, Yunji Chen

In this paper, we propose to replace the state value function with a novel hindsight value function, which leverages the information from the future to reduce the variance of the gradient estimate for stochastic dynamic environments.

Deep Reinforcement Learning, Policy Gradient Methods
