Search Results for author: Pushi Zhang

Found 6 papers, 1 paper with code

Preference-conditioned Pixel-based AI Agent For Game Testing

no code implementations • 18 Aug 2023 • Sherif Abdelfattah, Adrian Brown, Pushi Zhang

This paper addresses these limitations by proposing an agent design that mainly depends on pixel-based state observations while exploring the environment conditioned on a user's preference specified by demonstration trajectories.

Imitation Learning

Asking Before Action: Gather Information in Embodied Decision Making with Language Models

no code implementations • 25 May 2023 • Xiaoyu Chen, Shenao Zhang, Pushi Zhang, Li Zhao, Jianyu Chen

With strong capabilities of reasoning and a generic understanding of the world, Large Language Models (LLMs) have shown great potential in building versatile embodied decision making agents capable of performing diverse tasks.

Imitation Learning

An Adaptive Deep RL Method for Non-Stationary Environments with Piecewise Stable Context

no code implementations • 24 Dec 2022 • Xiaoyu Chen, Xiangming Zhu, Yufeng Zheng, Pushi Zhang, Li Zhao, Wenxue Cheng, Peng Cheng, Yongqiang Xiong, Tao Qin, Jianyu Chen, Tie-Yan Liu

One of the key challenges in deploying RL to real-world applications is adapting to variations in unknown environment contexts, such as changing terrains in robotic tasks and fluctuating bandwidth in congestion control.

Distributional Reinforcement Learning for Multi-Dimensional Reward Functions

1 code implementation • NeurIPS 2021 • Pushi Zhang, Xiaoyu Chen, Li Zhao, Wei Xiong, Tao Qin, Tie-Yan Liu

To fully inherit the benefits of distributional RL and hybrid reward architectures, we introduce Multi-Dimensional Distributional DQN (MD3QN), which extends distributional RL to model the joint return distribution from multiple reward sources.

Distributional Reinforcement Learning, reinforcement-learning (+1 more)

Demonstration Actor Critic

no code implementations • 25 Sep 2019 • Guoqing Liu, Li Zhao, Pushi Zhang, Jiang Bian, Tao Qin, Nenghai Yu, Tie-Yan Liu

One approach leverages demonstration data in a supervised manner, which is simple and direct, but can only provide supervision signal over those states seen in the demonstrations.

Independence-aware Advantage Estimation

no code implementations • 25 Sep 2019 • Pushi Zhang, Li Zhao, Guoqing Liu, Jiang Bian, Minglie Huang, Tao Qin, Tie-Yan Liu

Most existing advantage function estimation methods in reinforcement learning suffer from high variance, which scales unfavorably with the time horizon.
