Search Results for author: Qianchuan Zhao

Found 16 papers, 6 papers with code

No Prior Mask: Eliminate Redundant Action for Deep Reinforcement Learning

1 code implementation 11 Dec 2023 Dianyu Zhong, Yiqin Yang, Qianchuan Zhao

The large action space is one fundamental obstacle to deploying Reinforcement Learning methods in the real world.

reinforcement-learning

Never Explore Repeatedly in Multi-Agent Reinforcement Learning

no code implementations 19 Aug 2023 Chenghao Li, Tonghan Wang, Chongjie Zhang, Qianchuan Zhao

In the realm of multi-agent reinforcement learning, intrinsic motivations have emerged as a pivotal tool for exploration.

Multi-agent Reinforcement Learning reinforcement-learning +2

Learning Diverse Risk Preferences in Population-based Self-play

1 code implementation 19 May 2023 Yuhua Jiang, Qihan Liu, Xiaoteng Ma, Chenghao Li, Yiqin Yang, Jun Yang, Bin Liang, Qianchuan Zhao

In this paper, we aim to introduce diversity from the perspective that agents could have diverse risk preferences in the face of uncertainty.

reinforcement-learning Reinforcement Learning (RL)

The Provable Benefits of Unsupervised Data Sharing for Offline Reinforcement Learning

no code implementations 27 Feb 2023 Hao Hu, Yiqin Yang, Qianchuan Zhao, Chongjie Zhang

Self-supervised methods have become crucial for advancing deep learning by leveraging data itself to reduce the need for expensive annotations.

Offline RL reinforcement-learning +1

Distributionally Robust Offline Reinforcement Learning with Linear Function Approximation

no code implementations 14 Sep 2022 Xiaoteng Ma, Zhipeng Liang, Jose Blanchet, Mingwen Liu, Li Xia, Jiheng Zhang, Qianchuan Zhao, Zhengyuan Zhou

Among the reasons hindering reinforcement learning (RL) applications to real-world problems, two factors are critical: limited data and the mismatch between the testing environment (real environment in which the policy is deployed) and the training environment (e.g., a simulator).

Offline RL reinforcement-learning +1

Mean-Semivariance Policy Optimization via Risk-Averse Reinforcement Learning

no code implementations 15 Jun 2022 Xiaoteng Ma, Shuai Ma, Li Xia, Qianchuan Zhao

Keeping risk under control is often more crucial than maximizing expected rewards in real-world decision-making situations, such as finance, robotics, and autonomous driving.

Autonomous Driving Continuous Control +3

On the Role of Discount Factor in Offline Reinforcement Learning

no code implementations 7 Jun 2022 Hao Hu, Yiqin Yang, Qianchuan Zhao, Chongjie Zhang

The discount factor, $\gamma$, plays a vital role in improving online RL sample efficiency and estimation accuracy, but the role of the discount factor in offline RL is not well explored.

D4RL Offline RL +2

Offline Reinforcement Learning with Value-based Episodic Memory

1 code implementation ICLR 2022 Xiaoteng Ma, Yiqin Yang, Hao Hu, Qihan Liu, Jun Yang, Chongjie Zhang, Qianchuan Zhao, Bin Liang

Offline reinforcement learning (RL) shows promise of applying RL to real-world problems by effectively utilizing previously collected data.

D4RL Offline RL +2

Average-Reward Reinforcement Learning with Trust Region Methods

no code implementations 7 Jun 2021 Xiaoteng Ma, Xiaohang Tang, Li Xia, Jun Yang, Qianchuan Zhao

Our work provides a unified framework of the trust region approach including both the discounted and average criteria, which may complement the framework of reinforcement learning beyond the discounted objectives.

Continuous Control reinforcement-learning +1

Modeling the Interaction between Agents in Cooperative Multi-Agent Reinforcement Learning

no code implementations 10 Feb 2021 Xiaoteng Ma, Yiqin Yang, Chenghao Li, Yiwen Lu, Qianchuan Zhao, Yang Jun

Value-based methods of multi-agent reinforcement learning (MARL), especially the value decomposition methods, have been demonstrated on a range of challenging cooperative tasks.

Continuous Control Multi-agent Reinforcement Learning +2

SOAC: The Soft Option Actor-Critic Architecture

no code implementations 25 Jun 2020 Chenghao Li, Xiaoteng Ma, Chongjie Zhang, Jun Yang, Li Xia, Qianchuan Zhao

In these tasks, our approach learns a diverse set of options, each of whose state-action space has strong coherence.

Transfer Learning

DSAC: Distributional Soft Actor Critic for Risk-Sensitive Reinforcement Learning

no code implementations 30 Apr 2020 Xiaoteng Ma, Li Xia, Zhengyuan Zhou, Jun Yang, Qianchuan Zhao

In this paper, we present a new reinforcement learning (RL) algorithm called Distributional Soft Actor Critic (DSAC), which exploits the distributional information of accumulated rewards to achieve better performance.

Continuous Control reinforcement-learning +1
