Search Results for author: Qianchuan Zhao

Found 16 papers, 6 papers with code

No Prior Mask: Eliminate Redundant Action for Deep Reinforcement Learning

1 code implementation • 11 Dec 2023 • Dianyu Zhong, Yiqin Yang, Qianchuan Zhao

The large action space is one fundamental obstacle to deploying Reinforcement Learning methods in the real world.

Never Explore Repeatedly in Multi-Agent Reinforcement Learning

no code implementations • 19 Aug 2023 • Chenghao Li, Tonghan Wang, Chongjie Zhang, Qianchuan Zhao

In the realm of multi-agent reinforcement learning, intrinsic motivations have emerged as a pivotal tool for exploration.

Multi-agent Reinforcement Learning reinforcement-learning +2

Learning Diverse Risk Preferences in Population-based Self-play

1 code implementation • 19 May 2023 • Yuhua Jiang, Qihan Liu, Xiaoteng Ma, Chenghao Li, Yiqin Yang, Jun Yang, Bin Liang, Qianchuan Zhao

In this paper, we aim to introduce diversity from the perspective that agents could have diverse risk preferences in the face of uncertainty.

reinforcement-learning Reinforcement Learning (RL)

The Provable Benefits of Unsupervised Data Sharing for Offline Reinforcement Learning

no code implementations • 27 Feb 2023 • Hao Hu, Yiqin Yang, Qianchuan Zhao, Chongjie Zhang

Self-supervised methods have become crucial for advancing deep learning by leveraging data itself to reduce the need for expensive annotations.

Offline RL reinforcement-learning +1

Flow to Control: Offline Reinforcement Learning with Lossless Primitive Discovery

no code implementations • 2 Dec 2022 • Yiqin Yang, Hao Hu, Wenzhe Li, Siyuan Li, Jun Yang, Qianchuan Zhao, Chongjie Zhang

We show that such lossless primitives can drastically improve the performance of hierarchical policies.

D4RL reinforcement-learning +1

Distributionally Robust Offline Reinforcement Learning with Linear Function Approximation

no code implementations • 14 Sep 2022 • Xiaoteng Ma, Zhipeng Liang, Jose Blanchet, Mingwen Liu, Li Xia, Jiheng Zhang, Qianchuan Zhao, Zhengyuan Zhou

Among the reasons hindering reinforcement learning (RL) applications to real-world problems, two factors are critical: limited data and the mismatch between the testing environment (the real environment in which the policy is deployed) and the training environment (e.g., a simulator).

Offline RL reinforcement-learning +1

Mean-Semivariance Policy Optimization via Risk-Averse Reinforcement Learning

no code implementations • 15 Jun 2022 • Xiaoteng Ma, Shuai Ma, Li Xia, Qianchuan Zhao

Keeping risk under control is often more crucial than maximizing expected rewards in real-world decision-making situations, such as finance, robotics, autonomous driving, etc.

Autonomous Driving Continuous Control +3

On the Role of Discount Factor in Offline Reinforcement Learning

no code implementations • 7 Jun 2022 • Hao Hu, Yiqin Yang, Qianchuan Zhao, Chongjie Zhang

The discount factor, $\gamma$, plays a vital role in improving online RL sample efficiency and estimation accuracy, but the role of the discount factor in offline RL is not well explored.

D4RL Offline RL +2

Offline Reinforcement Learning with Value-based Episodic Memory

1 code implementation • ICLR 2022 • Xiaoteng Ma, Yiqin Yang, Hao Hu, Qihan Liu, Jun Yang, Chongjie Zhang, Qianchuan Zhao, Bin Liang

Offline reinforcement learning (RL) shows promise of applying RL to real-world problems by effectively utilizing previously collected data.

D4RL Offline RL +2

MPSN: Motion-aware Pseudo Siamese Network for Indoor Video Head Detection in Buildings

1 code implementation • 7 Oct 2021 • Kailai Sun, Xiaoteng Ma, Peng Liu, Qianchuan Zhao

Head detection in the indoor video is an essential component of building occupancy detection.

Head Detection Model Selection +1

Average-Reward Reinforcement Learning with Trust Region Methods

no code implementations • 7 Jun 2021 • Xiaoteng Ma, Xiaohang Tang, Li Xia, Jun Yang, Qianchuan Zhao

Our work provides a unified framework of the trust region approach including both the discounted and average criteria, which may complement the framework of reinforcement learning beyond the discounted objectives.

Continuous Control reinforcement-learning +1

Believe What You See: Implicit Constraint Approach for Offline Multi-Agent Reinforcement Learning

1 code implementation • NeurIPS 2021 • Yiqin Yang, Xiaoteng Ma, Chenghao Li, Zewu Zheng, Qiyuan Zhang, Gao Huang, Jun Yang, Qianchuan Zhao

Moreover, we extend ICQ to multi-agent tasks by decomposing the joint-policy under the implicit constraint.

Multi-agent Reinforcement Learning Offline RL +5

Celebrating Diversity in Shared Multi-Agent Reinforcement Learning

2 code implementations • NeurIPS 2021 • Chenghao Li, Tonghan Wang, Chengjie Wu, Qianchuan Zhao, Jun Yang, Chongjie Zhang

Recently, deep multi-agent reinforcement learning (MARL) has shown the promise to solve complex cooperative tasks.

Multi-agent Reinforcement Learning reinforcement-learning +3

Modeling the Interaction between Agents in Cooperative Multi-Agent Reinforcement Learning

no code implementations • 10 Feb 2021 • Xiaoteng Ma, Yiqin Yang, Chenghao Li, Yiwen Lu, Qianchuan Zhao, Yang Jun

Value-based methods of multi-agent reinforcement learning (MARL), especially the value decomposition methods, have been demonstrated on a range of challenging cooperative tasks.

Continuous Control Multi-agent Reinforcement Learning +2

SOAC: The Soft Option Actor-Critic Architecture

no code implementations • 25 Jun 2020 • Chenghao Li, Xiaoteng Ma, Chongjie Zhang, Jun Yang, Li Xia, Qianchuan Zhao

In these tasks, our approach learns a diverse set of options, each of whose state-action space has strong coherence.

Transfer Learning

DSAC: Distributional Soft Actor Critic for Risk-Sensitive Reinforcement Learning

no code implementations • 30 Apr 2020 • Xiaoteng Ma, Li Xia, Zhengyuan Zhou, Jun Yang, Qianchuan Zhao

In this paper, we present a new reinforcement learning (RL) algorithm called Distributional Soft Actor Critic (DSAC), which exploits the distributional information of accumulated rewards to achieve better performance.

Continuous Control reinforcement-learning +1
