Search Results for author: Yiqin Yang

Found 11 papers, 5 papers with code

Episodic Novelty Through Temporal Distance

no code implementations 26 Jan 2025 Yuhua Jiang, Qihan Liu, Yiqin Yang, Xiaoteng Ma, Dianyu Zhong, Hao Hu, Jun Yang, Bin Liang, Bo Xu, Chongjie Zhang, Qianchuan Zhao

Exploration in sparse reward environments remains a significant challenge in reinforcement learning, particularly in Contextual Markov Decision Processes (CMDPs), where environments differ across episodes.

Contrastive Learning

S-EPOA: Overcoming the Indistinguishability of Segments with Skill-Driven Preference-Based Reinforcement Learning

no code implementations 22 Aug 2024 Ni Mu, Yao Luan, Yiqin Yang, Qing-Shan Jia

Preference-based reinforcement learning (PbRL) stands out by utilizing human preferences as a direct reward signal, eliminating the need for intricate reward engineering.

Bayesian Design Principles for Offline-to-Online Reinforcement Learning

1 code implementation 31 May 2024 Hao Hu, Yiqin Yang, Jianing Ye, Chengjie WU, Ziqing Mai, Yujing Hu, Tangjie Lv, Changjie Fan, Qianchuan Zhao, Chongjie Zhang

In this paper, we tackle the fundamental dilemma of offline-to-online fine-tuning: if the agent remains pessimistic, it may fail to learn a better policy, while if it becomes optimistic directly, performance may suffer from a sudden drop.

Reinforcement Learning

No Prior Mask: Eliminate Redundant Action for Deep Reinforcement Learning

1 code implementation 11 Dec 2023 Dianyu Zhong, Yiqin Yang, Qianchuan Zhao

The large action space is one fundamental obstacle to deploying Reinforcement Learning methods in the real world.

Deep Reinforcement Learning, Reinforcement Learning

Learning Diverse Risk Preferences in Population-based Self-play

1 code implementation 19 May 2023 Yuhua Jiang, Qihan Liu, Xiaoteng Ma, Chenghao Li, Yiqin Yang, Jun Yang, Bin Liang, Qianchuan Zhao

In this paper, we aim to introduce diversity from the perspective that agents could have diverse risk preferences in the face of uncertainty.

Diversity, Reinforcement Learning

The Provable Benefits of Unsupervised Data Sharing for Offline Reinforcement Learning

no code implementations 27 Feb 2023 Hao Hu, Yiqin Yang, Qianchuan Zhao, Chongjie Zhang

Self-supervised methods have become crucial for advancing deep learning by leveraging data itself to reduce the need for expensive annotations.

Offline RL, Reinforcement Learning

On the Role of Discount Factor in Offline Reinforcement Learning

no code implementations 7 Jun 2022 Hao Hu, Yiqin Yang, Qianchuan Zhao, Chongjie Zhang

The discount factor, $\gamma$, plays a vital role in improving online RL sample efficiency and estimation accuracy, but the role of the discount factor in offline RL is not well explored.

D4RL, Offline RL

Offline Reinforcement Learning with Value-based Episodic Memory

1 code implementation ICLR 2022 Xiaoteng Ma, Yiqin Yang, Hao Hu, Qihan Liu, Jun Yang, Chongjie Zhang, Qianchuan Zhao, Bin Liang

Offline reinforcement learning (RL) shows promise of applying RL to real-world problems by effectively utilizing previously collected data.

D4RL, Offline RL

Modeling the Interaction between Agents in Cooperative Multi-Agent Reinforcement Learning

no code implementations 10 Feb 2021 Xiaoteng Ma, Yiqin Yang, Chenghao Li, Yiwen Lu, Qianchuan Zhao, Yang Jun

Value-based methods of multi-agent reinforcement learning (MARL), especially the value decomposition methods, have been demonstrated on a range of challenging cooperative tasks.

Continuous Control
