Search Results for author: Chongjie Zhang

Found 67 papers, 33 papers with code

GOMAA-Geo: GOal Modality Agnostic Active Geo-localization

2 code implementations • 4 Jun 2024 • Anindya Sarkar, Srikumar Sastry, Aleksis Pirinen, Chongjie Zhang, Nathan Jacobs, Yevgeniy Vorobeychik

We consider the task of active geo-localization (AGL) in which an agent uses a sequence of visual cues observed during aerial navigation to find a target specified through multiple possible modalities.

Contrastive Learning • Zero-shot Generalization

Bayesian Design Principles for Offline-to-Online Reinforcement Learning

1 code implementation • 31 May 2024 • Hao Hu, Yiqin Yang, Jianing Ye, Chengjie WU, Ziqing Mai, Yujing Hu, Tangjie Lv, Changjie Fan, Qianchuan Zhao, Chongjie Zhang

In this paper, we tackle the fundamental dilemma of offline-to-online fine-tuning: if the agent remains pessimistic, it may fail to learn a better policy, while if it becomes optimistic directly, performance may suffer from a sudden drop.

reinforcement-learning • Reinforcement Learning (RL)

Efficient Multi-agent Reinforcement Learning by Planning

1 code implementation • 20 May 2024 • Qihan Liu, Jianing Ye, Xiaoteng Ma, Jun Yang, Bin Liang, Chongjie Zhang

Extensive experiments on the SMAC benchmark demonstrate that MAZero outperforms model-free approaches in terms of sample efficiency and provides comparable or better performance than existing model-based methods in terms of both sample and computational efficiency.

Computational Efficiency • Model-based Reinforcement Learning • +2

Leveraging Hyperbolic Embeddings for Coarse-to-Fine Robot Design

no code implementations • 1 Nov 2023 • Heng Dong, Junyu Zhang, Chongjie Zhang

Multi-cellular robot design aims to create robots composed of numerous cells that can be efficiently controlled to perform diverse tasks.

Towards Robust Offline Reinforcement Learning under Diverse Data Corruption

2 code implementations • 19 Oct 2023 • Rui Yang, Han Zhong, Jiawei Xu, Amy Zhang, Chongjie Zhang, Lei Han, Tong Zhang

Offline reinforcement learning (RL) presents a promising approach for learning effective policies from offline datasets without the need for costly or unsafe interactions with the environment.

Offline RL • Q-Learning • +2

Imitation Learning from Observation with Automatic Discount Scheduling

no code implementations • 11 Oct 2023 • Yuyang Liu, Weijun Dong, Yingdong Hu, Chuan Wen, Zhao-Heng Yin, Chongjie Zhang, Yang Gao

Nonetheless, we identify that tasks characterized by a progress dependency property pose significant challenges for such approaches; in these tasks, the agent needs to initially learn the expert's preceding behaviors before mastering the subsequent ones.

Imitation Learning • reinforcement-learning • +1

Never Explore Repeatedly in Multi-Agent Reinforcement Learning

no code implementations • 19 Aug 2023 • Chenghao Li, Tonghan Wang, Chongjie Zhang, Qianchuan Zhao

In the realm of multi-agent reinforcement learning, intrinsic motivations have emerged as a pivotal tool for exploration.

Multi-agent Reinforcement Learning • reinforcement-learning • +2

IOB: Integrating Optimization Transfer and Behavior Transfer for Multi-Policy Reuse

no code implementations • 14 Aug 2023 • Siyuan Li, Hao Li, Jin Zhang, Zhen Wang, Peng Liu, Chongjie Zhang

Humans have the ability to reuse previously learned policies to solve new tasks quickly, and reinforcement learning (RL) agents can do the same by transferring knowledge from source policies to a related target task.

Continual Learning • Reinforcement Learning (RL)

Learning to Solve Tasks with Exploring Prior Behaviours

1 code implementation • 6 Jul 2023 • Ruiqi Zhu, Siyuan Li, Tianhong Dai, Chongjie Zhang, Oya Celiktutan

Our method endows agents with the ability to explore and acquire the required prior behaviours, and then connect them to the task-specific behaviours in the demonstration to solve sparse-reward tasks, without requiring additional demonstrations of the prior behaviours.

Symmetry-Aware Robot Design with Structured Subgroups

1 code implementation • 31 May 2023 • Heng Dong, Junyu Zhang, Tonghan Wang, Chongjie Zhang

Robot design aims at learning to create robots that can be easily controlled and perform tasks efficiently.

What is Essential for Unseen Goal Generalization of Offline Goal-conditioned RL?

1 code implementation • 30 May 2023 • Rui Yang, Yong Lin, Xiaoteng Ma, Hao Hu, Chongjie Zhang, Tong Zhang

In this paper, we study out-of-distribution (OOD) generalization of offline GCRL both theoretically and empirically to identify which factors are important.

Imitation Learning • Offline RL

The Provable Benefits of Unsupervised Data Sharing for Offline Reinforcement Learning

no code implementations • 27 Feb 2023 • Hao Hu, Yiqin Yang, Qianchuan Zhao, Chongjie Zhang

Self-supervised methods have become crucial for advancing deep learning by leveraging data itself to reduce the need for expensive annotations.

Offline RL • reinforcement-learning • +1

A Survey on Transformers in Reinforcement Learning

no code implementations • 8 Jan 2023 • Wenzhe Li, Hao Luo, Zichuan Lin, Chongjie Zhang, Zongqing Lu, Deheng Ye

Transformer has been considered the dominant neural architecture in NLP and CV, mostly under supervised settings.

reinforcement-learning • Reinforcement Learning (RL)

Low-Rank Modular Reinforcement Learning via Muscle Synergy

1 code implementation • 26 Oct 2022 • Heng Dong, Tonghan Wang, Jiayuan Liu, Chongjie Zhang

Modular Reinforcement Learning (RL) decentralizes the control of multi-joint robots by learning policies for each actuator.

reinforcement-learning • Reinforcement Learning (RL)

Non-Linear Coordination Graphs

no code implementations • 26 Oct 2022 • Yipeng Kang, Tonghan Wang, Xiaoran Wu, Qianlan Yang, Chongjie Zhang

Value decomposition multi-agent reinforcement learning methods learn the global value function as a mixing of each agent's individual utility functions.

Multi-agent Reinforcement Learning
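The mixing idea in the snippet above can be illustrated with the simplest (additive, VDN-style) decomposition, where the greedy joint action factorizes into per-agent argmaxes; the random utilities and dimensions below are stand-ins for the example, and coordination-graph methods like this paper's generalize the mixing with pairwise payoff terms:

```python
from itertools import product

import numpy as np

# Per-agent utilities Q_i(s, a_i); random stand-ins for illustration.
n_agents, n_actions = 3, 4
rng = np.random.default_rng(1)
utilities = rng.normal(size=(n_agents, n_actions))

def global_q(joint_action):
    # Additive mixing: Q_tot(s, a) = sum_i Q_i(s, a_i).
    return sum(utilities[i, a] for i, a in enumerate(joint_action))

# Decentralized greedy selection: each agent maximizes its own utility.
greedy_joint = tuple(int(np.argmax(utilities[i])) for i in range(n_agents))

# Under additive mixing this matches the brute-force joint maximum.
best = max(product(range(n_actions), repeat=n_agents), key=global_q)
assert best == greedy_joint
```

Non-linear mixers trade away this cheap factorized argmax for representational capacity, which is exactly the tension coordination-graph approaches target.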

CUP: Critic-Guided Policy Reuse

1 code implementation • 15 Oct 2022 • Jin Zhang, Siyuan Li, Chongjie Zhang

The ability to reuse previous policies is an important aspect of human intelligence.

On the Role of Discount Factor in Offline Reinforcement Learning

no code implementations • 7 Jun 2022 • Hao Hu, Yiqin Yang, Qianchuan Zhao, Chongjie Zhang

The discount factor, $\gamma$, plays a vital role in improving online RL sample efficiency and estimation accuracy, but the role of the discount factor in offline RL is not well explored.

D4RL • Offline RL • +2
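The effective-horizon role of the discount factor mentioned above can be seen with a toy calculation; the constant reward stream is a made-up example:

```python
# Hypothetical reward stream: +1 per step for 100 steps.
rewards = [1.0] * 100

def discounted_return(rewards, gamma):
    # G = sum_t gamma^t * r_t, accumulated back-to-front.
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# A lower gamma shrinks the effective horizon toward 1 / (1 - gamma),
# so distant rewards contribute almost nothing to the return.
for gamma in (0.9, 0.99):
    print(gamma, round(discounted_return(rewards, gamma), 2))
```

With gamma = 0.9 the 100-step return is already close to its infinite-horizon limit of 10, which is why the choice of discount acts as a form of regularization in the offline setting.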

RORL: Robust Offline Reinforcement Learning via Conservative Smoothing

1 code implementation • 6 Jun 2022 • Rui Yang, Chenjia Bai, Xiaoteng Ma, Zhaoran Wang, Chongjie Zhang, Lei Han

Offline reinforcement learning (RL) provides a promising direction to exploit massive amounts of offline data for complex decision-making tasks.

Decision Making • Offline RL • +2

Latent-Variable Advantage-Weighted Policy Optimization for Offline RL

1 code implementation • 16 Mar 2022 • Xi Chen, Ali Ghadirzadeh, Tianhe Yu, Yuan Gao, Jianhao Wang, Wenzhe Li, Bin Liang, Chelsea Finn, Chongjie Zhang

Offline reinforcement learning methods hold the promise of learning policies from pre-collected datasets without the need to query the environment for new transitions.

Continuous Control • Offline RL • +2

Multi-Agent Policy Transfer via Task Relationship Modeling

no code implementations • 9 Mar 2022 • Rongjun Qin, Feng Chen, Tonghan Wang, Lei Yuan, Xiaoran Wu, Zongzhang Zhang, Chongjie Zhang, Yang Yu

We demonstrate that the task representation can capture the relationship among tasks, and can generalize to unseen tasks.

Transfer Learning

Rethinking Goal-conditioned Supervised Learning and Its Connection to Offline RL

1 code implementation • ICLR 2022 • Rui Yang, Yiming Lu, Wenzhe Li, Hao Sun, Meng Fang, Yali Du, Xiu Li, Lei Han, Chongjie Zhang

In this paper, we revisit the theoretical property of GCSL -- optimizing a lower bound of the goal reaching objective, and extend GCSL as a novel offline goal-conditioned RL algorithm.

Offline RL • Reinforcement Learning (RL) • +1

MOORe: Model-based Offline-to-Online Reinforcement Learning

no code implementations • 25 Jan 2022 • Yihuan Mao, Chao Wang, Bin Wang, Chongjie Zhang

With the success of offline reinforcement learning (RL), offline trained RL policies have the potential to be further improved when deployed online.

D4RL • reinforcement-learning • +1

Self-Organized Polynomial-Time Coordination Graphs

1 code implementation • 7 Dec 2021 • Qianlan Yang, Weijun Dong, Zhizhou Ren, Jianhao Wang, Tonghan Wang, Chongjie Zhang

However, one critical challenge in this paradigm is the complexity of greedy action selection with respect to the factorized values.

Computational Efficiency • Multi-agent Reinforcement Learning

Offline Reinforcement Learning with Value-based Episodic Memory

1 code implementation • ICLR 2022 • Xiaoteng Ma, Yiqin Yang, Hao Hu, Qihan Liu, Jun Yang, Chongjie Zhang, Qianchuan Zhao, Bin Liang

Offline reinforcement learning (RL) shows promise of applying RL to real-world problems by effectively utilizing previously collected data.

D4RL • Offline RL • +2

Offline Reinforcement Learning with Reverse Model-based Imagination

1 code implementation • NeurIPS 2021 • Jianhao Wang, Wenzhe Li, Haozhe Jiang, Guangxiang Zhu, Siyuan Li, Chongjie Zhang

These reverse imaginations provide informed data augmentation for model-free policy learning and enable conservative generalization beyond the offline dataset.

Data Augmentation • Offline RL • +2

On the Estimation Bias in Double Q-Learning

1 code implementation • NeurIPS 2021 • Zhizhou Ren, Guangxiang Zhu, Hao Hu, Beining Han, Jianglun Chen, Chongjie Zhang

Double Q-learning is a classical method for reducing overestimation bias, which is caused by taking maximum estimated values in the Bellman operation.

Q-Learning • Value prediction
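The overestimation described in the snippet above comes from using the same values both to select and to evaluate the greedy action; Double Q-learning decouples the two. As a minimal tabular sketch (the state/action counts, step size, and the single transition below are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions, gamma, alpha = 5, 3, 0.99, 0.1

# Two independently updated Q-tables; one selects, the other evaluates.
q_a = np.zeros((n_states, n_actions))
q_b = np.zeros((n_states, n_actions))

def double_q_update(s, a, r, s_next):
    """One tabular Double Q-learning step on transition (s, a, r, s_next)."""
    if rng.random() < 0.5:
        # q_a picks the greedy next action, q_b evaluates it.
        a_star = np.argmax(q_a[s_next])
        target = r + gamma * q_b[s_next, a_star]
        q_a[s, a] += alpha * (target - q_a[s, a])
    else:
        # Roles reversed: q_b selects, q_a evaluates.
        a_star = np.argmax(q_b[s_next])
        target = r + gamma * q_a[s_next, a_star]
        q_b[s, a] += alpha * (target - q_b[s, a])

double_q_update(0, 1, 1.0, 2)
```

Because the evaluator's estimation noise is independent of the selector's, the max no longer systematically inflates the target; the paper analyzes the residual (under)estimation bias this decoupling introduces.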

Safe Opponent-Exploitation Subgame Refinement

no code implementations • 29 Sep 2021 • Mingyang Liu, Chengjie WU, Qihan Liu, Yansen Jing, Jun Yang, Pingzhong Tang, Chongjie Zhang

Search algorithms have been playing a vital role in the success of superhuman AI in both perfect information and imperfect information games.

Learning the Representation of Behavior Styles with Imitation Learning

no code implementations • 29 Sep 2021 • Xiao Liu, Meng Wang, Zhaorong Wang, Yingfeng Chen, Yujing Hu, Changjie Fan, Chongjie Zhang

Imitation learning reproduces expert demonstrations by learning a mapping between observations and actions.

Imitation Learning

Learning Homophilic Incentives in Sequential Social Dilemmas

no code implementations • 29 Sep 2021 • Heng Dong, Tonghan Wang, Jiayuan Liu, Chi Han, Chongjie Zhang

Promoting cooperation among self-interested agents is a long-standing and interdisciplinary problem, yet it has received comparatively little attention in multi-agent reinforcement learning (MARL).

Multi-agent Reinforcement Learning

Context-Aware Sparse Deep Coordination Graphs

1 code implementation • ICLR 2022 • Tonghan Wang, Liang Zeng, Weijun Dong, Qianlan Yang, Yang Yu, Chongjie Zhang

Learning sparse coordination graphs adaptive to the coordination dynamics among agents is a long-standing problem in cooperative multi-agent learning.

graph construction • Graph Learning • +2

Active Hierarchical Exploration with Stable Subgoal Representation Learning

1 code implementation • ICLR 2022 • Siyuan Li, Jin Zhang, Jianhao Wang, Yang Yu, Chongjie Zhang

Although GCHRL possesses superior exploration ability by decomposing tasks via subgoals, existing GCHRL methods struggle in temporally extended tasks with sparse external rewards, since the high-level policy learning relies on external rewards.

Continuous Control • Hierarchical Reinforcement Learning • +1

Birds of a Feather Flock Together: A Close Look at Cooperation Emergence via Multi-Agent RL

no code implementations • 23 Apr 2021 • Heng Dong, Tonghan Wang, Jiayuan Liu, Chi Han, Chongjie Zhang

We propose a novel learning framework to encourage homophilic incentives and show that it achieves stable cooperation in both public-goods and tragedy-of-the-commons SSDs.

Multi-agent Reinforcement Learning

Generalizable Episodic Memory for Deep Reinforcement Learning

1 code implementation • 11 Mar 2021 • Hao Hu, Jianing Ye, Guangxiang Zhu, Zhizhou Ren, Chongjie Zhang

Episodic memory-based methods can rapidly latch onto past successful strategies by a non-parametric memory and improve sample efficiency of traditional reinforcement learning.

Atari Games • Continuous Control • +2

DOP: Off-Policy Multi-Agent Decomposed Policy Gradients

no code implementations • ICLR 2021 • Yihan Wang, Beining Han, Tonghan Wang, Heng Dong, Chongjie Zhang

In this paper, we investigate causes that hinder the performance of MAPG algorithms and present a multi-agent decomposed policy gradient method (DOP).

Multi-agent Reinforcement Learning • Starcraft • +1

Learning Subgoal Representations with Slow Dynamics

no code implementations • ICLR 2021 • Siyuan Li, Lulu Zheng, Jianhao Wang, Chongjie Zhang

In goal-conditioned Hierarchical Reinforcement Learning (HRL), a high-level policy periodically sets subgoals for a low-level policy, and the low-level policy is trained to reach those subgoals.

Continuous Control • Hierarchical Reinforcement Learning • +1

Bridging Imagination and Reality for Model-Based Deep Reinforcement Learning

1 code implementation • NeurIPS 2020 • Guangxiang Zhu, Minghao Zhang, Honglak Lee, Chongjie Zhang

It maximizes the mutual information between imaginary and real trajectories so that the policy improvement learned from imaginary trajectories can be easily generalized to real trajectories.

Model-based Reinforcement Learning • reinforcement-learning • +1

RODE: Learning Roles to Decompose Multi-Agent Tasks

2 code implementations • ICLR 2021 • Tonghan Wang, Tarun Gupta, Anuj Mahajan, Bei Peng, Shimon Whiteson, Chongjie Zhang

Learning a role selector based on action effects makes role discovery much easier because it forms a bi-level learning hierarchy -- the role selector searches in a smaller role space and at a lower temporal resolution, while role policies learn in significantly reduced primitive action-observation spaces.

Clustering • Starcraft • +1

QPLEX: Duplex Dueling Multi-Agent Q-Learning

5 code implementations • ICLR 2021 • Jianhao Wang, Zhizhou Ren, Terry Liu, Yang Yu, Chongjie Zhang

This paper presents a novel MARL approach, called duPLEX dueling multi-agent Q-learning (QPLEX), which takes a duplex dueling network architecture to factorize the joint value function.

Decision Making • Multi-agent Reinforcement Learning • +3

Off-Policy Multi-Agent Decomposed Policy Gradients

1 code implementation • 24 Jul 2020 • Yihan Wang, Beining Han, Tonghan Wang, Heng Dong, Chongjie Zhang

In this paper, we investigate causes that hinder the performance of MAPG algorithms and present a multi-agent decomposed policy gradient method (DOP).

Multi-agent Reinforcement Learning • Starcraft • +1

SOAC: The Soft Option Actor-Critic Architecture

no code implementations • 25 Jun 2020 • Chenghao Li, Xiaoteng Ma, Chongjie Zhang, Jun Yang, Li Xia, Qianchuan Zhao

In these tasks, our approach learns a diverse set of options, each with a strongly coherent state-action space.

Transfer Learning

Towards Understanding Cooperative Multi-Agent Q-Learning with Value Factorization

no code implementations • NeurIPS 2021 • Jianhao Wang, Zhizhou Ren, Beining Han, Jianing Ye, Chongjie Zhang

Value factorization is a popular and promising approach to scaling up multi-agent reinforcement learning in cooperative settings, which balances the learning scalability and the representational capacity of value functions.

counterfactual • Multi-agent Reinforcement Learning • +3

ROMA: Multi-Agent Reinforcement Learning with Emergent Roles

1 code implementation • ICML 2020 • Tonghan Wang, Heng Dong, Victor Lesser, Chongjie Zhang

In this paper, we synergize these two paradigms and propose a role-oriented MARL framework (ROMA).

Multiagent Systems

Influence-Based Multi-Agent Exploration

1 code implementation • ICLR 2020 • Tonghan Wang, Jianhao Wang, Yi Wu, Chongjie Zhang

We present two exploration methods: exploration via information-theoretic influence (EITI) and exploration via decision-theoretic influence (EDTI), by exploiting the role of interaction in coordinated behaviors of agents.

reinforcement-learning • Reinforcement Learning (RL)

Learning Nearly Decomposable Value Functions Via Communication Minimization

1 code implementation • ICLR 2020 • Tonghan Wang, Jianhao Wang, Chongyi Zheng, Chongjie Zhang

Recently, value function factorization has emerged as a promising way to address these challenges in collaborative multi-agent systems.


Hierarchical Reinforcement Learning with Advantage-Based Auxiliary Rewards

1 code implementation • NeurIPS 2019 • Siyuan Li, Rui Wang, Minxue Tang, Chongjie Zhang

We also theoretically prove that optimizing low-level skills with this auxiliary reward increases the task return of the joint policy.

Hierarchical Reinforcement Learning • reinforcement-learning • +1

Object-Oriented Model Learning through Multi-Level Abstraction

no code implementations • ICLR 2019 • Guangxiang Zhu, Jianhao Wang, Zhizhou Ren, Chongjie Zhang

Object-based approaches for learning action-conditioned dynamics have demonstrated promise for generalization and interpretability.

Object • Relational Reasoning • +1

Object-Oriented Dynamics Learning through Multi-Level Abstraction

1 code implementation • 16 Apr 2019 • Guangxiang Zhu, Jianhao Wang, Zhizhou Ren, Zichuan Lin, Chongjie Zhang

We also design a spatial-temporal relational reasoning mechanism for MAOP to support instance-level dynamics learning and handle partial observability.

Object • Relational Reasoning • +1

Convergence of Multi-Agent Learning with a Finite Step Size in General-Sum Games

no code implementations • 7 Mar 2019 • Xinliang Song, Tonghan Wang, Chongjie Zhang

Learning in a multi-agent system is challenging because agents are simultaneously learning and the environment is not stationary, undermining convergence guarantees.

Towards Efficient Detection and Optimal Response against Sophisticated Opponents

no code implementations • 12 Sep 2018 • Tianpei Yang, Zhaopeng Meng, Jianye Hao, Chongjie Zhang, Yan Zheng, Ze Zheng

This paper proposes a novel approach called Bayes-ToMoP which can efficiently detect the strategy of opponents using either stationary or higher-level reasoning strategies.

Multiagent Systems

Context-Aware Policy Reuse

no code implementations • 11 Jun 2018 • Siyuan Li, Fangda Gu, Guangxiang Zhu, Chongjie Zhang

Transfer learning can greatly speed up reinforcement learning for a new task by leveraging policies of relevant tasks.

Transfer Learning

Object-Oriented Dynamics Predictor

1 code implementation • NeurIPS 2018 • Guangxiang Zhu, Zhiao Huang, Chongjie Zhang

Generalization has been one of the major challenges for learning dynamics models in model-based reinforcement learning.

Model-based Reinforcement Learning • Object

An Optimal Online Method of Selecting Source Policies for Reinforcement Learning

no code implementations • 24 Sep 2017 • Siyuan Li, Chongjie Zhang

In this paper, we develop an optimal online method to select source policies for reinforcement learning.

Q-Learning • reinforcement-learning • +3

Fairness in Multi-Agent Sequential Decision-Making

no code implementations • NeurIPS 2014 • Chongjie Zhang, Julie A. Shah

We develop a simple linear programming approach and a more scalable game-theoretic approach for computing an optimal fairness policy.

Decision Making • Fairness
