Search Results for author: Zongzhang Zhang

Found 17 papers, 4 papers with code

Language Model Self-improvement by Reinforcement Learning Contemplation

no code implementations 23 May 2023 Jing-Cheng Pang, Pengyuan Wang, Kaiyuan Li, Xiong-Hui Chen, Jiacheng Xu, Zongzhang Zhang, Yang Yu

We demonstrate that SIRLC can be applied to various NLP tasks, such as reasoning problems, text generation, and machine translation.

Language Modelling Machine Translation +3

Robust Multi-agent Communication via Multi-view Message Certification

no code implementations 7 May 2023 Lei Yuan, Tao Jiang, Lihe Li, Feng Chen, Zongzhang Zhang, Yang Yu

Many multi-agent scenarios require message sharing among agents to promote coordination, which makes the robustness of multi-agent communication a pressing concern when policies are deployed in an environment with message perturbations.

How To Guide Your Learner: Imitation Learning with Active Adaptive Expert Involvement

1 code implementation 3 Mar 2023 Xu-Hui Liu, Feng Xu, Xinyu Zhang, Tianyuan Liu, Shengyi Jiang, Ruifeng Chen, Zongzhang Zhang, Yang Yu

In this paper, we propose a novel active imitation learning framework based on a teacher-student interaction model, in which the teacher's goal is to identify the best teaching behavior and actively affect the student's learning process.

Atari Games Imitation Learning

Multi-agent Dynamic Algorithm Configuration

1 code implementation 13 Oct 2022 Ke Xue, Jiacheng Xu, Lei Yuan, Miqing Li, Chao Qian, Zongzhang Zhang, Yang Yu

MA-DAC formulates the dynamic configuration of a complex algorithm with multiple types of hyperparameters as a contextual multi-agent Markov decision process and solves it by a cooperative multi-agent RL (MARL) algorithm.

Multi-Armed Bandits Reinforcement Learning (RL)

Deep Anomaly Detection and Search via Reinforcement Learning

no code implementations 31 Aug 2022 Chao Chen, Dawei Wang, Feng Mao, Zongzhang Zhang, Yang Yu

Semi-supervised Anomaly Detection (AD) is a data mining task that aims to learn features from partially labeled datasets to help detect outliers.

Ensemble Learning reinforcement-learning +3

Model Generation with Provable Coverability for Offline Reinforcement Learning

no code implementations 1 Jun 2022 Chengxing Jia, Hao Yin, Chenxiao Gao, Tian Xu, Lei Yuan, Zongzhang Zhang, Yang Yu

Model-based offline optimization with dynamics-aware policy provides a new perspective for policy learning and out-of-distribution generalization, where the learned policy could adapt to different dynamics enumerated at the training stage.

Offline RL Out-of-Distribution Generalization +2

Multi-Agent Policy Transfer via Task Relationship Modeling

no code implementations 9 Mar 2022 Rongjun Qin, Feng Chen, Tonghan Wang, Lei Yuan, Xiaoran Wu, Zongzhang Zhang, Chongjie Zhang, Yang Yu

We demonstrate that the task representation can capture the relationship among tasks, and can generalize to unseen tasks.

Transfer Learning

Cross-modal Domain Adaptation for Cost-Efficient Visual Reinforcement Learning

1 code implementation NeurIPS 2021 Xiong-Hui Chen, Shengyi Jiang, Feng Xu, Zongzhang Zhang, Yang Yu

Experiments on MuJoCo and Hand Manipulation Suite tasks show that agents deployed with our method achieve performance similar to what they achieve in the source domain, while those deployed with previous methods designed for same-modal domain adaptation suffer a larger performance gap.

Domain Adaptation reinforcement-learning +1

Triple-GAIL: A Multi-Modal Imitation Learning Framework with Generative Adversarial Nets

no code implementations 19 May 2020 Cong Fei, Bin Wang, Yuzheng Zhuang, Zongzhang Zhang, Jianye Hao, Hongbo Zhang, Xuewu Ji, Wulong Liu

Generative adversarial imitation learning (GAIL) has shown promising results by taking advantage of generative adversarial nets, especially in the field of robot learning.

Autonomous Vehicles Data Augmentation +1

Efficient Deep Reinforcement Learning via Adaptive Policy Transfer

no code implementations 19 Feb 2020 Tianpei Yang, Jianye Hao, Zhaopeng Meng, Zongzhang Zhang, Yujing Hu, Yingfeng Cheng, Changjie Fan, Weixun Wang, Wulong Liu, Zhaodong Wang, Jiajie Peng

Transfer Learning (TL) has shown great potential to accelerate Reinforcement Learning (RL) by leveraging prior knowledge from past learned policies of relevant tasks.

reinforcement-learning Reinforcement Learning (RL) +1

Monte-Carlo Tree Search for Policy Optimization

no code implementations 23 Dec 2019 Xiaobai Ma, Katherine Driggs-Campbell, Zongzhang Zhang, Mykel J. Kochenderfer

Gradient-based methods are often used for policy optimization in deep reinforcement learning, despite being vulnerable to local optima and saddle points.

reinforcement-learning Reinforcement Learning (RL)

A Deep Bayesian Policy Reuse Approach Against Non-Stationary Agents

no code implementations NeurIPS 2018 Yan Zheng, Zhaopeng Meng, Jianye Hao, Zongzhang Zhang, Tianpei Yang, Changjie Fan

In multiagent domains, coping with non-stationary agents that change their behavior from time to time is a challenging problem: an agent is usually required to quickly detect the other agent's policy during online interaction and then adapt its own policy accordingly.

Hierarchical Deep Multiagent Reinforcement Learning with Temporal Abstraction

no code implementations 25 Sep 2018 Hongyao Tang, Jianye Hao, Tangjie Lv, Yingfeng Chen, Zongzhang Zhang, Hangtian Jia, Chunxu Ren, Yan Zheng, Zhaopeng Meng, Changjie Fan, Li Wang

Besides, we propose a new experience replay mechanism to alleviate the issue of the sparse transitions at the high level of abstraction and the non-stationarity of multiagent learning.

reinforcement-learning Reinforcement Learning (RL)
