1 code implementation • 19 Apr 2022 • Wei Chen, Zhiwei Li, Hongyi Fang, Qianyuan Yao, Cheng Zhong, Jianye Hao, Qi Zhang, Xuanjing Huang, Jiajie Peng, Zhongyu Wei
In recent years, interest has arisen in using machine learning to improve the efficiency of automatic medical consultation and enhance patient experience.
no code implementations • 6 Apr 2022 • Tong Sang, Hongyao Tang, Yi Ma, Jianye Hao, Yan Zheng, Zhaopeng Meng, Boyan Li, Zhen Wang
In the online adaptation phase, with the environment context inferred from a few experiences collected in new environments, the policy is optimized by gradient ascent with respect to the PDVF.
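A minimal sketch of this adaptation step, assuming a pre-trained context-conditioned value network `pdvf` and a low-dimensional policy parameter vector (both hypothetical names and sizes, not the paper's implementation): the policy representation is simply pushed uphill on the predicted return.

```python
import torch

# Hypothetical pre-trained PDVF: maps (environment context, policy representation) -> return.
pdvf = torch.nn.Sequential(
    torch.nn.Linear(8 + 16, 64), torch.nn.ReLU(), torch.nn.Linear(64, 1)
)

context = torch.randn(8)                               # inferred from a few new-environment rollouts
policy_params = torch.randn(16, requires_grad=True)    # compact policy representation
optimizer = torch.optim.Adam([policy_params], lr=1e-2)

for _ in range(100):                                   # online adaptation phase
    predicted_return = pdvf(torch.cat([context, policy_params])).sum()
    optimizer.zero_grad()
    (-predicted_return).backward()                     # gradient *ascent* on the PDVF output
    optimizer.step()
```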
no code implementations • 24 Mar 2022 • Bowen Wang, Guibao Shen, Dong Li, Jianye Hao, Wulong Liu, Yu Huang, HongZhong Wu, Yibo Lin, Guangyong Chen, Pheng Ann Heng
Precise congestion prediction from a placement solution plays a crucial role in circuit placement.
no code implementations • 16 Mar 2022 • Pengyi Li, Hongyao Tang, Tianpei Yang, Xiaotian Hao, Tong Sang, Yan Zheng, Jianye Hao, Matthew E. Taylor, Zhen Wang
Learning to collaborate is critical in multi-agent reinforcement learning (MARL).
1 code implementation • 16 Mar 2022 • Jian Zhao, Youpeng Zhao, Weixun Wang, Mingyu Yang, Xunhan Hu, Wengang Zhou, Jianye Hao, Houqiang Li
To the best of our knowledge, this work is the first to study unexpected crashes in multi-agent systems.
Multi-agent Reinforcement Learning • Reinforcement Learning • +2
1 code implementation • 10 Mar 2022 • Xiaotian Hao, Weixun Wang, Hangyu Mao, Yaodong Yang, Dong Li, Yan Zheng, Zhen Wang, Jianye Hao
Multi-agent reinforcement learning suffers from poor sample efficiency due to the exponential growth of the state-action space.
no code implementations • 4 Mar 2022 • Minghuan Liu, Zhengbang Zhu, Yuzheng Zhuang, Weinan Zhang, Jianye Hao, Yong Yu, Jun Wang
Recent progress in state-only imitation learning extends the scope of applicability of imitation learning to real-world settings by relieving the need for observing expert actions.
no code implementations • 17 Feb 2022 • Mengyue Yang, Xinyu Cai, Furui Liu, Xu Chen, Zhitang Chen, Jianye Hao, Jun Wang
There is evidence that representation learning can improve a model's performance on multiple downstream tasks in many real-world scenarios, such as image classification and recommender systems.
no code implementations • 9 Feb 2022 • Jian Zhao, Yue Zhang, Xunhan Hu, Weixun Wang, Wengang Zhou, Jianye Hao, Jiangcheng Zhu, Houqiang Li
In cooperative multi-agent systems, agents jointly take actions and receive a team reward instead of individual rewards.
no code implementations • 19 Jan 2022 • Jianye Hao, Jiawen Lu, Xijun Li, Xialiang Tong, Xiang Xiang, Mingxuan Yuan, Hankz Hankui Zhuo
The Dynamic Pickup and Delivery Problem (DPDP) is an essential problem within the logistics domain.
no code implementations • 16 Jan 2022 • Mengyue Yang, Guohao Cai, Furui Liu, Zhenhua Dong, Xiuqiang He, Jianye Hao, Jun Wang, Xu Chen
To alleviate these problems, in this paper, we propose a novel debiased recommendation framework based on user feature balancing.
no code implementations • 24 Dec 2021 • Claire Glanois, Paul Weng, Matthieu Zimmer, Dong Li, Tianpei Yang, Jianye Hao, Wulong Liu
To that aim, we distinguish interpretability (as a property of a model) and explainability (as a post-hoc operation, with the intervention of a proxy) and discuss them in the context of RL with an emphasis on the former notion.
1 code implementation • 6 Dec 2021 • Cong Wang, Tianpei Yang, Jianye Hao, Yan Zheng, Hongyao Tang, Fazl Barez, Jinyi Liu, Jiajie Peng, Haiyin Piao, Zhixiao Sun
To reduce the model error, previous works use a single well-designed network to fit the entire environment dynamics, which treats the environment dynamics as a black box.
no code implementations • NeurIPS 2021 • Yi Ma, Xiaotian Hao, Jianye Hao, Jiawen Lu, Xing Liu, Tong Xialiang, Mingxuan Yuan, Zhigang Li, Jie Tang, Zhaopeng Meng
To address this problem, existing methods partition the overall DPDP into fixed-size sub-problems by caching online generated orders and solving each sub-problem, or, on this basis, further utilize predicted future orders to optimize each sub-problem.
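A toy illustration of the fixed-size partitioning idea described above (function and field names are placeholders, and the solver is a stub rather than any real DPDP solver): incoming orders are cached and a sub-problem is dispatched whenever the cache reaches a fixed size.

```python
from typing import Callable, List

def run_fixed_size_partition(order_stream, batch_size: int,
                             solve_subproblem: Callable[[List[dict]], None]) -> None:
    """Cache online-generated orders and solve each fixed-size sub-problem."""
    cache: List[dict] = []
    for order in order_stream:          # orders arrive online
        cache.append(order)
        if len(cache) == batch_size:    # a fixed-size sub-problem is ready
            solve_subproblem(cache)     # e.g. route a fleet for these orders
            cache = []
    if cache:                           # flush the remaining orders
        solve_subproblem(cache)

# Usage with a stub solver:
orders = ({"id": i, "pickup": (i, 0), "delivery": (0, i)} for i in range(10))
run_fixed_size_partition(orders, batch_size=4,
                         solve_subproblem=lambda batch: print(len(batch), "orders"))
```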
1 code implementation • NeurIPS 2021 • Chenyang Wu, Guoyu Yang, Zongzhang Zhang, Yang Yu, Dong Li, Wulong Liu, Jianye Hao
A belief is a distribution over states that represents state uncertainty.
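For a small discrete POMDP, such a belief can be maintained with the standard Bayes-filter update; the snippet below is a generic textbook illustration (tiny made-up transition and observation matrices), not this paper's algorithm.

```python
import numpy as np

def belief_update(belief, T, O, action, obs):
    """One Bayes-filter step: b'(s') ∝ O[s', obs] * sum_s T[s, action, s'] * b(s)."""
    predicted = belief @ T[:, action, :]     # predict the next-state distribution
    updated = predicted * O[:, obs]          # weight by the observation likelihood
    return updated / updated.sum()           # renormalise

# Tiny 2-state, 2-action, 2-observation example.
T = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.5, 0.5]]])     # T[s, a, s']
O = np.array([[0.8, 0.2],
              [0.3, 0.7]])                   # O[s', o]
b = np.array([0.5, 0.5])
print(belief_update(b, T, O, action=0, obs=1))
```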
no code implementations • NeurIPS 2021 • Yao Mu, Yuzheng Zhuang, Bin Wang, Guangxiang Zhu, Wulong Liu, Jianyu Chen, Ping Luo, Shengbo Li, Chongjie Zhang, Jianye Hao
Model-based reinforcement learning aims to improve the sample efficiency of policy learning by modeling the dynamics of the environment.
no code implementations • ICLR 2022 • Changmin Yu, Dong Li, Jianye Hao, Jun Wang, Neil Burgess
We propose learning via retracing, a novel self-supervised approach for learning the state representation (and the associated dynamics model) for reinforcement learning tasks.
no code implementations • 19 Nov 2021 • Tong Sang, Hongyao Tang, Jianye Hao, Yan Zheng, Zhaopeng Meng
Such a reconstruction exploits the underlying structure of the value matrix to improve the value approximation, thus leading to a more efficient learning process for the value function.
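As a rough illustration of exploiting low-rank structure in a value matrix (purely synthetic numbers; the paper's actual decomposition may differ), a noisy tabular value matrix can be denoised by a truncated SVD:

```python
import numpy as np

rng = np.random.default_rng(0)
true_V = rng.normal(size=(50, 2)) @ rng.normal(size=(2, 20))   # rank-2 ground-truth values
noisy_V = true_V + 0.1 * rng.normal(size=true_V.shape)         # noisy value estimates

U, s, Vt = np.linalg.svd(noisy_V, full_matrices=False)
k = 2                                                          # keep only the dominant structure
reconstructed_V = U[:, :k] * s[:k] @ Vt[:k, :]

print("error before:", np.abs(noisy_V - true_V).mean())
print("error after :", np.abs(reconstructed_V - true_V).mean())
```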
no code implementations • 18 Nov 2021 • Xuejing Zheng, Chao Yu, Chen Chen, Jianye Hao, Hankz Hankui Zhuo
In this paper, we propose Lifelong reinforcement learning with Sequential linear temporal logic formulas and Reward Machines (LSRM), which enables an agent to leverage previously learned knowledge to accelerate the learning of logically specified tasks.
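A reward machine of the kind referred to here is essentially a finite-state automaton over high-level events that emits rewards; a minimal generic encoding (not the LSRM code, and with an invented example task) might look like this:

```python
class RewardMachine:
    """Finite-state machine: transitions on high-level events, emitting rewards."""
    def __init__(self, transitions, initial_state, terminal_states):
        self.transitions = transitions          # (state, event) -> (next_state, reward)
        self.state = initial_state
        self.terminal_states = terminal_states

    def step(self, event):
        next_state, reward = self.transitions.get((self.state, event), (self.state, 0.0))
        self.state = next_state
        return reward, self.state in self.terminal_states

# Task "get key, then open door": u0 --key--> u1 --door--> u2 (reward 1).
rm = RewardMachine({("u0", "key"): ("u1", 0.0), ("u1", "door"): ("u2", 1.0)},
                   initial_state="u0", terminal_states={"u2"})
for event in ["door", "key", "door"]:
    print(event, rm.step(event))
```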
no code implementations • 17 Nov 2021 • Hangyu Mao, Chao Wang, Xiaotian Hao, Yihuan Mao, Yiming Lu, Chengjie Wu, Jianye Hao, Dong Li, Pingzhong Tang
The MineRL competition is designed for the development of reinforcement learning and imitation learning algorithms that can efficiently leverage human demonstrations to drastically reduce the number of environment interactions needed to solve the complex \emph{ObtainDiamond} task with sparse rewards.
1 code implementation • NeurIPS 2021 • Chenjia Bai, Lingxiao Wang, Lei Han, Animesh Garg, Jianye Hao, Peng Liu, Zhaoran Wang
Exploration methods based on pseudo-counts of transitions or curiosity about dynamics have achieved promising results in solving reinforcement learning problems with sparse rewards.
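The count-based flavour of such exploration bonuses can be sketched generically (a simplified tabular stand-in for pseudo-counts, with an arbitrary bonus scale, not the estimator used in this paper):

```python
from collections import defaultdict
import math

class CountBonus:
    """Intrinsic reward beta / sqrt(N(s)) based on state visit counts."""
    def __init__(self, beta=0.1):
        self.counts = defaultdict(int)
        self.beta = beta

    def bonus(self, state):
        self.counts[state] += 1
        return self.beta / math.sqrt(self.counts[state])

explorer = CountBonus(beta=0.1)
for s in ["s0", "s0", "s1", "s0"]:
    print(s, round(explorer.bonus(s), 4))   # rarely visited states get larger bonuses
```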
1 code implementation • NeurIPS 2021 • Danruo Deng, Guangyong Chen, Jianye Hao, Qiong Wang, Pheng-Ann Heng
Backpropagation networks are notably susceptible to catastrophic forgetting, whereby they tend to forget previously learned skills upon learning new ones.
1 code implementation • 8 Oct 2021 • Shiyu Huang, Bin Wang, Dong Li, Jianye Hao, Ting Chen, Jun Zhu
In this work, we propose a new algorithm for circuit routing, named Ranking Cost, which innovatively combines search-based methods (i.e., the A* algorithm) and learning-based methods (i.e., Evolution Strategies) to form an efficient and trainable router.
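A heavily simplified sketch of this search/learning split, under assumed names: Evolution Strategies perturb a cost map, and a scorer evaluates each candidate map. Here the scorer is a trivial placeholder (the cost of one fixed path); in the actual setting it would be an A*-based router run on the cost map. Only the interface is meant to be representative.

```python
import numpy as np

def route_cost(cost_map):
    """Placeholder scorer: the cost of one fixed L-shaped path across the grid.
    In practice this would be replaced by an A*-based router using the cost map."""
    return cost_map[0, :].sum() + cost_map[:, -1].sum()

def es_optimize(cost_map, iters=200, pop=20, sigma=0.1, lr=0.05):
    """Vanilla Evolution Strategies update of the cost map (maximising -route_cost)."""
    rng = np.random.default_rng(0)
    for _ in range(iters):
        noise = rng.normal(size=(pop,) + cost_map.shape)
        fitness = np.array([-route_cost(cost_map + sigma * n) for n in noise])
        fitness = (fitness - fitness.mean()) / (fitness.std() + 1e-8)
        grad = (fitness[:, None, None] * noise).mean(axis=0) / sigma
        cost_map = np.clip(cost_map + lr * grad, 0.0, None)   # keep costs non-negative
    return cost_map

grid = np.ones((8, 8))
optimized = es_optimize(grid)
print("path cost before:", route_cost(grid), "after:", route_cost(optimized))
```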
no code implementations • ICLR 2022 • Pengjie Gu, Mengchen Zhao, Jianye Hao, Bo An
Autonomous agents often need to work together as a team to accomplish complex cooperative tasks.
no code implementations • 29 Sep 2021 • Jinyi Liu, Zhi Wang, Yan Zheng, Jianye Hao, Junjie Ye, Chenjia Bai, Pengyi Li
Many exploration strategies are built upon the optimism in the face of uncertainty (OFU) principle for reinforcement learning.
no code implementations • 29 Sep 2021 • Hangyu Mao, Jianye Hao, Dong Li, Jun Wang, Weixun Wang, Xiaotian Hao, Bin Wang, Kun Shao, Zhen Xiao, Wulong Liu
In contrast, we formulate an \emph{explicit} credit assignment problem where each agent gives its suggestion about how to weight individual Q-values to explicitly maximize the joint Q-value, while also guaranteeing the Bellman optimality of the joint Q-value.
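A bare-bones version of weighting individual Q-values into a joint value (an illustrative mixer with invented layer sizes and a softmax weighting; not the architecture proposed in the paper):

```python
import torch
import torch.nn as nn

class WeightedQMixer(nn.Module):
    """Each agent's Q-value is weighted by a state-conditioned, non-negative weight."""
    def __init__(self, n_agents, state_dim):
        super().__init__()
        self.weight_net = nn.Sequential(nn.Linear(state_dim, 32), nn.ReLU(),
                                        nn.Linear(32, n_agents))

    def forward(self, agent_qs, state):
        weights = torch.softmax(self.weight_net(state), dim=-1)  # explicit per-agent weights
        return (weights * agent_qs).sum(dim=-1, keepdim=True)    # joint Q-value

mixer = WeightedQMixer(n_agents=3, state_dim=10)
agent_qs = torch.randn(4, 3)         # batch of per-agent Q-values
state = torch.randn(4, 10)
print(mixer(agent_qs, state).shape)  # torch.Size([4, 1])
```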
no code implementations • 29 Sep 2021 • Mengyue Yang, Furui Liu, Xu Chen, Zhitang Chen, Jianye Hao, Jun Wang
In many real-world scenarios, such as image classification and recommender systems, there is evidence that representation learning can improve a model's performance on multiple downstream tasks.
no code implementations • 29 Sep 2021 • Pengjie Gu, Mengchen Zhao, Chen Chen, Dong Li, Jianye Hao, Bo An
Offline reinforcement learning is a promising approach for practical applications since it does not require interactions with real-world environments.
no code implementations • NeurIPS 2021 • Minghuan Liu, Zhengbang Zhu, Yuzheng Zhuang, Weinan Zhang, Jian Shen, Jianye Hao, Yong Yu, Jun Wang
State-only imitation learning (SOIL) enables agents to learn from massive demonstrations without explicit action or reward information.
no code implementations • 14 Sep 2021 • Tianpei Yang, Hongyao Tang, Chenjia Bai, Jinyi Liu, Jianye Hao, Zhaopeng Meng, Peng Liu, Zhen Wang
In this paper, we conduct a comprehensive survey of existing exploration methods in DRL and deep MARL in order to provide an understanding of, and insights into, the critical problems and their solutions.
no code implementations • ICLR 2022 • Boyan Li, Hongyao Tang, Yan Zheng, Jianye Hao, Pengyi Li, Zhen Wang, Zhaopeng Meng, Li Wang
Discrete-continuous hybrid action space is a natural setting in many practical problems, such as robot control and game AI.
1 code implementation • 24 Aug 2021 • Xidong Feng, Chen Chen, Dong Li, Mengchen Zhao, Jianye Hao, Jun Wang
Meta-learning, especially the gradient-based variety, can be adopted to tackle this problem by learning initial model parameters that allow fast adaptation to a specific task from limited data examples.
no code implementations • 14 Aug 2021 • Yankai Chen, Menglin Yang, Yingxue Zhang, Mengchen Zhao, Ziqiao Meng, Jianye Hao, Irwin King
Aiming to alleviate data sparsity and cold-start problems of traditional recommender systems, incorporating knowledge graphs (KGs) to supplement auxiliary information has recently gained considerable attention.
no code implementations • Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval 2021 • Jianghao Lin, Weiwen Liu, Xinyi Dai, Weinan Zhang, Shuai Li, Ruiming Tang, Xiuqiang He, Jianye Hao, Yong Yu
To better exploit search logs and model users' behavior patterns, numerous click models are proposed to extract users' implicit interaction feedback.
no code implementations • 2 Jun 2021 • Yunqi Wang, Furui Liu, Zhitang Chen, Qing Lian, Shoubo Hu, Jianye Hao, Yik-Chung Wu
Domain generalization aims to learn, from multiple source domains, knowledge that is invariant across different distributions yet semantically meaningful for downstream tasks, in order to improve the model's generalization ability on unseen target domains.
no code implementations • 1 Jun 2021 • Tianze Zhou, Fubiao Zhang, Kun Shao, Kai Li, Wenhan Huang, Jun Luo, Weixun Wang, Yaodong Yang, Hangyu Mao, Bin Wang, Dong Li, Wulong Liu, Jianye Hao
In addition, we use a novel agent network named Population Invariant agent with Transformer (PIT) to realize coordination transfer across a wider variety of scenarios.
no code implementations • 28 May 2021 • Zeren Huang, Kerong Wang, Furui Liu, Hui-Ling Zhen, Weinan Zhang, Mingxuan Yuan, Jianye Hao, Yong Yu, Jun Wang
In the online A/B testing of the product planning problems with more than $10^7$ variables and constraints daily, Cut Ranking achieved an average speedup of 12.42% over the production solver without any loss of solution accuracy.
1 code implementation • 14 May 2021 • Xiaoqiang Wang, Yali Du, Shengyu Zhu, Liangjun Ke, Zhitang Chen, Jianye Hao, Jun Wang
It is a long-standing question to discover causal relations among a set of variables in many empirical sciences.
1 code implementation • 13 May 2021 • Chenjia Bai, Lingxiao Wang, Lei Han, Jianye Hao, Animesh Garg, Peng Liu, Zhaoran Wang
In this paper, we propose a principled exploration method for DRL through Optimistic Bootstrapping and Backward Induction (OB2I).
1 code implementation • 13 Apr 2021 • Xinyi Dai, Jianghao Lin, Weinan Zhang, Shuai Li, Weiwen Liu, Ruiming Tang, Xiuqiang He, Jianye Hao, Jun Wang, Yong Yu
Modern information retrieval systems, including web search, ads placement, and recommender systems, typically rely on learning from user feedback.
no code implementations • 15 Mar 2021 • Zhihao Ma, Yuzheng Zhuang, Paul Weng, Hankz Hankui Zhuo, Dong Li, Wulong Liu, Jianye Hao
To address this challenge and improve the transparency, we propose a Neural Symbolic Reinforcement Learning framework by introducing symbolic logic into DRL.
no code implementations • ICLR Workshop SSL-RL 2021 • Changmin Yu, Dong Li, Hangyu Mao, Jianye Hao, Neil Burgess
Representation learning is a popular approach for reinforcement learning (RL) tasks with partially observable Markov decision processes.
no code implementations • 3 Mar 2021 • Chen Chen, Hongyao Tang, Jianye Hao, Wulong Liu, Zhaopeng Meng
We propose Nested Policy Iteration as a general training algorithm for PIC-augmented policy which ensures monotonically non-decreasing updates under some mild conditions.
1 code implementation • 3 Mar 2021 • Hongyao Tang, Jianye Hao, Guangyong Chen, Pengfei Chen, Chen Chen, Yaodong Yang, Luo Zhang, Wulong Liu, Zhaopeng Meng
The value function is the central notion of Reinforcement Learning (RL).
no code implementations • 1 Jan 2021 • Xiangkun He, Jianye Hao, Dong Li, Bin Wang, Wulong Liu
Thirdly, the agent's learning process is regarded as a black box; the comprehensive metric we propose is computed after each training episode, and a Bayesian optimization (BO) algorithm is then adopted to guide the agent towards improving the quality of the approximated Pareto frontier.
no code implementations • 1 Jan 2021 • Shiyu Huang, Bin Wang, Dong Li, Jianye Hao, Jun Zhu, Ting Chen
In our method, we introduce a new set of variables called cost maps, which help the A* router find proper paths that achieve the global objective.
no code implementations • 1 Jan 2021 • Peng Zhang, Furui Liu, Zhitang Chen, Jianye Hao, Jun Wang
Reinforcement Learning (RL) has shown great potential to deal with sequential decision-making problems.
no code implementations • 1 Jan 2021 • Chenjia Bai, Lingxiao Wang, Peng Liu, Zhaoran Wang, Jianye Hao, Yingnan Zhao
However, such an approach is challenging in developing practical exploration algorithms for Deep Reinforcement Learning (DRL).
no code implementations • 1 Jan 2021 • Yao Mu, Yuzheng Zhuang, Bin Wang, Wulong Liu, Shengbo Eben Li, Jianye Hao
The latent dynamics model summarizes an agent's high-dimensional experiences in a compact way.
no code implementations • 1 Jan 2021 • Yizheng Hu, Kun Shao, Dong Li, Jianye Hao, Wulong Liu, Yaodong Yang, Jun Wang, Zhanxing Zhu
Therefore, to achieve robust CMARL, we introduce novel strategies to encourage agents to learn correlated equilibrium while maximally preserving the convenience of the decentralized execution.
no code implementations • 1 Jan 2021 • Jinyi Liu, Zhi Wang, Jianye Hao, Yan Zheng
Recently, the principle of optimism in the face of (aleatoric and epistemic) uncertainty has been utilized to design efficient exploration strategies for Reinforcement Learning (RL).
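The OFU principle is most easily seen in the classic UCB rule for bandits, which adds an uncertainty bonus to each value estimate; the snippet below is a textbook illustration, not the exploration strategy proposed here.

```python
import math
import random

def ucb_choose(values, counts, t, c=2.0):
    """Pick the arm with the highest optimistic estimate: mean + c * sqrt(ln t / n)."""
    return max(range(len(values)),
               key=lambda a: values[a] + c * math.sqrt(math.log(t) / counts[a])
               if counts[a] > 0 else float("inf"))

true_means = [0.2, 0.5, 0.8]
values, counts = [0.0] * 3, [0] * 3
for t in range(1, 1001):
    a = ucb_choose(values, counts, t)
    reward = float(random.random() < true_means[a])
    counts[a] += 1
    values[a] += (reward - values[a]) / counts[a]   # incremental mean update
print(counts)   # most pulls should go to the best arm
```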
no code implementations • 1 Jan 2021 • Zhihao Ma, Yuzheng Zhuang, Paul Weng, Dong Li, Kun Shao, Wulong Liu, Hankz Hankui Zhuo, Jianye Hao
Recent progress in deep reinforcement learning (DRL) can be largely attributed to the use of neural networks.
Hierarchical Reinforcement Learning • Reinforcement Learning • +1
no code implementations • 13 Nov 2020 • Jiajun Fan, He Ba, Xian Guo, Jianye Hao
Extensive experiments demonstrate that Critic PI2 achieved a new state of the art in a range of challenging continuous domains.
no code implementations • NeurIPS 2020 • Yujing Hu, Weixun Wang, Hangtian Jia, Yixiang Wang, Yingfeng Chen, Jianye Hao, Feng Wu, Changjie Fan
In this paper, we consider the problem of adaptively utilizing a given shaping reward function.
3 code implementations • 19 Oct 2020 • Ming Zhou, Jun Luo, Julian Villella, Yaodong Yang, David Rusu, Jiayu Miao, Weinan Zhang, Montgomery Alban, Iman Fadakar, Zheng Chen, Aurora Chongxi Huang, Ying Wen, Kimia Hassanzadeh, Daniel Graves, Dong Chen, Zhengbang Zhu, Nhat Nguyen, Mohamed Elsayed, Kun Shao, Sanjeevan Ahilan, Baokuan Zhang, Jiannan Wu, Zhengang Fu, Kasra Rezaee, Peyman Yadmellat, Mohsen Rohani, Nicolas Perez Nieves, Yihan Ni, Seyedershad Banijamali, Alexander Cowen Rivers, Zheng Tian, Daniel Palenicek, Haitham Bou Ammar, Hongbo Zhang, Wulong Liu, Jianye Hao, Jun Wang
We open-source the SMARTS platform and the associated benchmark tasks and evaluation metrics to encourage and empower research on multi-agent learning for autonomous driving.
no code implementations • NeurIPS 2021 • Hongyao Tang, Zhaopeng Meng, Jianye Hao, Chen Chen, Daniel Graves, Dong Li, Changmin Yu, Hangyu Mao, Wulong Liu, Yaodong Yang, Wenyuan Tao, Li Wang
We study Policy-extended Value Function Approximator (PeVFA) in Reinforcement Learning (RL), which extends conventional value function approximator (VFA) to take as input not only the state (and action) but also an explicit policy representation.
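The core idea, a value function that also conditions on a representation of the policy, can be sketched as follows; the policy embedding here is just the flattened parameters of a tiny policy network, an assumption made for illustration rather than the representation used in the paper.

```python
import torch
import torch.nn as nn

policy = nn.Linear(4, 2)                                              # a tiny policy network
policy_repr = torch.cat([p.flatten() for p in policy.parameters()])   # naive policy embedding

class PolicyExtendedValue(nn.Module):
    """V(s, pi): takes the state and an explicit policy representation as input."""
    def __init__(self, state_dim, policy_repr_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim + policy_repr_dim, 64),
                                 nn.ReLU(), nn.Linear(64, 1))

    def forward(self, state, policy_repr):
        return self.net(torch.cat([state, policy_repr], dim=-1))

v = PolicyExtendedValue(state_dim=4, policy_repr_dim=policy_repr.numel())
state = torch.randn(4)
print(v(state, policy_repr).shape)   # torch.Size([1])
```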
no code implementations • 10 Oct 2020 • Guangzheng Hu, Yuanheng Zhu, Dongbin Zhao, Mengchen Zhao, Jianye Hao
Then the design of the event-triggered strategy is formulated as a constrained Markov decision problem, and reinforcement learning finds the best communication protocol that satisfies the limited bandwidth constraint.
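A crude hand-written event-triggered rule of the sort this formulation optimizes (the threshold and budget are made-up illustration values, not the learned communication protocol):

```python
import numpy as np

def should_communicate(obs, last_sent, budget_left, threshold=0.5):
    """Trigger a message only if the observation changed enough and budget remains."""
    changed_enough = np.linalg.norm(obs - last_sent) > threshold
    return changed_enough and budget_left > 0

rng = np.random.default_rng(0)
last_sent, budget = np.zeros(3), 5
for step in range(20):
    obs = rng.normal(scale=0.4, size=3)
    if should_communicate(obs, last_sent, budget):
        last_sent, budget = obs, budget - 1     # send the observation and spend bandwidth
        print(f"step {step}: sent, budget now {budget}")
```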
Multiagent Systems
no code implementations • 29 Sep 2020 • Haotian Fu, Hongyao Tang, Jianye Hao, Chen Chen, Xidong Feng, Dong Li, Wulong Liu
How can we collect informative trajectories whose corresponding context reflects the task specification?
no code implementations • 28 Sep 2020 • Hongyao Tang, Zhaopeng Meng, Jianye Hao, Chen Chen, Daniel Graves, Dong Li, Wulong Liu, Yaodong Yang
The value function lies at the heart of Reinforcement Learning (RL), defining the long-term evaluation of a policy in a given state.
no code implementations • 28 Sep 2020 • Tianpei Yang, Jianye Hao, Weixun Wang, Hongyao Tang, Zhaopeng Meng, Hangyu Mao, Dong Li, Wulong Liu, Yujing Hu, Yingfeng Chen, Changjie Fan
In many cases, the agents' experiences are inconsistent with one another, which causes the option-value estimates to oscillate and become inaccurate.
no code implementations • 21 Sep 2020 • Jun-Jie Wang, Qichao Zhang, Dongbin Zhao, Mengchen Zhao, Jianye Hao
Existing model-based value expansion methods typically leverage a world model for value estimation with a fixed rollout horizon to assist policy learning.
no code implementations • ICML 2020 • Xiaotian Hao, Zhaoqing Peng, Yi Ma, Guan Wang, Junqi Jin, Jianye Hao, Shan Chen, Rongquan Bai, Mingzhou Xie, Miao Xu, Zhenzhe Zheng, Chuan Yu, Han Li, Jian Xu, Kun Gai
In E-commerce, advertising is essential for merchants to reach their target users.
no code implementations • 19 May 2020 • Cong Fei, Bin Wang, Yuzheng Zhuang, Zongzhang Zhang, Jianye Hao, Hongbo Zhang, Xuewu Ji, Wulong Liu
Generative adversarial imitation learning (GAIL) has shown promising results by taking advantage of generative adversarial nets, especially in the field of robot learning.
no code implementations • 14 May 2020 • Jianwen Sun, Yan Zheng, Jianye Hao, Zhaopeng Meng, Yang Liu
With the increasing popularity of electric vehicles and of distributed energy generation and storage facilities in smart grid systems, efficient Demand-Side Management (DSM) is urgently needed for energy savings and peak-load reduction.
no code implementations • 9 May 2020 • Xiaotian Hao, Junqi Jin, Jianye Hao, Jin Li, Weixun Wang, Yi Ma, Zhenzhe Zheng, Han Li, Jian Xu, Kun Gai
Bipartite b-matching is fundamental in algorithm design, and has been widely applied to economic markets, labor markets, and other settings.
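For intuition, a greedy heuristic for weighted bipartite b-matching (each vertex may be matched up to its capacity b) fits in a few lines; this is a generic baseline with a toy market example, not the algorithm proposed in the paper.

```python
from collections import defaultdict

def greedy_b_matching(edges, capacity):
    """edges: list of (u, v, weight); capacity: dict vertex -> b. Greedy by weight."""
    degree = defaultdict(int)
    matching = []
    for u, v, w in sorted(edges, key=lambda e: -e[2]):   # heaviest edges first
        if degree[u] < capacity[u] and degree[v] < capacity[v]:
            matching.append((u, v, w))
            degree[u] += 1
            degree[v] += 1
    return matching

edges = [("buyer1", "item1", 5.0), ("buyer1", "item2", 3.0),
         ("buyer2", "item1", 4.0), ("buyer2", "item2", 2.0)]
capacity = {"buyer1": 1, "buyer2": 1, "item1": 1, "item2": 2}
print(greedy_b_matching(edges, capacity))   # [('buyer1', 'item1', 5.0), ('buyer2', 'item2', 2.0)]
```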
1 code implementation • CVPR 2021 • Mengyue Yang, Furui Liu, Zhitang Chen, Xinwei Shen, Jianye Hao, Jun Wang
Disentanglement learning aims at finding a low-dimensional representation that consists of multiple explanatory and generative factors of the observational data.
no code implementations • 19 Feb 2020 • Tianpei Yang, Jianye Hao, Zhaopeng Meng, Zongzhang Zhang, Yujing Hu, Yingfeng Cheng, Changjie Fan, Weixun Wang, Wulong Liu, Zhaodong Wang, Jiajie Peng
Transfer Learning (TL) has shown great potential to accelerate Reinforcement Learning (RL) by leveraging prior knowledge from past learned policies of relevant tasks.
no code implementations • 18 Feb 2020 • Peng Zhang, Jianye Hao, Weixun Wang, Hongyao Tang, Yi Ma, Yihai Duan, Yan Zheng
Our framework consists of a fuzzy rule controller to represent human knowledge and a refine module to fine-tune suboptimal prior knowledge.
no code implementations • 3 Dec 2019 • Hangyu Mao, Wulong Liu, Jianye Hao, Jun Luo, Dong Li, Zhengchao Zhang, Jun Wang, Zhen Xiao
Social psychology and real experience show that cognitive consistency plays an important role in keeping human society in order: if people have a more consistent cognition of their environments, they are more likely to achieve better cooperation.
no code implementations • 25 Nov 2019 • Yong Liu, Weixun Wang, Yujing Hu, Jianye Hao, Xingguo Chen, Yang Gao
Traditional methods attempt to use pre-defined rules to capture the interaction relationship between agents.
no code implementations • 14 Nov 2019 • Yizhen Dong, Peixin Zhang, Jingyi Wang, Shuang Liu, Jun Sun, Jianye Hao, Xinyu Wang, Li Wang, Jin Song Dong, Dai Ting
In this work, we conduct an empirical study to evaluate the relationship between coverage, robustness and attack/defense metrics for DNN.
no code implementations • 30 Sep 2019 • Haotian Fu, Hongyao Tang, Jianye Hao, Wulong Liu, Chen Chen
Most meta reinforcement learning (meta-RL) methods learn to adapt to new tasks by directly optimizing the parameters of policies over primitive action space.
no code implementations • 25 Sep 2019 • Haotian Fu, Hongyao Tang, Jianye Hao
Meta reinforcement learning (meta-RL) is able to accelerate the acquisition of new tasks by learning from past experience.
no code implementations • 6 Sep 2019 • Weixun Wang, Tianpei Yang, Yong Liu, Jianye Hao, Xiaotian Hao, Yujing Hu, Yingfeng Chen, Changjie Fan, Yang Gao
In this paper, we design a novel Dynamic Multiagent Curriculum Learning (DyMA-CL) approach to solve large-scale problems by starting to learn on a small-size multiagent scenario and progressively increasing the number of agents.
1 code implementation • 5 Sep 2019 • Yu Chen, Yingfeng Chen, Zhipeng Hu, Tianpei Yang, Changjie Fan, Yang Yu, Jianye Hao
Transfer learning (TL) is a promising way to improve the sample efficiency of reinforcement learning.
1 code implementation • ICLR 2020 • Weixun Wang, Tianpei Yang, Yong Liu, Jianye Hao, Xiaotian Hao, Yujing Hu, Yingfeng Chen, Changjie Fan, Yang Gao
ASN characterizes different actions' influence on other agents using neural networks based on the action semantics between them.
no code implementations • 21 Jul 2019 • Yi Ma, Jianye Hao, Yaodong Yang, Han Li, Junqi Jin, Guangyong Chen
Our approach can work directly on directed graph data in semi-supervised node classification tasks.
no code implementations • 27 May 2019 • Hongyao Tang, Jianye Hao, Guangyong Chen, Pengfei Chen, Zhaopeng Meng, Yaodong Yang, Li Wang
Value functions are crucial for model-free Reinforcement Learning (RL) to obtain a policy implicitly or guide the policy updates.
no code implementations • 12 Mar 2019 • Haotian Fu, Hongyao Tang, Jianye Hao, Zihan Lei, Yingfeng Chen, Changjie Fan
Deep Reinforcement Learning (DRL) has been applied to address a variety of cooperative multi-agent problems with either discrete action spaces or continuous action spaces.
no code implementations • NeurIPS 2018 • Yan Zheng, Zhaopeng Meng, Jianye Hao, Zongzhang Zhang, Tianpei Yang, Changjie Fan
In multiagent domains, coping with non-stationary agents that change behaviors from time to time is a challenging problem, where an agent is usually required to be able to quickly detect the other agent's policy during online interaction, and then adapt its own policy accordingly.
no code implementations • 25 Sep 2018 • Hongyao Tang, Jianye Hao, Tangjie Lv, Yingfeng Chen, Zongzhang Zhang, Hangtian Jia, Chunxu Ren, Yan Zheng, Zhaopeng Meng, Changjie Fan, Li Wang
Besides, we propose a new experience replay mechanism to alleviate the issue of the sparse transitions at the high level of abstraction and the non-stationarity of multiagent learning.
no code implementations • 18 Sep 2018 • Chengwei Zhang, Xiaohong Li, Jianye Hao, Siqi Chen, Karl Tuyls, Zhiyong Feng, Wanli Xue, Rong Chen
Although many reinforcement learning methods have been proposed for learning the optimal solutions in single-agent continuous-action domains, multiagent coordination domains with continuous actions have received relatively little investigation.
no code implementations • 12 Sep 2018 • Tianpei Yang, Zhaopeng Meng, Jianye Hao, Chongjie Zhang, Yan Zheng, Ze Zheng
This paper proposes a novel approach called Bayes-ToMoP which can efficiently detect the strategy of opponents using either stationary or higher-level reasoning strategies.
Multiagent Systems
no code implementations • 10 Sep 2018 • Weixun Wang, Junqi Jin, Jianye Hao, Chunjie Chen, Chuan Yu, Wei-Nan Zhang, Jun Wang, Xiaotian Hao, Yixi Wang, Han Li, Jian Xu, Kun Gai
In this paper, we investigate the problem of advertising with adaptive exposure: can we dynamically determine the number and positions of ads for each user visit under certain business constraints so that the platform revenue can be increased?
no code implementations • 13 May 2018 • Hongyao Tang, Li Wang, Zan Wang, Tim Baarslag, Jianye Hao
Multiagent coordination in cooperative multiagent systems (MASs) has been widely studied in both the fixed-agent repeated-interaction setting and the static social learning framework.
no code implementations • 1 May 2018 • Takumi Akazaki, Shuang Liu, Yoriyuki Yamagata, Yihai Duan, Jianye Hao
With the rapid development of software and distributed computing, Cyber-Physical Systems (CPS) are widely adopted in many application areas, e.g., smart grids and autonomous automobiles.
no code implementations • 8 Mar 2018 • Chengwei Zhang, Xiaohong Li, Jianye Hao, Siqi Chen, Karl Tuyls, Wanli Xue
In multiagent environments, the capability to learn is important for an agent to behave appropriately in the face of unknown opponents and dynamic environments.
no code implementations • 1 Mar 2018 • Weixun Wang, Jianye Hao, Yixi Wang, Matthew Taylor
We introduce a Sequential Prisoner's Dilemma (SPD) game to better capture the aforementioned characteristics.
no code implementations • 23 Feb 2018 • Yan Zheng, Jianye Hao, Zongzhang Zhang
Recently, multiagent deep reinforcement learning (DRL) has received increasingly wide attention.
no code implementations • 13 Jan 2016 • Fengyuan Zhu, Guangyong Chen, Jianye Hao, Pheng-Ann Heng
This paper addresses this problem and proposes a novel blind image denoising algorithm to recover the clean image from a noisy one with an unknown noise model.