Search Results for author: Jianye Hao

Found 91 papers, 17 papers with code

A Benchmark for Automatic Medical Consultation System: Frameworks, Tasks and Datasets

1 code implementation 19 Apr 2022 Wei Chen, Zhiwei Li, Hongyi Fang, Qianyuan Yao, Cheng Zhong, Jianye Hao, Qi Zhang, Xuanjing Huang, Jiajie Peng, Zhongyu Wei

In recent years, interest has arisen in using machine learning to improve the efficiency of automatic medical consultation and enhance patient experience.

Dialogue Act Classification Dialogue Understanding +2

PAnDR: Fast Adaptation to New Environments from Offline Experiences via Decoupling Policy and Environment Representations

no code implementations 6 Apr 2022 Tong Sang, Hongyao Tang, Yi Ma, Jianye Hao, Yan Zheng, Zhaopeng Meng, Boyan Li, Zhen Wang

In the online adaptation phase, with the environment context inferred from a few experiences collected in new environments, the policy is optimized by gradient ascent with respect to the PDVF.

Contrastive Learning Decision Making

Plan Your Target and Learn Your Skills: Transferable State-Only Imitation Learning via Decoupled Policy Optimization

no code implementations 4 Mar 2022 Minghuan Liu, Zhengbang Zhu, Yuzheng Zhuang, Weinan Zhang, Jianye Hao, Yong Yu, Jun Wang

Recent progress in state-only imitation learning extends the scope of applicability of imitation learning to real-world settings by relieving the need for observing expert actions.

Imitation Learning Transfer Learning

Generalizable Information Theoretic Causal Representation

no code implementations 17 Feb 2022 Mengyue Yang, Xinyu Cai, Furui Liu, Xu Chen, Zhitang Chen, Jianye Hao, Jun Wang

It is evident that representation learning can improve a model's performance over multiple downstream tasks in many real-world scenarios, such as image classification and recommender systems.

Image Classification Recommendation Systems +1

Revisiting QMIX: Discriminative Credit Assignment by Gradient Entropy Regularization

no code implementations 9 Feb 2022 Jian Zhao, Yue Zhang, Xunhan Hu, Weixun Wang, Wengang Zhou, Jianye Hao, Jiangcheng Zhu, Houqiang Li

In cooperative multi-agent systems, agents jointly take actions and receive a team reward instead of individual rewards.

Debiased Recommendation with User Feature Balancing

no code implementations 16 Jan 2022 Mengyue Yang, Guohao Cai, Furui Liu, Zhenhua Dong, Xiuqiang He, Jianye Hao, Jun Wang, Xu Chen

To alleviate these problems, in this paper, we propose a novel debiased recommendation framework based on user feature balancing.

Causal Inference Recommendation Systems

A Survey on Interpretable Reinforcement Learning

no code implementations 24 Dec 2021 Claire Glanois, Paul Weng, Matthieu Zimmer, Dong Li, Tianpei Yang, Jianye Hao, Wulong Liu

To that aim, we distinguish interpretability (as a property of a model) and explainability (as a post-hoc operation, with the intervention of a proxy) and discuss them in the context of RL with an emphasis on the former notion.

Autonomous Driving Decision Making +1

ED2: An Environment Dynamics Decomposition Framework for World Model Construction

1 code implementation 6 Dec 2021 Cong Wang, Tianpei Yang, Jianye Hao, Yan Zheng, Hongyao Tang, Fazl Barez, Jinyi Liu, Jiajie Peng, Haiyin Piao, Zhixiao Sun

To reduce the model error, previous works use a single well-designed network to fit the entire environment dynamics, which treats the environment dynamics as a black box.

Model-based Reinforcement Learning

A Hierarchical Reinforcement Learning Based Optimization Framework for Large-scale Dynamic Pickup and Delivery Problems

no code implementations NeurIPS 2021 Yi Ma, Xiaotian Hao, Jianye Hao, Jiawen Lu, Xing Liu, Tong Xialiang, Mingxuan Yuan, Zhigang Li, Jie Tang, Zhaopeng Meng

To address this problem, existing methods partition the overall DPDP into fixed-size sub-problems by caching online generated orders and solve each sub-problem, or, on this basis, use predicted future orders to optimize each sub-problem further.

Hierarchical Reinforcement Learning reinforcement-learning

Learning State Representations via Retracing in Reinforcement Learning

no code implementations ICLR 2022 Changmin Yu, Dong Li, Jianye Hao, Jun Wang, Neil Burgess

We propose learning via retracing, a novel self-supervised approach for learning the state representation (and the associated dynamics model) for reinforcement learning tasks.

Continuous Control Model-based Reinforcement Learning +2

Uncertainty-aware Low-Rank Q-Matrix Estimation for Deep Reinforcement Learning

no code implementations 19 Nov 2021 Tong Sang, Hongyao Tang, Jianye Hao, Yan Zheng, Zhaopeng Meng

Such a reconstruction exploits the underlying structure of value matrix to improve the value approximation, thus leading to a more efficient learning process of value function.

Continuous Control reinforcement-learning

Lifelong Reinforcement Learning with Temporal Logic Formulas and Reward Machines

no code implementations 18 Nov 2021 Xuejing Zheng, Chao Yu, Chen Chen, Jianye Hao, Hankz Hankui Zhuo

In this paper, we propose Lifelong reinforcement learning with Sequential linear temporal logic formulas and Reward Machines (LSRM), which enables an agent to leverage previously learned knowledge to speed up learning of logically specified tasks.

reinforcement-learning Transfer Learning

SEIHAI: A Sample-efficient Hierarchical AI for the MineRL Competition

no code implementations 17 Nov 2021 Hangyu Mao, Chao Wang, Xiaotian Hao, Yihuan Mao, Yiming Lu, Chengjie WU, Jianye Hao, Dong Li, Pingzhong Tang

The MineRL competition is designed for the development of reinforcement learning and imitation learning algorithms that can efficiently leverage human demonstrations to drastically reduce the number of environment interactions needed to solve the complex ObtainDiamond task with sparse rewards.

Imitation Learning reinforcement-learning

Dynamic Bottleneck for Robust Self-Supervised Exploration

1 code implementation NeurIPS 2021 Chenjia Bai, Lingxiao Wang, Lei Han, Animesh Garg, Jianye Hao, Peng Liu, Zhaoran Wang

Exploration methods based on pseudo-count of transitions or curiosity of dynamics have achieved promising results in solving reinforcement learning with sparse rewards.


Flattening Sharpness for Dynamic Gradient Projection Memory Benefits Continual Learning

1 code implementation NeurIPS 2021 Danruo Deng, Guangyong Chen, Jianye Hao, Qiong Wang, Pheng-Ann Heng

Backpropagation networks are notably susceptible to catastrophic forgetting, where networks tend to forget previously learned skills upon learning new ones.

Continual Learning

Ranking Cost: Building An Efficient and Scalable Circuit Routing Planner with Evolution-Based Optimization

1 code implementation 8 Oct 2021 Shiyu Huang, Bin Wang, Dong Li, Jianye Hao, Ting Chen, Jun Zhu

In this work, we propose a new algorithm for circuit routing, named Ranking Cost, which innovatively combines search-based methods (i.e., A* algorithm) and learning-based methods (i.e., Evolution Strategies) to form an efficient and trainable router.

Online Ad Hoc Teamwork under Partial Observability

no code implementations ICLR 2022 Pengjie Gu, Mengchen Zhao, Jianye Hao, Bo An

Autonomous agents often need to work together as a team to accomplish complex cooperative tasks.

OVD-Explorer: A General Information-theoretic Exploration Approach for Reinforcement Learning

no code implementations 29 Sep 2021 Jinyi Liu, Zhi Wang, Yan Zheng, Jianye Hao, Junjie Ye, Chenjia Bai, Pengyi Li

Many exploration strategies are built upon the optimism in the face of uncertainty (OFU) principle for reinforcement learning.


Learning Explicit Credit Assignment for Multi-agent Joint Q-learning

no code implementations 29 Sep 2021 Hangyu Mao, Jianye Hao, Dong Li, Jun Wang, Weixun Wang, Xiaotian Hao, Bin Wang, Kun Shao, Zhen Xiao, Wulong Liu

In contrast, we formulate an explicit credit assignment problem where each agent gives its suggestion about how to weight individual Q-values to explicitly maximize the joint Q-value, besides guaranteeing the Bellman optimality of the joint Q-value.


Informative Robust Causal Representation for Generalizable Deep Learning

no code implementations 29 Sep 2021 Mengyue Yang, Furui Liu, Xu Chen, Zhitang Chen, Jianye Hao, Jun Wang

In many real-world scenarios, such as image classification and recommender systems, it is evident that representation learning can improve a model's performance over multiple downstream tasks.

Image Classification Recommendation Systems +1

Learning Pseudometric-based Action Representations for Offline Reinforcement Learning

no code implementations 29 Sep 2021 Pengjie Gu, Mengchen Zhao, Chen Chen, Dong Li, Jianye Hao, Bo An

Offline reinforcement learning is a promising approach for practical applications since it does not require interactions with real-world environments.

Offline RL Recommendation Systems +2

Exploration in Deep Reinforcement Learning: A Comprehensive Survey

no code implementations 14 Sep 2021 Tianpei Yang, Hongyao Tang, Chenjia Bai, Jinyi Liu, Jianye Hao, Zhaopeng Meng, Peng Liu, Zhen Wang

In this paper, we conduct a comprehensive survey on existing exploration methods in DRL and deep MARL for the purpose of providing understandings and insights on the critical problems and solutions.

Autonomous Vehicles Efficient Exploration +2

CMML: Contextual Modulation Meta Learning for Cold-Start Recommendation

1 code implementation 24 Aug 2021 Xidong Feng, Chen Chen, Dong Li, Mengchen Zhao, Jianye Hao, Jun Wang

Meta learning, especially the gradient-based kind, can be adopted to tackle this problem by learning initial parameters of the model and thus allowing fast adaptation to a specific task from limited data examples.

Meta-Learning Recommendation Systems

Modeling Scale-free Graphs with Hyperbolic Geometry for Knowledge-aware Recommendation

no code implementations 14 Aug 2021 Yankai Chen, Menglin Yang, Yingxue Zhang, Mengchen Zhao, Ziqiao Meng, Jianye Hao, Irwin King

Aiming to alleviate data sparsity and cold-start problems of traditional recommender systems, incorporating knowledge graphs (KGs) to supplement auxiliary information has recently gained considerable attention.

Knowledge-Aware Recommendation Knowledge Graphs

Contrastive ACE: Domain Generalization Through Alignment of Causal Mechanisms

no code implementations 2 Jun 2021 Yunqi Wang, Furui Liu, Zhitang Chen, Qing Lian, Shoubo Hu, Jianye Hao, Yik-Chung Wu

Domain generalization aims to learn knowledge invariant across different distributions while semantically meaningful for downstream tasks from multiple source domains, to improve the model's generalization ability on unseen target domains.

Domain Generalization

Cooperative Multi-Agent Transfer Learning with Level-Adaptive Credit Assignment

no code implementations 1 Jun 2021 Tianze Zhou, Fubiao Zhang, Kun Shao, Kai Li, Wenhan Huang, Jun Luo, Weixun Wang, Yaodong Yang, Hangyu Mao, Bin Wang, Dong Li, Wulong Liu, Jianye Hao

In addition, we use a novel agent network named Population Invariant agent with Transformer (PIT) to realize the coordination transfer in more varieties of scenarios.

Multi-agent Reinforcement Learning Starcraft +2

Learning to Select Cuts for Efficient Mixed-Integer Programming

no code implementations 28 May 2021 Zeren Huang, Kerong Wang, Furui Liu, Hui-Ling Zhen, Weinan Zhang, Mingxuan Yuan, Jianye Hao, Yong Yu, Jun Wang

In the online A/B testing of the product planning problems with more than $10^7$ variables and constraints daily, Cut Ranking has achieved the average speedup ratio of 12.42% over the production solver without any accuracy loss of solution.

Multiple Instance Learning

Ordering-Based Causal Discovery with Reinforcement Learning

1 code implementation 14 May 2021 Xiaoqiang Wang, Yali Du, Shengyu Zhu, Liangjun Ke, Zhitang Chen, Jianye Hao, Jun Wang

It is a long-standing question to discover causal relations among a set of variables in many empirical sciences.

Causal Discovery reinforcement-learning +1

Principled Exploration via Optimistic Bootstrapping and Backward Induction

1 code implementation 13 May 2021 Chenjia Bai, Lingxiao Wang, Lei Han, Jianye Hao, Animesh Garg, Peng Liu, Zhaoran Wang

In this paper, we propose a principled exploration method for DRL through Optimistic Bootstrapping and Backward Induction (OB2I).

Efficient Exploration

An Adversarial Imitation Click Model for Information Retrieval

1 code implementation 13 Apr 2021 Xinyi Dai, Jianghao Lin, Weinan Zhang, Shuai Li, Weiwen Liu, Ruiming Tang, Xiuqiang He, Jianye Hao, Jun Wang, Yong Yu

Modern information retrieval systems, including web search, ads placement, and recommender systems, typically rely on learning from user feedback.

Imitation Learning Information Retrieval +1

Learning Symbolic Rules for Interpretable Deep Reinforcement Learning

no code implementations 15 Mar 2021 Zhihao Ma, Yuzheng Zhuang, Paul Weng, Hankz Hankui Zhuo, Dong Li, Wulong Liu, Jianye Hao

To address this challenge and improve the transparency, we propose a Neural Symbolic Reinforcement Learning framework by introducing symbolic logic into DRL.


Addressing Action Oscillations through Learning Policy Inertia

no code implementations 3 Mar 2021 Chen Chen, Hongyao Tang, Jianye Hao, Wulong Liu, Zhaopeng Meng

We propose Nested Policy Iteration as a general training algorithm for PIC-augmented policy which ensures monotonically non-decreasing updates under some mild conditions.

Atari Games Autonomous Driving +1

Approximating Pareto Frontier through Bayesian-optimization-directed Robust Multi-objective Reinforcement Learning

no code implementations 1 Jan 2021 Xiangkun He, Jianye Hao, Dong Li, Bin Wang, Wulong Liu

Thirdly, the agent’s learning process is regarded as a black-box, and the comprehensive metric we proposed is computed after each episode of training, then a Bayesian optimization (BO) algorithm is adopted to guide the agent to evolve towards improving the quality of the approximated Pareto frontier.


Ranking Cost: One-Stage Circuit Routing by Directly Optimizing Global Objective Function

no code implementations 1 Jan 2021 Shiyu Huang, Bin Wang, Dong Li, Jianye Hao, Jun Zhu, Ting Chen

In our method, we introduce a new set of variables called cost maps, which can help the A* router find proper paths to achieve the global objective.

Optimistic Exploration with Backward Bootstrapped Bonus for Deep Reinforcement Learning

no code implementations 1 Jan 2021 Chenjia Bai, Lingxiao Wang, Peng Liu, Zhaoran Wang, Jianye Hao, Yingnan Zhao

However, such an approach is challenging in developing practical exploration algorithms for Deep Reinforcement Learning (DRL).

Atari Games Efficient Exploration +2

Robust Memory Augmentation by Constrained Latent Imagination

no code implementations 1 Jan 2021 Yao Mu, Yuzheng Zhuang, Bin Wang, Wulong Liu, Shengbo Eben Li, Jianye Hao

The latent dynamics model summarizes an agent’s high dimensional experiences in a compact way.

Robust Multi-Agent Reinforcement Learning Driven by Correlated Equilibrium

no code implementations 1 Jan 2021 Yizheng Hu, Kun Shao, Dong Li, Jianye Hao, Wulong Liu, Yaodong Yang, Jun Wang, Zhanxing Zhu

Therefore, to achieve robust CMARL, we introduce novel strategies to encourage agents to learn correlated equilibrium while maximally preserving the convenience of the decentralized execution.

Adversarial Robustness reinforcement-learning +1

MQES: Max-Q Entropy Search for Efficient Exploration in Continuous Reinforcement Learning

no code implementations 1 Jan 2021 Jinyi Liu, Zhi Wang, Jianye Hao, Yan Zheng

Recently, the principle of optimism in the face of (aleatoric and epistemic) uncertainty has been utilized to design efficient exploration strategies for Reinforcement Learning (RL).

Efficient Exploration reinforcement-learning

What About Inputing Policy in Value Function: Policy Representation and Policy-extended Value Function Approximator

no code implementations NeurIPS 2021 Hongyao Tang, Zhaopeng Meng, Jianye Hao, Chen Chen, Daniel Graves, Dong Li, Changmin Yu, Hangyu Mao, Wulong Liu, Yaodong Yang, Wenyuan Tao, Li Wang

We study Policy-extended Value Function Approximator (PeVFA) in Reinforcement Learning (RL), which extends conventional value function approximator (VFA) to take as input not only the state (and action) but also an explicit policy representation.

Continuous Control Contrastive Learning +2

Event-Triggered Multi-agent Reinforcement Learning with Communication under Limited-bandwidth Constraint

no code implementations 10 Oct 2020 Guangzheng Hu, Yuanheng Zhu, Dongbin Zhao, Mengchen Zhao, Jianye Hao

Then the design of the event-triggered strategy is formulated as a constrained Markov decision problem, and reinforcement learning finds the best communication protocol that satisfies the limited bandwidth constraint.

Multiagent Systems

Transfer among Agents: An Efficient Multiagent Transfer Learning Framework

no code implementations 28 Sep 2020 Tianpei Yang, Jianye Hao, Weixun Wang, Hongyao Tang, Zhaopeng Meng, Hangyu Mao, Dong Li, Wulong Liu, Yujing Hu, Yingfeng Chen, Changjie Fan

In many cases, the agents' experiences are inconsistent with one another, which causes the option-value estimation to oscillate and become inaccurate.

Transfer Learning

Dynamic Horizon Value Estimation for Model-based Reinforcement Learning

no code implementations 21 Sep 2020 Jun-Jie Wang, Qichao Zhang, Dongbin Zhao, Mengchen Zhao, Jianye Hao

Existing model-based value expansion methods typically leverage a world model for value estimation with a fixed rollout horizon to assist policy learning.

Model-based Reinforcement Learning reinforcement-learning

Triple-GAIL: A Multi-Modal Imitation Learning Framework with Generative Adversarial Nets

no code implementations 19 May 2020 Cong Fei, Bin Wang, Yuzheng Zhuang, Zongzhang Zhang, Jianye Hao, Hongbo Zhang, Xuewu Ji, Wulong Liu

Generative adversarial imitation learning (GAIL) has shown promising results by taking advantage of generative adversarial nets, especially in the field of robot learning.

Autonomous Vehicles Data Augmentation +1

Continuous Multiagent Control using Collective Behavior Entropy for Large-Scale Home Energy Management

no code implementations 14 May 2020 Jianwen Sun, Yan Zheng, Jianye Hao, Zhaopeng Meng, Yang Liu

With the increasing popularity of electric vehicles, distributed energy generation and storage facilities in smart grid systems, efficient Demand-Side Management (DSM) is urgently needed for energy savings and peak load reduction.

Learning to Accelerate Heuristic Searching for Large-Scale Maximum Weighted b-Matching Problems in Online Advertising

no code implementations 9 May 2020 Xiaotian Hao, Junqi Jin, Jianye Hao, Jin Li, Weixun Wang, Yi Ma, Zhenzhe Zheng, Han Li, Jian Xu, Kun Gai

Bipartite b-matching is fundamental in algorithm design, and has been widely applied to economic markets, labor markets, etc.

CausalVAE: Disentangled Representation Learning via Neural Structural Causal Models

1 code implementation CVPR 2021 Mengyue Yang, Furui Liu, Zhitang Chen, Xinwei Shen, Jianye Hao, Jun Wang

Learning disentanglement aims at finding a low dimensional representation which consists of multiple explanatory and generative factors of the observational data.


Efficient Deep Reinforcement Learning via Adaptive Policy Transfer

no code implementations 19 Feb 2020 Tianpei Yang, Jianye Hao, Zhaopeng Meng, Zongzhang Zhang, Yujing Hu, Yingfeng Cheng, Changjie Fan, Weixun Wang, Wulong Liu, Zhaodong Wang, Jiajie Peng

Transfer Learning (TL) has shown great potential to accelerate Reinforcement Learning (RL) by leveraging prior knowledge from past learned policies of relevant tasks.

reinforcement-learning Transfer Learning

KoGuN: Accelerating Deep Reinforcement Learning via Integrating Human Suboptimal Knowledge

no code implementations 18 Feb 2020 Peng Zhang, Jianye Hao, Weixun Wang, Hongyao Tang, Yi Ma, Yihai Duan, Yan Zheng

Our framework consists of a fuzzy rule controller to represent human knowledge and a refine module to fine-tune suboptimal prior knowledge.

Common Sense Reasoning Continuous Control +1

Neighborhood Cognition Consistent Multi-Agent Reinforcement Learning

no code implementations 3 Dec 2019 Hangyu Mao, Wulong Liu, Jianye Hao, Jun Luo, Dong Li, Zhengchao Zhang, Jun Wang, Zhen Xiao

Social psychology and real experiences show that cognitive consistency plays an important role to keep human society in order: if people have a more consistent cognition about their environments, they are more likely to achieve better cooperation.

Multi-agent Reinforcement Learning Q-Learning +1

Multi-Agent Game Abstraction via Graph Attention Neural Network

no code implementations 25 Nov 2019 Yong Liu, Weixun Wang, Yujing Hu, Jianye Hao, Xingguo Chen, Yang Gao

Traditional methods attempt to use pre-defined rules to capture the interaction relationship between agents.

Graph Attention Multi-agent Reinforcement Learning

There is Limited Correlation between Coverage and Robustness for Deep Neural Networks

no code implementations 14 Nov 2019 Yizhen Dong, Peixin Zhang, Jingyi Wang, Shuang Liu, Jun Sun, Jianye Hao, Xinyu Wang, Li Wang, Jin Song Dong, Dai Ting

In this work, we conduct an empirical study to evaluate the relationship between coverage, robustness and attack/defense metrics for DNN.

Face Recognition Malware Detection

MGHRL: Meta Goal-generation for Hierarchical Reinforcement Learning

no code implementations 30 Sep 2019 Haotian Fu, Hongyao Tang, Jianye Hao, Wulong Liu, Chen Chen

Most meta reinforcement learning (meta-RL) methods learn to adapt to new tasks by directly optimizing the parameters of policies over primitive action space.

Hierarchical Reinforcement Learning Meta-Learning +2

Efficient meta reinforcement learning via meta goal generation

no code implementations 25 Sep 2019 Haotian Fu, Hongyao Tang, Jianye Hao

Meta reinforcement learning (meta-RL) is able to accelerate the acquisition of new tasks by learning from past experience.

Meta-Learning Meta Reinforcement Learning +1

From Few to More: Large-scale Dynamic Multiagent Curriculum Learning

no code implementations 6 Sep 2019 Weixun Wang, Tianpei Yang, Yong Liu, Jianye Hao, Xiaotian Hao, Yujing Hu, Yingfeng Chen, Changjie Fan, Yang Gao

In this paper, we design a novel Dynamic Multiagent Curriculum Learning (DyMA-CL) to solve large-scale problems by starting from learning on a multiagent scenario with a small size and progressively increasing the number of agents.

Spectral-based Graph Convolutional Network for Directed Graphs

no code implementations 21 Jul 2019 Yi Ma, Jianye Hao, Yaodong Yang, Han Li, Junqi Jin, Guangyong Chen

Our approach can work directly on directed graph data in semi-supervised node classification tasks.

Deep Multi-Agent Reinforcement Learning with Discrete-Continuous Hybrid Action Spaces

no code implementations 12 Mar 2019 Haotian Fu, Hongyao Tang, Jianye Hao, Zihan Lei, Yingfeng Chen, Changjie Fan

Deep Reinforcement Learning (DRL) has been applied to address a variety of cooperative multi-agent problems with either discrete action spaces or continuous action spaces.

Multi-agent Reinforcement Learning Q-Learning +1

A Deep Bayesian Policy Reuse Approach Against Non-Stationary Agents

no code implementations NeurIPS 2018 Yan Zheng, Zhaopeng Meng, Jianye Hao, Zongzhang Zhang, Tianpei Yang, Changjie Fan

In multiagent domains, coping with non-stationary agents that change behaviors from time to time is a challenging problem, where an agent is usually required to be able to quickly detect the other agent's policy during online interaction, and then adapt its own policy accordingly.

Hierarchical Deep Multiagent Reinforcement Learning with Temporal Abstraction

no code implementations 25 Sep 2018 Hongyao Tang, Jianye Hao, Tangjie Lv, Yingfeng Chen, Zongzhang Zhang, Hangtian Jia, Chunxu Ren, Yan Zheng, Zhaopeng Meng, Changjie Fan, Li Wang

Besides, we propose a new experience replay mechanism to alleviate the issue of the sparse transitions at the high level of abstraction and the non-stationarity of multiagent learning.


SCC-rFMQ Learning in Cooperative Markov Games with Continuous Actions

no code implementations 18 Sep 2018 Chengwei Zhang, Xiaohong Li, Jianye Hao, Siqi Chen, Karl Tuyls, Zhiyong Feng, Wanli Xue, Rong Chen

Although many reinforcement learning methods have been proposed for learning the optimal solutions in single-agent continuous-action domains, multiagent coordination domains with continuous actions have received relatively little investigation.


Towards Efficient Detection and Optimal Response against Sophisticated Opponents

no code implementations 12 Sep 2018 Tianpei Yang, Zhaopeng Meng, Jianye Hao, Chongjie Zhang, Yan Zheng, Ze Zheng

This paper proposes a novel approach called Bayes-ToMoP which can efficiently detect the strategy of opponents using either stationary or higher-level reasoning strategies.

Multiagent Systems

Learning Adaptive Display Exposure for Real-Time Advertising

no code implementations 10 Sep 2018 Weixun Wang, Junqi Jin, Jianye Hao, Chunjie Chen, Chuan Yu, Wei-Nan Zhang, Jun Wang, Xiaotian Hao, Yixi Wang, Han Li, Jian Xu, Kun Gai

In this paper, we investigate the problem of advertising with adaptive exposure: can we dynamically determine the number and positions of ads for each user visit under certain business constraints so that the platform revenue can be increased?

An Optimal Rewiring Strategy for Reinforcement Social Learning in Cooperative Multiagent Systems

no code implementations 13 May 2018 Hongyao Tang, Li Wang, Zan Wang, Tim Baarslag, Jianye Hao

Multiagent coordination in cooperative multiagent systems (MASs) has been widely studied in both the fixed-agent repeated interaction setting and the static social learning framework.

Falsification of Cyber-Physical Systems Using Deep Reinforcement Learning

no code implementations 1 May 2018 Takumi Akazaki, Shuang Liu, Yoriyuki Yamagata, Yihai Duan, Jianye Hao

With the rapid development of software and distributed computing, Cyber-Physical Systems (CPS) are widely adopted in many application areas, e.g., smart grid, autonomous automobile.

Distributed Computing reinforcement-learning

SA-IGA: A Multiagent Reinforcement Learning Method Towards Socially Optimal Outcomes

no code implementations 8 Mar 2018 Chengwei Zhang, Xiaohong Li, Jianye Hao, Siqi Chen, Karl Tuyls, Wanli Xue

In multiagent environments, the capability of learning is important for an agent to behave appropriately in the face of unknown opponents and dynamic environments.

Q-Learning reinforcement-learning

Blind Image Denoising via Dependent Dirichlet Process Tree

no code implementations 13 Jan 2016 Fengyuan Zhu, Guangyong Chen, Jianye Hao, Pheng-Ann Heng

This paper addresses this problem and proposes a novel blind image denoising algorithm to recover the clean image from a noisy one with an unknown noise model.

Image Denoising Variational Inference
