1 code implementation • 22 Aug 2023 • Zhihai Wang, Lei Chen, Jie Wang, Xing Li, Yinqi Bai, Xijun Li, Mingxuan Yuan, Jianye Hao, Yongdong Zhang, Feng Wu
In particular, we notice that the runtime of the Resub and Mfs2 operators often dominates the overall runtime of LS optimization processes.
no code implementations • 7 Aug 2023 • Taichi Liu, Chen Gao, Zhenyu Wang, Dong Li, Jianye Hao, Depeng Jin, Yong Li
Graph Neural Network (GNN)-based models have become the mainstream approach for recommender systems.
1 code implementation • 1 Aug 2023 • Junyi Wang, Yuanyang Zhu, Zhi Wang, Yan Zheng, Jianye Hao, Chunlin Chen
Evolutionary reinforcement learning (ERL) algorithms have recently attracted attention for tackling complex reinforcement learning (RL) problems thanks to their high parallelism, yet they are prone to insufficient exploration or model collapse without careful tuning of hyperparameters (a.k.a. meta-parameters).
no code implementations • 14 Jul 2023 • Fei Zhang, Yunjie Ye, Lei Feng, Zhongwen Rao, Jieming Zhu, Marcus Kalander, Chen Gong, Jianye Hao, Bo Han
In this setting, an oracle annotates the query samples with partial labels, relaxing the oracle from the demanding accurate labeling process.
no code implementations • 3 Jul 2023 • Yueen Ma, Dafeng Chi, Jingjing Li, Yuzheng Zhuang, Jianye Hao, Irwin King
Previous question-answer pair generation methods aim to produce fluent and meaningful question-answer pairs but tend to have poor diversity.
no code implementations • 27 Jun 2023 • Jinyi Liu, Yi Ma, Jianye Hao, Yujing Hu, Yan Zheng, Tangjie Lv, Changjie Fan
In summary, our research emphasizes the significance of trajectory-based data sampling techniques in enhancing the efficiency and performance of offline RL algorithms.
no code implementations • 26 Jun 2023 • Yao Lai, Jinxin Liu, Zhentao Tang, Bin Wang, Jianye Hao, Ping Luo
To resolve these challenges, we cast the chip placement as an offline RL formulation and present ChiPFormer that enables learning a transferable placement policy from fixed offline data.
no code implementations • 14 Jun 2023 • Xuechen Mu, Hankz Hankui Zhuo, Chen Chen, Kai Zhang, Chao Yu, Jianye Hao
Exploring sparse reward multi-agent reinforcement learning (MARL) environments with traps in a collaborative manner is a complex task.
no code implementations • 12 Jun 2023 • Kai Zhao, Yi Ma, Jianye Hao, Jinyi Liu, Yan Zheng, Zhaopeng Meng
Offline reinforcement learning (RL) is a learning paradigm where an agent learns from a fixed dataset of experience.
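To make the setting concrete, here is a minimal illustrative sketch (not this paper's method): tabular Q-learning trained purely from a fixed dataset of transitions, with no further environment interaction. The toy environment, dataset, and hyperparameters are invented for this example.

```python
# Minimal offline RL sketch: tabular Q-learning on a fixed dataset of
# (state, action, reward, next_state) transitions. Illustrative only --
# the environment and hyperparameters are made up for this example.
from collections import defaultdict

def offline_q_learning(dataset, n_actions, gamma=0.9, alpha=0.5, epochs=50):
    q = defaultdict(float)  # q[(state, action)] -> value estimate
    for _ in range(epochs):
        for s, a, r, s_next in dataset:  # replay the fixed dataset; no new interaction
            target = r + gamma * max(q[(s_next, b)] for b in range(n_actions))
            q[(s, a)] += alpha * (target - q[(s, a)])
    return q

# Two-state chain: action 1 in state 0 reaches state 1 with reward 1;
# state 1 is absorbing with zero reward.
dataset = [(0, 0, 0.0, 0), (0, 1, 1.0, 1), (1, 0, 0.0, 1), (1, 1, 0.0, 1)]
q = offline_q_learning(dataset, n_actions=2)
policy = max(range(2), key=lambda a: q[(0, a)])
print(policy)  # action 1 is preferred in state 0
```

The key property of the paradigm is visible in the loop: the agent only ever sees transitions that are already in the dataset.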
no code implementations • 31 May 2023 • Fei Ni, Jianye Hao, Yao Mu, Yifu Yuan, Yan Zheng, Bin Wang, Zhixuan Liang
Recently, the diffusion model has emerged as a promising backbone for the sequence-modeling paradigm in offline reinforcement learning (RL).
no code implementations • 11 May 2023 • Yinchuan Li, Shuang Luo, Yunfeng Shao, Jianye Hao
We propose the GFlowNets with Human Feedback (GFlowHF) framework to improve the exploration ability when training AI models.
no code implementations • 9 May 2023 • Jiajun Fan, Yuzheng Zhuang, Yuecheng Liu, Jianye Hao, Bin Wang, Jiangcheng Zhu, Hao Wang, Shu-Tao Xia
The exploration problem is one of the main challenges in deep reinforcement learning (RL).
no code implementations • 8 May 2023 • Didi Zhu, Yinchuan Li, Yunfeng Shao, Jianye Hao, Fei Wu, Kun Kuang, Jun Xiao, Chao Wu
We introduce a new problem in unsupervised domain adaptation, termed Generalized Universal Domain Adaptation (GUDA), which aims to achieve precise prediction of all target labels, including unknown categories.
no code implementations • 2 May 2023 • Yuening Wang, Yingxue Zhang, Antonios Valkanas, Ruiming Tang, Chen Ma, Jianye Hao, Mark Coates
In contrast, for users who have static preferences, model performance can benefit greatly from preserving as much of the user's long-term preferences as possible.
no code implementations • 24 Apr 2023 • Yinchuan Li, Zhigang Li, Wenqian Li, Yunfeng Shao, Yan Zheng, Jianye Hao
Many score-based active learning methods have been successfully applied to graph-structured data, aiming to reduce the number of labels and achieve better performance of graph neural networks based on predefined score functions.
no code implementations • 12 Apr 2023 • Haozhi Wang, Yinchuan Li, Qing Wang, Yunfeng Shao, Jianye Hao
We then define an adjacency space for mismatched states and design a plug-and-play module for value iteration, which enables agents to infer more precise returns.
no code implementations • 12 Mar 2023 • Hao Chen, Jiaze Wang, Kun Shao, Furui Liu, Jianye Hao, Chenyong Guan, Guangyong Chen, Pheng-Ann Heng
Specifically, our Traj-MAE employs diverse masking strategies to pre-train the trajectory encoder and map encoder, allowing for the capture of social and temporal information among agents while leveraging the effect of environment from multiple granularities.
1 code implementation • 9 Mar 2023 • Qizhou Wang, Junjie Ye, Feng Liu, Quanyu Dai, Marcus Kalander, Tongliang Liu, Jianye Hao, Bo Han
This leads to a min-max learning scheme: searching for synthesized OOD data that yields the worst judgments, and learning from such OOD data to achieve uniform performance in OOD detection.
Out-of-Distribution Detection
no code implementations • 6 Mar 2023 • Bowen Wang, Chen Liang, Jiaze Wang, Furui Liu, Shaogang Hao, Dong Li, Jianye Hao, Guangyong Chen, Xiaolong Zou, Pheng-Ann Heng
Conversely, the model reconstructs a more robust equilibrium-state prediction by transforming edge-level predictions to node-level ones with a sphere-fitting algorithm.
Initial Structure to Relaxed Energy (IS2RE), Direct
Property Prediction
no code implementations • 4 Mar 2023 • Yinchuan Li, Shuang Luo, Haozhi Wang, Jianye Hao
Generative flow networks (GFlowNets), as an emerging technique, can be used as an alternative to reinforcement learning for exploratory control tasks.
1 code implementation • 4 Mar 2023 • Wenqian Li, Yinchuan Li, Zhigang Li, Jianye Hao, Yan Pang
Uncovering rationales behind predictions of graph neural networks (GNNs) has received increasing attention over the years.
no code implementations • 2 Mar 2023 • Hongyao Tang, Min Zhang, Jianye Hao
On typical MuJoCo and DeepMind Control Suite (DMC) benchmarks, we find common phenomena for TD3 and RAD agents: 1) the activity of policy network parameters is highly asymmetric and policy networks advance monotonically along very few major parameter directions; 2) severe detours occur in parameter update and harmonic-like changes are observed for all minor parameter directions.
no code implementations • 6 Feb 2023 • Amur Ghose, Yingxue Zhang, Jianye Hao, Mark Coates
Contrastive learning has emerged as a premier method for learning representations with or without supervision.
no code implementations • 30 Jan 2023 • Junlong Lyu, Zhitang Chen, Wenlong Lyu, Jianye Hao
We propose a new technique to accelerate sampling methods for solving difficult optimization problems.
1 code implementation • 20 Jan 2023 • Zifan Wu, Chao Yu, Chen Chen, Jianye Hao, Hankz Hankui Zhuo
In Model-based Reinforcement Learning (MBRL), model learning is critical since an inaccurate model can bias policy learning via generating misleading samples.
no code implementations • CVPR 2023 • Mingyang Sun, Mengchen Zhao, Yaqing Hou, Minglei Li, Huang Xu, Songcen Xu, Jianye Hao
There is a growing demand for automatically synthesizing co-speech gestures for virtual characters.
1 code implementation • 30 Dec 2022 • Hangyu Mao, Rui Zhao, Hao Chen, Jianye Hao, Yiqun Chen, Dong Li, Junge Zhang, Zhen Xiao
Recent methods combine the Transformer with these modules for better performance.
no code implementations • 29 Dec 2022 • Mehrtash Mehrabi, Walid Masoudimansour, Yingxue Zhang, Jie Chuai, Zhitang Chen, Mark Coates, Jianye Hao, Yanhui Geng
This performance relies heavily on the configuration of the network parameters.
no code implementations • 18 Dec 2022 • Minghuan Liu, Zhengbang Zhu, Menghui Zhu, Yuzheng Zhuang, Weinan Zhang, Jianye Hao
In reinforcement learning applications like robotics, agents usually need to deal with various input/output features when specified with different state/action spaces by their developers or physical restrictions.
no code implementations • 28 Nov 2022 • Chen Chen, Hongyao Tang, Yi Ma, Chao Wang, Qianli Shen, Dong Li, Jianye Hao
The key idea of SA-PP is leveraging discounted stationary state distribution ratios between the learning policy and the offline dataset to modulate the degree of behavior regularization in a state-wise manner, so that pessimism can be implemented in a more appropriate way.
no code implementations • 23 Nov 2022 • Junjie Wang, Yao Mu, Dong Li, Qichao Zhang, Dongbin Zhao, Yuzheng Zhuang, Ping Luo, Bin Wang, Jianye Hao
The latent world model provides a promising way to learn policies in a compact latent space for tasks with high-dimensional observations; however, its generalization across diverse environments with unseen dynamics remains challenging.
Model-based Reinforcement Learning
reinforcement-learning
no code implementations • 7 Nov 2022 • Zhengbang Zhu, Shenyu Zhang, Yuzheng Zhuang, Yuecheng Liu, Minghuan Liu, Liyuan Mao, Ziqin Gong, Weinan Zhang, Shixiong Kai, Qiang Gu, Bin Wang, Siyuan Cheng, Xinyu Wang, Jianye Hao, Yong Yu
High-quality traffic flow generation is the core module in building simulators for autonomous driving.
1 code implementation • 26 Oct 2022 • Jianye Hao, Pengyi Li, Hongyao Tang, Yan Zheng, Xian Fu, Zhaopeng Meng
The state representation conveys expressive common features of the environment learned by all the agents collectively; the linear policy representation provides a favorable space for efficient policy optimization, where novel behavior-level crossover and mutation operations can be performed.
no code implementations • 17 Oct 2022 • Yiqun Chen, Hangyu Mao, Tianle Zhang, Shiguang Wu, Bin Zhang, Jianye Hao, Dong Li, Bin Wang, Hongxing Chang
Centralized Training with Decentralized Execution (CTDE) has been a very popular paradigm for multi-agent reinforcement learning.
1 code implementation • 15 Oct 2022 • Wenqian Li, Yinchuan Li, Shengyu Zhu, Yunfeng Shao, Jianye Hao, Yan Pang
Causal discovery aims to uncover causal structure among a set of variables.
1 code implementation • 9 Oct 2022 • Yao Mu, Yuzheng Zhuang, Fei Ni, Bin Wang, Jianyu Chen, Jianye Hao, Ping Luo
This paper addresses such a challenge by Decomposed Mutual INformation Optimization (DOMINO) for context learning, which explicitly learns a disentangled context to maximize the mutual information between the context and historical trajectories, while minimizing the state transition prediction error.
no code implementations • 2 Oct 2022 • Yifu Yuan, Jianye Hao, Fei Ni, Yao Mu, Yan Zheng, Yujing Hu, Jinyi Liu, Yingfeng Chen, Changjie Fan
Unsupervised reinforcement learning (URL) poses a promising paradigm to learn useful behaviors in a task-agnostic environment without the guidance of extrinsic rewards to facilitate the fast adaptation of various downstream tasks.
no code implementations • 21 Sep 2022 • Haozhi Wang, Qing Wang, Yunfeng Shao, Dong Li, Jianye Hao, Yinchuan Li
Modern meta-reinforcement learning (Meta-RL) methods are mainly developed based on model-agnostic meta-learning, which performs policy gradient steps across tasks to maximize policy performance.
1 code implementation • 18 Sep 2022 • GuanYu Lin, Chen Gao, Yinfeng Li, Yu Zheng, Zhiheng Li, Depeng Jin, Dong Li, Jianye Hao, Yong Li
Such user-centric recommendation makes it impossible for the provider to expose new items, as it fails to consider the mutual interactions between the user and item dimensions.
no code implementations • 16 Sep 2022 • Min Zhang, Hongyao Tang, Jianye Hao, Yan Zheng
First, we propose a unified policy abstraction theory, containing three types of policy abstraction associated with policy features at different levels.
no code implementations • 26 Jul 2022 • Zeren Huang, WenHao Chen, Weinan Zhang, Chuhan Shi, Furui Liu, Hui-Ling Zhen, Mingxuan Yuan, Jianye Hao, Yong Yu, Jun Wang
Deriving a good variable selection strategy in branch-and-bound is essential for the efficiency of modern mixed-integer programming (MIP) solvers.
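As background on what a hand-crafted branching rule looks like, here is a minimal sketch of the classic most-fractional rule, which picks the integer variable whose LP-relaxation value is farthest from an integer. Purely illustrative; this is a standard baseline, not the strategy proposed in this work.

```python
# Classic "most fractional" branching rule for branch-and-bound:
# branch on the integer variable whose LP value is closest to x.5.
# Toy values only; learned strategies aim to outperform rules like this.
def most_fractional(lp_values, integer_vars):
    """Return the index of the integer variable to branch on, or None."""
    def frac_score(i):
        f = lp_values[i] - int(lp_values[i])  # fractional part (values assumed >= 0)
        return min(f, 1.0 - f)                # 0.5 is the most fractional
    candidates = [i for i in integer_vars if frac_score(i) > 1e-6]
    if not candidates:
        return None  # LP solution is already integral: no branching needed
    return max(candidates, key=frac_score)

lp_values = [0.0, 2.5, 3.9, 1.0]
print(most_fractional(lp_values, integer_vars=[0, 1, 2, 3]))  # 1
```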
1 code implementation • Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval 2021 • Jianghao Lin, Weiwen Liu, Xinyi Dai, Weinan Zhang, Shuai Li, Ruiming Tang, Xiuqiang He, Jianye Hao, Yong Yu
To better exploit search logs and model users' behavior patterns, numerous click models are proposed to extract users' implicit interaction feedback.
no code implementations • 27 May 2022 • Yushi Cao, Zhiming Li, Tianpei Yang, Hao Zhang, Yan Zheng, Yi Li, Jianye Hao, Yang Liu
In this paper, we combine the above two paradigms together and propose a novel Generalizable Logic Synthesis (GALOIS) framework to synthesize hierarchical and strict cause-effect logic programs.
no code implementations • 27 May 2022 • Wei Qiu, Weixun Wang, Rundong Wang, Bo An, Yujing Hu, Svetlana Obraztsova, Zinovi Rabinovich, Jianye Hao, Yingfeng Chen, Changjie Fan
During execution durations, the environment changes are influenced by, but not synchronised with, action execution.
Multi-agent Reinforcement Learning
reinforcement-learning
1 code implementation • 19 Apr 2022 • Wei Chen, Zhiwei Li, Hongyi Fang, Qianyuan Yao, Cheng Zhong, Jianye Hao, Qi Zhang, Xuanjing Huang, Jiajie Peng, Zhongyu Wei
In recent years, interest has arisen in using machine learning to improve the efficiency of automatic medical consultation and enhance patient experience.
no code implementations • 6 Apr 2022 • Tong Sang, Hongyao Tang, Yi Ma, Jianye Hao, Yan Zheng, Zhaopeng Meng, Boyan Li, Zhen Wang
In the online adaptation phase, with the environment context inferred from a few experiences collected in new environments, the policy is optimized by gradient ascent with respect to the PDVF.
no code implementations • 24 Mar 2022 • Bowen Wang, Guibao Shen, Dong Li, Jianye Hao, Wulong Liu, Yu Huang, HongZhong Wu, Yibo Lin, Guangyong Chen, Pheng Ann Heng
Precise congestion prediction from a placement solution plays a crucial role in circuit placement.
1 code implementation • 16 Mar 2022 • Jian Zhao, Youpeng Zhao, Weixun Wang, Mingyu Yang, Xunhan Hu, Wengang Zhou, Jianye Hao, Houqiang Li
To the best of our knowledge, this work is the first to study the unexpected crashes in the multi-agent system.
Multi-agent Reinforcement Learning
reinforcement-learning
1 code implementation • 16 Mar 2022 • Pengyi Li, Hongyao Tang, Tianpei Yang, Xiaotian Hao, Tong Sang, Yan Zheng, Jianye Hao, Matthew E. Taylor, Wenyuan Tao, Zhen Wang, Fazl Barez
However, we reveal sub-optimal collaborative behaviors also emerge with strong correlations, and simply maximizing the MI can, surprisingly, hinder the learning towards better collaboration.
Multi-agent Reinforcement Learning
reinforcement-learning
no code implementations • 10 Mar 2022 • Xiaotian Hao, Hangyu Mao, Weixun Wang, Yaodong Yang, Dong Li, Yan Zheng, Zhen Wang, Jianye Hao
To break this curse, we propose a unified agent permutation framework that exploits the permutation invariance (PI) and permutation equivariance (PE) inductive biases to reduce the multiagent state space.
2 code implementations • 4 Mar 2022 • Minghuan Liu, Zhengbang Zhu, Yuzheng Zhuang, Weinan Zhang, Jianye Hao, Yong Yu, Jun Wang
Recent progress in state-only imitation learning extends the scope of applicability of imitation learning to real-world settings by relieving the need for observing expert actions.
no code implementations • 17 Feb 2022 • Mengyue Yang, Xinyu Cai, Furui Liu, Xu Chen, Zhitang Chen, Jianye Hao, Jun Wang
There is evidence that representation learning can improve a model's performance on multiple downstream tasks in many real-world scenarios, such as image classification and recommender systems.
no code implementations • 9 Feb 2022 • Jian Zhao, Yue Zhang, Xunhan Hu, Weixun Wang, Wengang Zhou, Jianye Hao, Jiangcheng Zhu, Houqiang Li
In cooperative multi-agent systems, agents jointly take actions and receive a team reward instead of individual rewards.
no code implementations • 19 Jan 2022 • Jianye Hao, Jiawen Lu, Xijun Li, Xialiang Tong, Xiang Xiang, Mingxuan Yuan, Hankz Hankui Zhuo
The Dynamic Pickup and Delivery Problem (DPDP) is an essential problem within the logistics domain.
no code implementations • 16 Jan 2022 • Mengyue Yang, Guohao Cai, Furui Liu, Zhenhua Dong, Xiuqiang He, Jianye Hao, Jun Wang, Xu Chen
To alleviate these problems, in this paper, we propose a novel debiased recommendation framework based on user feature balancing.
no code implementations • 24 Dec 2021 • Claire Glanois, Paul Weng, Matthieu Zimmer, Dong Li, Tianpei Yang, Jianye Hao, Wulong Liu
To that aim, we distinguish interpretability (as a property of a model) and explainability (as a post-hoc operation, with the intervention of a proxy) and discuss them in the context of RL with an emphasis on the former notion.
1 code implementation • 6 Dec 2021 • Cong Wang, Tianpei Yang, Jianye Hao, Yan Zheng, Hongyao Tang, Fazl Barez, Jinyi Liu, Jiajie Peng, Haiyin Piao, Zhixiao Sun
To reduce the model error, previous works use a single well-designed network to fit the entire environment dynamics, which treats the environment dynamics as a black box.
1 code implementation • NeurIPS 2021 • Chenyang Wu, Guoyu Yang, Zongzhang Zhang, Yang Yu, Dong Li, Wulong Liu, Jianye Hao
A belief is a distribution of states representing state uncertainty.
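A minimal sketch of how such a belief can be maintained with a discrete Bayes filter (toy numbers, purely illustrative of the concept; not this paper's construction):

```python
# Illustrative discrete Bayes filter: a belief is a probability distribution
# over hidden states, updated after each observation. All numbers are toys.
def belief_update(belief, obs, obs_model, trans_model):
    # Predict step: push the belief through the transition model.
    predicted = {s2: sum(belief[s1] * trans_model[s1][s2] for s1 in belief)
                 for s2 in belief}
    # Correct step: reweight by the observation likelihood and normalize.
    unnorm = {s: predicted[s] * obs_model[s][obs] for s in predicted}
    z = sum(unnorm.values())
    return {s: p / z for s, p in unnorm.items()}

belief = {"A": 0.5, "B": 0.5}
trans = {"A": {"A": 0.9, "B": 0.1}, "B": {"A": 0.1, "B": 0.9}}
obs_model = {"A": {"hot": 0.8, "cold": 0.2}, "B": {"hot": 0.3, "cold": 0.7}}
belief = belief_update(belief, "hot", obs_model, trans)
print(belief)  # observing "hot" shifts probability mass toward state A
```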
no code implementations • NeurIPS 2021 • Yi Ma, Xiaotian Hao, Jianye Hao, Jiawen Lu, Xing Liu, Tong Xialiang, Mingxuan Yuan, Zhigang Li, Jie Tang, Zhaopeng Meng
To address this problem, existing methods either partition the overall DPDP into fixed-size sub-problems by caching online generated orders and solve each sub-problem, or additionally utilize predicted future orders to optimize each sub-problem further.
no code implementations • NeurIPS 2021 • Yao Mu, Yuzheng Zhuang, Bin Wang, Guangxiang Zhu, Wulong Liu, Jianyu Chen, Ping Luo, Shengbo Li, Chongjie Zhang, Jianye Hao
Model-based reinforcement learning aims to improve the sample efficiency of policy learning by modeling the dynamics of the environment.
Model-based Reinforcement Learning
reinforcement-learning
1 code implementation • ICLR 2022 • Changmin Yu, Dong Li, Jianye Hao, Jun Wang, Neil Burgess
We propose learning via retracing, a novel self-supervised approach for learning the state representation (and the associated dynamics model) for reinforcement learning tasks.
no code implementations • 19 Nov 2021 • Tong Sang, Hongyao Tang, Jianye Hao, Yan Zheng, Zhaopeng Meng
Such a reconstruction exploits the underlying structure of value matrix to improve the value approximation, thus leading to a more efficient learning process of value function.
no code implementations • 18 Nov 2021 • Xuejing Zheng, Chao Yu, Chen Chen, Jianye Hao, Hankz Hankui Zhuo
In this paper, we propose Lifelong reinforcement learning with Sequential linear temporal logic formulas and Reward Machines (LSRM), which enables an agent to leverage previously learned knowledge to accelerate the learning of logically specified tasks.
no code implementations • 17 Nov 2021 • Hangyu Mao, Chao Wang, Xiaotian Hao, Yihuan Mao, Yiming Lu, Chengjie WU, Jianye Hao, Dong Li, Pingzhong Tang
The MineRL competition is designed for the development of reinforcement learning and imitation learning algorithms that can efficiently leverage human demonstrations to drastically reduce the number of environment interactions needed to solve the complex ObtainDiamond task with sparse rewards.
1 code implementation • NeurIPS 2021 • Chenjia Bai, Lingxiao Wang, Lei Han, Animesh Garg, Jianye Hao, Peng Liu, Zhaoran Wang
Exploration methods based on pseudo-count of transitions or curiosity of dynamics have achieved promising results in solving reinforcement learning with sparse rewards.
1 code implementation • NeurIPS 2021 • Danruo Deng, Guangyong Chen, Jianye Hao, Qiong Wang, Pheng-Ann Heng
Backpropagation networks are notably susceptible to catastrophic forgetting, where a network tends to forget previously learned skills upon learning new ones.
1 code implementation • 8 Oct 2021 • Shiyu Huang, Bin Wang, Dong Li, Jianye Hao, Ting Chen, Jun Zhu
In this work, we propose a new algorithm for circuit routing, named Ranking Cost, which innovatively combines search-based methods (i.e., the A* algorithm) and learning-based methods (i.e., Evolution Strategies) to form an efficient and trainable router.
no code implementations • 29 Sep 2021 • Jinyi Liu, Zhi Wang, Yan Zheng, Jianye Hao, Junjie Ye, Chenjia Bai, Pengyi Li
Many exploration strategies are built upon the optimism in the face of the uncertainty (OFU) principle for reinforcement learning.
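For context, the OFU principle in its simplest form is the UCB1 bandit rule: act greedily with respect to an optimistic upper confidence bound. A toy, deterministic sketch of the idea (not the strategies studied in this work):

```python
# UCB1 on a toy multi-armed bandit: choose the arm with the highest
# optimistic bound mean + sqrt(c * ln(t) / n). Rewards are deterministic
# here purely to keep the example reproducible.
import math

def ucb1_choose(counts, means, t, c=2.0):
    for a, n in enumerate(counts):
        if n == 0:
            return a  # try every arm once first (unbounded optimism)
    return max(range(len(counts)),
               key=lambda a: means[a] + math.sqrt(c * math.log(t) / counts[a]))

true_rewards = [0.2, 0.8, 0.5]
counts, means = [0, 0, 0], [0.0, 0.0, 0.0]
for t in range(1, 201):
    a = ucb1_choose(counts, means, t)
    r = true_rewards[a]  # deterministic reward, for simplicity
    counts[a] += 1
    means[a] += (r - means[a]) / counts[a]  # incremental mean update
print(counts)  # the best arm (index 1) accumulates the most pulls
```

The optimism bonus shrinks as an arm is pulled more, so under-explored arms keep getting revisited until their uncertainty is resolved.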
no code implementations • 29 Sep 2021 • Pengjie Gu, Mengchen Zhao, Chen Chen, Dong Li, Jianye Hao, Bo An
Offline reinforcement learning is a promising approach for practical applications since it does not require interactions with real-world environments.
no code implementations • ICLR 2022 • Pengjie Gu, Mengchen Zhao, Jianye Hao, Bo An
Autonomous agents often need to work together as a team to accomplish complex cooperative tasks.
no code implementations • 29 Sep 2021 • Hangyu Mao, Jianye Hao, Dong Li, Jun Wang, Weixun Wang, Xiaotian Hao, Bin Wang, Kun Shao, Zhen Xiao, Wulong Liu
In contrast, we formulate an explicit credit assignment problem where each agent gives its suggestion about how to weight individual Q-values to explicitly maximize the joint Q-value, besides guaranteeing the Bellman optimality of the joint Q-value.
no code implementations • 29 Sep 2021 • Mengyue Yang, Furui Liu, Xu Chen, Zhitang Chen, Jianye Hao, Jun Wang
In many real-world scenarios, such as image classification and recommender systems, there is evidence that representation learning can improve a model's performance on multiple downstream tasks.
no code implementations • NeurIPS 2021 • Minghuan Liu, Zhengbang Zhu, Yuzheng Zhuang, Weinan Zhang, Jian Shen, Jianye Hao, Yong Yu, Jun Wang
State-only imitation learning (SOIL) enables agents to learn from massive demonstrations without explicit action or reward information.
no code implementations • 14 Sep 2021 • Jianye Hao, Tianpei Yang, Hongyao Tang, Chenjia Bai, Jinyi Liu, Zhaopeng Meng, Peng Liu, Zhen Wang
In addition to algorithmic analysis, we provide a comprehensive and unified empirical comparison of different exploration methods for DRL on a set of commonly used benchmarks.
1 code implementation • ICLR 2022 • Boyan Li, Hongyao Tang, Yan Zheng, Jianye Hao, Pengyi Li, Zhen Wang, Zhaopeng Meng, Li Wang
Discrete-continuous hybrid action space is a natural setting in many practical problems, such as robot control and game AI.
1 code implementation • 24 Aug 2021 • Xidong Feng, Chen Chen, Dong Li, Mengchen Zhao, Jianye Hao, Jun Wang
Meta-learning, especially gradient-based meta-learning, can be adopted to tackle this problem by learning initial parameters of the model, thus allowing fast adaptation to a specific task from limited data examples.
no code implementations • 14 Aug 2021 • Yankai Chen, Menglin Yang, Yingxue Zhang, Mengchen Zhao, Ziqiao Meng, Jianye Hao, Irwin King
Aiming to alleviate data sparsity and cold-start problems of traditional recommender systems, incorporating knowledge graphs (KGs) to supplement auxiliary information has recently gained considerable attention.
no code implementations • 2 Jun 2021 • Yunqi Wang, Furui Liu, Zhitang Chen, Qing Lian, Shoubo Hu, Jianye Hao, Yik-Chung Wu
Domain generalization aims to learn knowledge invariant across different distributions while semantically meaningful for downstream tasks from multiple source domains, to improve the model's generalization ability on unseen target domains.
no code implementations • 1 Jun 2021 • Tianze Zhou, Fubiao Zhang, Kun Shao, Kai Li, Wenhan Huang, Jun Luo, Weixun Wang, Yaodong Yang, Hangyu Mao, Bin Wang, Dong Li, Wulong Liu, Jianye Hao
In addition, we use a novel agent network named Population Invariant agent with Transformer (PIT) to realize the coordination transfer in more varieties of scenarios.
no code implementations • 28 May 2021 • Zeren Huang, Kerong Wang, Furui Liu, Hui-Ling Zhen, Weinan Zhang, Mingxuan Yuan, Jianye Hao, Yong Yu, Jun Wang
In the online A/B testing of the product planning problems with more than $10^7$ variables and constraints daily, Cut Ranking has achieved an average speedup ratio of 12.42% over the production solver without any loss of solution accuracy.
1 code implementation • 14 May 2021 • Xiaoqiang Wang, Yali Du, Shengyu Zhu, Liangjun Ke, Zhitang Chen, Jianye Hao, Jun Wang
It is a long-standing question to discover causal relations among a set of variables in many empirical sciences.
1 code implementation • 13 May 2021 • Chenjia Bai, Lingxiao Wang, Lei Han, Jianye Hao, Animesh Garg, Peng Liu, Zhaoran Wang
In this paper, we propose a principled exploration method for DRL through Optimistic Bootstrapping and Backward Induction (OB2I).
1 code implementation • 13 Apr 2021 • Xinyi Dai, Jianghao Lin, Weinan Zhang, Shuai Li, Weiwen Liu, Ruiming Tang, Xiuqiang He, Jianye Hao, Jun Wang, Yong Yu
Modern information retrieval systems, including web search, ads placement, and recommender systems, typically rely on learning from user feedback.
no code implementations • 15 Mar 2021 • Zhihao Ma, Yuzheng Zhuang, Paul Weng, Hankz Hankui Zhuo, Dong Li, Wulong Liu, Jianye Hao
To address this challenge and improve the transparency, we propose a Neural Symbolic Reinforcement Learning framework by introducing symbolic logic into DRL.
no code implementations • ICLR Workshop SSL-RL 2021 • Changmin Yu, Dong Li, Hangyu Mao, Jianye Hao, Neil Burgess
Representation learning is a popular approach for reinforcement learning (RL) tasks with partially observable Markov decision processes.
1 code implementation • 3 Mar 2021 • Hongyao Tang, Jianye Hao, Guangyong Chen, Pengfei Chen, Chen Chen, Yaodong Yang, Luo Zhang, Wulong Liu, Zhaopeng Meng
The value function is the central notion of Reinforcement Learning (RL).
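As general background on value functions (a generic sketch, not this paper's construction): value iteration computes the state value V(s), the long-term discounted return from each state of a small, fully known MDP. The tiny MDP below is invented for illustration.

```python
# Generic value iteration on a tiny known MDP, to illustrate the notion of a
# value function V(s). P[s][a] is a list of (next_state, prob); R[s][a] is
# the immediate reward. The MDP itself is made up for this example.
def value_iteration(P, R, gamma=0.9, tol=1e-8):
    V = {s: 0.0 for s in P}
    while True:
        V_new = {s: max(R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a])
                        for a in P[s])
                 for s in P}
        if max(abs(V_new[s] - V[s]) for s in P) < tol:
            return V_new
        V = V_new

# Two states; from s0, action "go" reaches s1 (reward 1), "stay" loops (reward 0).
P = {"s0": {"stay": [("s0", 1.0)], "go": [("s1", 1.0)]},
     "s1": {"stay": [("s1", 1.0)]}}
R = {"s0": {"stay": 0.0, "go": 1.0}, "s1": {"stay": 0.0}}
V = value_iteration(P, R)
print(round(V["s0"], 3))  # 1.0: take "go" once, then nothing more to earn
```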
no code implementations • 3 Mar 2021 • Chen Chen, Hongyao Tang, Jianye Hao, Wulong Liu, Zhaopeng Meng
We propose Nested Policy Iteration as a general training algorithm for PIC-augmented policy which ensures monotonically non-decreasing updates under some mild conditions.
no code implementations • 23 Feb 2021 • Matthieu Zimmer, Xuening Feng, Claire Glanois, Zhaohui Jiang, Jianyi Zhang, Paul Weng, Dong Li, Jianye Hao, Wulong Liu
As a step in this direction, we propose a novel neural-logic architecture, called differentiable logic machine (DLM), that can solve both inductive logic programming (ILP) and reinforcement learning (RL) problems, where the solution can be interpreted as a first-order logic program.
no code implementations • 1 Jan 2021 • Chenjia Bai, Lingxiao Wang, Peng Liu, Zhaoran Wang, Jianye Hao, Yingnan Zhao
However, such an approach is challenging in developing practical exploration algorithms for Deep Reinforcement Learning (DRL).
no code implementations • 1 Jan 2021 • Jinyi Liu, Zhi Wang, Jianye Hao, Yan Zheng
Recently, the principle of optimism in the face of (aleatoric and epistemic) uncertainty has been utilized to design efficient exploration strategies for Reinforcement Learning (RL).
no code implementations • 1 Jan 2021 • Peng Zhang, Furui Liu, Zhitang Chen, Jianye Hao, Jun Wang
Reinforcement Learning (RL) has shown great potential to deal with sequential decision-making problems.
no code implementations • 1 Jan 2021 • Yizheng Hu, Kun Shao, Dong Li, Jianye Hao, Wulong Liu, Yaodong Yang, Jun Wang, Zhanxing Zhu
Therefore, to achieve robust CMARL, we introduce novel strategies to encourage agents to learn correlated equilibrium while maximally preserving the convenience of the decentralized execution.
no code implementations • 1 Jan 2021 • Zhihao Ma, Yuzheng Zhuang, Paul Weng, Dong Li, Kun Shao, Wulong Liu, Hankz Hankui Zhuo, Jianye Hao
Recent progress in deep reinforcement learning (DRL) can be largely attributed to the use of neural networks.
Hierarchical Reinforcement Learning
reinforcement-learning
no code implementations • 1 Jan 2021 • Yao Mu, Yuzheng Zhuang, Bin Wang, Wulong Liu, Shengbo Eben Li, Jianye Hao
The latent dynamics model summarizes an agent’s high dimensional experiences in a compact way.
no code implementations • 1 Jan 2021 • Shiyu Huang, Bin Wang, Dong Li, Jianye Hao, Jun Zhu, Ting Chen
In our method, we introduce a new set of variables called cost maps, which help the A* router find proper paths that achieve the global objective.
no code implementations • 1 Jan 2021 • Xiangkun He, Jianye Hao, Dong Li, Bin Wang, Wulong Liu
Thirdly, the agent's learning process is regarded as a black box: the comprehensive metric we propose is computed after each episode of training, and a Bayesian optimization (BO) algorithm is then adopted to guide the agent to evolve toward improving the quality of the approximated Pareto frontier.
Bayesian Optimization
Multi-Objective Reinforcement Learning
no code implementations • 13 Nov 2020 • Jiajun Fan, He Ba, Xian Guo, Jianye Hao
Extensive experiments demonstrate that Critic PI2 achieved a new state of the art in a range of challenging continuous domains.
no code implementations • NeurIPS 2020 • Yujing Hu, Weixun Wang, Hangtian Jia, Yixiang Wang, Yingfeng Chen, Jianye Hao, Feng Wu, Changjie Fan
In this paper, we consider the problem of adaptively utilizing a given shaping reward function.
no code implementations • NeurIPS 2021 • Hongyao Tang, Zhaopeng Meng, Jianye Hao, Chen Chen, Daniel Graves, Dong Li, Changmin Yu, Hangyu Mao, Wulong Liu, Yaodong Yang, Wenyuan Tao, Li Wang
We study Policy-extended Value Function Approximator (PeVFA) in Reinforcement Learning (RL), which extends conventional value function approximator (VFA) to take as input not only the state (and action) but also an explicit policy representation.
3 code implementations • 19 Oct 2020 • Ming Zhou, Jun Luo, Julian Villella, Yaodong Yang, David Rusu, Jiayu Miao, Weinan Zhang, Montgomery Alban, Iman Fadakar, Zheng Chen, Aurora Chongxi Huang, Ying Wen, Kimia Hassanzadeh, Daniel Graves, Dong Chen, Zhengbang Zhu, Nhat Nguyen, Mohamed Elsayed, Kun Shao, Sanjeevan Ahilan, Baokuan Zhang, Jiannan Wu, Zhengang Fu, Kasra Rezaee, Peyman Yadmellat, Mohsen Rohani, Nicolas Perez Nieves, Yihan Ni, Seyedershad Banijamali, Alexander Cowen Rivers, Zheng Tian, Daniel Palenicek, Haitham Bou Ammar, Hongbo Zhang, Wulong Liu, Jianye Hao, Jun Wang
We open-source the SMARTS platform and the associated benchmark tasks and evaluation metrics to encourage and empower research on multi-agent learning for autonomous driving.
no code implementations • 10 Oct 2020 • Guangzheng Hu, Yuanheng Zhu, Dongbin Zhao, Mengchen Zhao, Jianye Hao
Then the design of the event-triggered strategy is formulated as a constrained Markov decision problem, and reinforcement learning finds the best communication protocol that satisfies the limited bandwidth constraint.
Multiagent Systems
1 code implementation • 29 Sep 2020 • Haotian Fu, Hongyao Tang, Jianye Hao, Chen Chen, Xidong Feng, Dong Li, Wulong Liu
How can we collect informative trajectories whose corresponding contexts reflect the specification of tasks?
no code implementations • 28 Sep 2020 • Hongyao Tang, Zhaopeng Meng, Jianye Hao, Chen Chen, Daniel Graves, Dong Li, Wulong Liu, Yaodong Yang
The value function lies at the heart of Reinforcement Learning (RL); it defines the long-term evaluation of a policy in a given state.
no code implementations • 28 Sep 2020 • Tianpei Yang, Jianye Hao, Weixun Wang, Hongyao Tang, Zhaopeng Meng, Hangyu Mao, Dong Li, Wulong Liu, Yujing Hu, Yingfeng Chen, Changjie Fan
In many cases, the agents' experiences are inconsistent with one another, which causes the option-value estimates to oscillate and become inaccurate.
Open-Ended Question Answering
Reinforcement Learning (RL)
no code implementations • 21 Sep 2020 • Jun-Jie Wang, Qichao Zhang, Dongbin Zhao, Mengchen Zhao, Jianye Hao
Existing model-based value expansion methods typically leverage a world model for value estimation with a fixed rollout horizon to assist policy learning.
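The fixed-horizon value expansion mentioned above can be sketched in a few lines (a toy illustration under assumed interfaces, not the paper's implementation): roll a learned world model forward for k steps, sum the predicted discounted rewards, then bootstrap with the value function at the final imagined state.

```python
def k_step_value_target(model, value_fn, state, k, gamma=0.99):
    """Model-based value expansion target: k imagined steps of reward,
    then a bootstrapped value estimate at the final imagined state."""
    ret, discount = 0.0, 1.0
    for _ in range(k):
        state, reward = model(state)  # model predicts (next_state, reward)
        ret += discount * reward
        discount *= gamma
    return ret + discount * value_fn(state)

# Toy world model: state increments by 1, every step yields reward 1.
target = k_step_value_target(lambda s: (s + 1, 1.0), lambda s: 10.0,
                             state=0, k=2, gamma=0.5)
```

The choice of k is exactly the trade-off such methods tune: longer rollouts use more model information but compound model error.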
no code implementations • ICML 2020 • Xiaotian Hao, Zhaoqing Peng, Yi Ma, Guan Wang, Junqi Jin, Jianye Hao, Shan Chen, Rongquan Bai, Mingzhou Xie, Miao Xu, Zhenzhe Zheng, Chuan Yu, Han Li, Jian Xu, Kun Gai
In E-commerce, advertising is essential for merchants to reach their target users.
no code implementations • 19 May 2020 • Cong Fei, Bin Wang, Yuzheng Zhuang, Zongzhang Zhang, Jianye Hao, Hongbo Zhang, Xuewu Ji, Wulong Liu
Generative adversarial imitation learning (GAIL) has shown promising results by taking advantage of generative adversarial nets, especially in the field of robot learning.
no code implementations • 14 May 2020 • Jianwen Sun, Yan Zheng, Jianye Hao, Zhaopeng Meng, Yang Liu
With the increasing popularity of electric vehicles, distributed energy generation, and storage facilities in smart grid systems, efficient Demand-Side Management (DSM) is urgently needed for energy savings and peak load reduction.
no code implementations • 9 May 2020 • Xiaotian Hao, Junqi Jin, Jianye Hao, Jin Li, Weixun Wang, Yi Ma, Zhenzhe Zheng, Han Li, Jian Xu, Kun Gai
Bipartite b-matching is fundamental in algorithm design and has been widely applied to economic markets, labor markets, and other settings.
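In bipartite b-matching, each node may be matched up to b times rather than once. A minimal greedy sketch (an illustration of the problem, not the paper's algorithm) takes edges in order of decreasing weight while respecting each node's capacity:

```python
def greedy_b_matching(weights, b_left, b_right):
    """Greedy bipartite b-matching sketch.
    `weights` maps (i, j) -> edge weight; left node i may be matched up
    to b_left[i] times, right node j up to b_right[j] times."""
    cap_l = dict(enumerate(b_left))
    cap_r = dict(enumerate(b_right))
    matching = []
    for (i, j), w in sorted(weights.items(), key=lambda kv: -kv[1]):
        if cap_l[i] > 0 and cap_r[j] > 0:
            matching.append((i, j))  # take the edge, consume capacity
            cap_l[i] -= 1
            cap_r[j] -= 1
    return matching

# Left node 0 has capacity 2, so it can take both of its edges.
m = greedy_b_matching({(0, 0): 3, (0, 1): 2, (1, 0): 1},
                      b_left=[2, 1], b_right=[1, 1])
```

Greedy is only an approximation here; exact b-matching is typically solved via min-cost flow or linear programming.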
1 code implementation • CVPR 2021 • Mengyue Yang, Furui Liu, Zhitang Chen, Xinwei Shen, Jianye Hao, Jun Wang
Learning disentanglement aims at finding a low dimensional representation which consists of multiple explanatory and generative factors of the observational data.
no code implementations • 19 Feb 2020 • Tianpei Yang, Jianye Hao, Zhaopeng Meng, Zongzhang Zhang, Yujing Hu, Yingfeng Cheng, Changjie Fan, Weixun Wang, Wulong Liu, Zhaodong Wang, Jiajie Peng
Transfer Learning (TL) has shown great potential to accelerate Reinforcement Learning (RL) by leveraging prior knowledge from past learned policies of relevant tasks.
no code implementations • 18 Feb 2020 • Peng Zhang, Jianye Hao, Weixun Wang, Hongyao Tang, Yi Ma, Yihai Duan, Yan Zheng
Our framework consists of a fuzzy rule controller to represent human knowledge and a refine module to fine-tune suboptimal prior knowledge.
no code implementations • 3 Dec 2019 • Hangyu Mao, Wulong Liu, Jianye Hao, Jun Luo, Dong Li, Zhengchao Zhang, Jun Wang, Zhen Xiao
Social psychology and real experiences show that cognitive consistency plays an important role in keeping human society in order: if people have a more consistent cognition of their environments, they are more likely to achieve better cooperation.
no code implementations • 25 Nov 2019 • Yong Liu, Weixun Wang, Yujing Hu, Jianye Hao, Xingguo Chen, Yang Gao
Traditional methods attempt to use pre-defined rules to capture the interaction relationship between agents.
no code implementations • 14 Nov 2019 • Yizhen Dong, Peixin Zhang, Jingyi Wang, Shuang Liu, Jun Sun, Jianye Hao, Xinyu Wang, Li Wang, Jin Song Dong, Dai Ting
In this work, we conduct an empirical study to evaluate the relationship between coverage, robustness and attack/defense metrics for DNN.
no code implementations • 30 Sep 2019 • Haotian Fu, Hongyao Tang, Jianye Hao, Wulong Liu, Chen Chen
Most meta reinforcement learning (meta-RL) methods learn to adapt to new tasks by directly optimizing the parameters of policies over primitive action space.
no code implementations • 25 Sep 2019 • Haotian Fu, Hongyao Tang, Jianye Hao
Meta reinforcement learning (meta-RL) is able to accelerate the acquisition of new tasks by learning from past experience.
no code implementations • 6 Sep 2019 • Weixun Wang, Tianpei Yang, Yong Liu, Jianye Hao, Xiaotian Hao, Yujing Hu, Yingfeng Chen, Changjie Fan, Yang Gao
In this paper, we design a novel Dynamic Multiagent Curriculum Learning (DyMA-CL) to solve large-scale problems by starting from learning on a multiagent scenario with a small size and progressively increasing the number of agents.
1 code implementation • 5 Sep 2019 • Yu Chen, Yingfeng Chen, Zhipeng Hu, Tianpei Yang, Changjie Fan, Yang Yu, Jianye Hao
Transfer learning (TL) is a promising way to improve the sample efficiency of reinforcement learning.
1 code implementation • ICLR 2020 • Weixun Wang, Tianpei Yang, Yong Liu, Jianye Hao, Xiaotian Hao, Yujing Hu, Yingfeng Chen, Changjie Fan, Yang Gao
ASN characterizes different actions' influence on other agents using neural networks based on the action semantics between them.
no code implementations • 21 Jul 2019 • Yi Ma, Jianye Hao, Yaodong Yang, Han Li, Junqi Jin, Guangyong Chen
Our approach can work directly on directed graph data in semi-supervised node classification tasks.
no code implementations • 27 May 2019 • Hongyao Tang, Jianye Hao, Guangyong Chen, Pengfei Chen, Zhaopeng Meng, Yaodong Yang, Li Wang
Value functions are crucial for model-free Reinforcement Learning (RL) to obtain a policy implicitly or guide the policy updates.
1 code implementation • 12 Mar 2019 • Haotian Fu, Hongyao Tang, Jianye Hao, Zihan Lei, Yingfeng Chen, Changjie Fan
Deep Reinforcement Learning (DRL) has been applied to address a variety of cooperative multi-agent problems with either discrete action spaces or continuous action spaces.
no code implementations • NeurIPS 2018 • Yan Zheng, Zhaopeng Meng, Jianye Hao, Zongzhang Zhang, Tianpei Yang, Changjie Fan
In multiagent domains, coping with non-stationary agents that change behaviors from time to time is a challenging problem: an agent is usually required to quickly detect the other agent's policy during online interaction and then adapt its own policy accordingly.
no code implementations • 25 Sep 2018 • Hongyao Tang, Jianye Hao, Tangjie Lv, Yingfeng Chen, Zongzhang Zhang, Hangtian Jia, Chunxu Ren, Yan Zheng, Zhaopeng Meng, Changjie Fan, Li Wang
Besides, we propose a new experience replay mechanism to alleviate the issue of the sparse transitions at the high level of abstraction and the non-stationarity of multiagent learning.
no code implementations • 18 Sep 2018 • Chengwei Zhang, Xiaohong Li, Jianye Hao, Siqi Chen, Karl Tuyls, Zhiyong Feng, Wanli Xue, Rong Chen
Although many reinforcement learning methods have been proposed for learning the optimal solutions in single-agent continuous-action domains, multiagent coordination domains with continuous actions have received relatively little investigation.
no code implementations • 12 Sep 2018 • Tianpei Yang, Zhaopeng Meng, Jianye Hao, Chongjie Zhang, Yan Zheng, Ze Zheng
This paper proposes a novel approach called Bayes-ToMoP which can efficiently detect the strategy of opponents using either stationary or higher-level reasoning strategies.
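Detecting an opponent's strategy from its observed actions is, at its core, a Bayesian belief update over candidate strategy models; a minimal sketch of that building block (the candidate models and names here are hypothetical, and Bayes-ToMoP adds higher-level theory-of-mind reasoning on top):

```python
def bayes_update(prior, likelihoods, observation):
    """Posterior over candidate opponent strategies:
    posterior(k) is proportional to prior(k) * P(observation | strategy k).
    `likelihoods[k][observation]` is the action probability under k."""
    posterior = {k: prior[k] * likelihoods[k][observation] for k in prior}
    z = sum(posterior.values())
    return {k: v / z for k, v in posterior.items()}

# Two candidate strategies; "a" strongly prefers action 0, "b" action 1.
belief = {"a": 0.5, "b": 0.5}
models = {"a": {0: 0.9, 1: 0.1}, "b": {0: 0.2, 1: 0.8}}
for obs in [0, 0, 1]:  # opponent mostly plays action 0
    belief = bayes_update(belief, models, obs)
```

After these observations the belief has shifted toward strategy "a", and the agent can respond with its best answer to that strategy.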
no code implementations • 10 Sep 2018 • Weixun Wang, Junqi Jin, Jianye Hao, Chunjie Chen, Chuan Yu, Wei-Nan Zhang, Jun Wang, Xiaotian Hao, Yixi Wang, Han Li, Jian Xu, Kun Gai
In this paper, we investigate the problem of advertising with adaptive exposure: can we dynamically determine the number and positions of ads for each user visit under certain business constraints so that the platform revenue can be increased?
no code implementations • 13 May 2018 • Hongyao Tang, Li Wang, Zan Wang, Tim Baarslag, Jianye Hao
Multiagent coordination in cooperative multiagent systems (MASs) has been widely studied in both fixed-agent repeated interaction setting and the static social learning framework.
no code implementations • 1 May 2018 • Takumi Akazaki, Shuang Liu, Yoriyuki Yamagata, Yihai Duan, Jianye Hao
With the rapid development of software and distributed computing, Cyber-Physical Systems (CPS) are widely adopted in many application areas, e.g., smart grids and autonomous automobiles.
no code implementations • 8 Mar 2018 • Chengwei Zhang, Xiaohong Li, Jianye Hao, Siqi Chen, Karl Tuyls, Wanli Xue
In multiagent environments, the capability of learning is important for an agent to behave appropriately in the face of unknown opponents and dynamic environments.
no code implementations • 1 Mar 2018 • Weixun Wang, Jianye Hao, Yixi Wang, Matthew Taylor
We introduce a Sequential Prisoner's Dilemma (SPD) game to better capture the aforementioned characteristics.
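The underlying payoff structure can be made concrete with a minimal repeated Prisoner's Dilemma sketch using the standard payoff matrix (the paper's SPD is a richer sequential variant; this only illustrates the base game):

```python
# Standard Prisoner's Dilemma payoffs: (row player, column player).
PD_PAYOFFS = {
    ("C", "C"): (3, 3),  # mutual cooperation
    ("C", "D"): (0, 5),  # row cooperates, column defects
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),  # mutual defection
}

def play_sequence(actions_a, actions_b):
    """Accumulate both players' payoffs over a sequence of rounds."""
    total_a = total_b = 0
    for a, b in zip(actions_a, actions_b):
        pa, pb = PD_PAYOFFS[(a, b)]
        total_a += pa
        total_b += pb
    return total_a, total_b

# Three rounds: cooperate/cooperate, cooperate/defect, defect/defect.
scores = play_sequence("CCD", "CDD")
```

The dilemma is visible in the matrix: defection dominates each single round, yet mutual cooperation yields the higher joint payoff over a sequence.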
no code implementations • 23 Feb 2018 • Yan Zheng, Jianye Hao, Zongzhang Zhang
Recently, multiagent deep reinforcement learning (DRL) has received increasingly wide attention.
no code implementations • 13 Jan 2016 • Fengyuan Zhu, Guangyong Chen, Jianye Hao, Pheng-Ann Heng
This paper addresses this problem and proposes a novel blind image denoising algorithm to recover the clean image from a noisy one with an unknown noise model.