no code implementations • 24 Mar 2023 • Vahid Partovi Nia, Guojun Zhang, Ivan Kobyzev, Michael R. Metel, Xinlin Li, Ke Sun, Sobhan Hemati, Masoud Asgharian, Linglong Kong, Wulong Liu, Boxing Chen
Deep models are dominating the artificial intelligence (AI) industry since the ImageNet challenge in 2012.
1 code implementation • 5 Oct 2022 • Meichen Liu, Lei Ding, Dengdeng Yu, Wulong Liu, Linglong Kong, Bei Jiang
To fulfill great needs and advocate the significance of quantile fairness, we propose a novel framework to learn a real-valued quantile function under the fairness requirement of Demographic Parity with respect to sensitive attributes, such as race or gender, and thereby derive a reliable fair prediction interval.
no code implementations • 24 Mar 2022 • Bowen Wang, Guibao Shen, Dong Li, Jianye Hao, Wulong Liu, Yu Huang, HongZhong Wu, Yibo Lin, Guangyong Chen, Pheng Ann Heng
Precise congestion prediction from a placement solution plays a crucial role in circuit placement.
no code implementations • 1 Feb 2022 • Ke Sun, Yingnan Zhao, Wulong Liu, Bei Jiang, Linglong Kong
The empirical success of distributional reinforcement learning~(RL) highly depends on the distribution representation and the choice of distribution divergence.
no code implementations • 26 Dec 2021 • Claire Glanois, Xuening Feng, Zhaohui Jiang, Paul Weng, Matthieu Zimmer, Dong Li, Wulong Liu
We propose an efficient interpretable neuro-symbolic model to solve Inductive Logic Programming (ILP) problems.
no code implementations • 24 Dec 2021 • Claire Glanois, Paul Weng, Matthieu Zimmer, Dong Li, Tianpei Yang, Jianye Hao, Wulong Liu
To that aim, we distinguish interpretability (as a property of a model) and explainability (as a post-hoc operation, with the intervention of a proxy) and discuss them in the context of RL with an emphasis on the former notion.
no code implementations • NeurIPS 2021 • Yao Mu, Yuzheng Zhuang, Bin Wang, Guangxiang Zhu, Wulong Liu, Jianyu Chen, Ping Luo, Shengbo Li, Chongjie Zhang, Jianye Hao
Model-based reinforcement learning aims to improve the sample efficiency of policy learning by modeling the dynamics of the environment.
Model-based Reinforcement Learning reinforcement-learning +1
1 code implementation • NeurIPS 2021 • Chenyang Wu, Guoyu Yang, Zongzhang Zhang, Yang Yu, Dong Li, Wulong Liu, Jianye Hao
A belief is a distribution of states representing state uncertainty.
no code implementations • 10 Nov 2021 • Amur Ghose, Vincent Zhang, Yingxue Zhang, Dong Li, Wulong Liu, Mark Coates
To address this limitation, we propose a framework that can directly learn embeddings for the given netlist to enhance the quality of our node features.
no code implementations • 29 Sep 2021 • Hangyu Mao, Jianye Hao, Dong Li, Jun Wang, Weixun Wang, Xiaotian Hao, Bin Wang, Kun Shao, Zhen Xiao, Wulong Liu
In contrast, we formulate an \emph{explicit} credit assignment problem where each agent gives its suggestion about how to weight individual Q-values to explicitly maximize the joint Q-value, besides guaranteeing the Bellman optimality of the joint Q-value.
1 code implementation • NeurIPS 2021 • Xinlin Li, Bang Liu, YaoLiang Yu, Wulong Liu, Chunjing Xu, Vahid Partovi Nia
Shift neural networks reduce computation complexity by removing expensive multiplication operations and quantizing continuous weights into low-bit discrete values, which are fast and energy efficient compared to conventional neural networks.
no code implementations • 1 Jun 2021 • Tianze Zhou, Fubiao Zhang, Kun Shao, Kai Li, Wenhan Huang, Jun Luo, Weixun Wang, Yaodong Yang, Hangyu Mao, Bin Wang, Dong Li, Wulong Liu, Jianye Hao
In addition, we use a novel agent network named Population Invariant agent with Transformer (PIT) to realize the coordination transfer in more varieties of scenarios.
no code implementations • NeurIPS 2021 • Xinlin Li, Bang Liu, YaoLiang Yu, Wulong Liu, Chunjing Xu, Vahid Partovi Nia
Shift neural networks reduce computation complexity by removing expensive multiplication operations and quantizing continuous weights into low-bit discrete values, which are fast and energy-efficient compared to conventional neural networks.
no code implementations • 15 Mar 2021 • Zhihao Ma, Yuzheng Zhuang, Paul Weng, Hankz Hankui Zhuo, Dong Li, Wulong Liu, Jianye Hao
To address this challenge and improve the transparency, we propose a Neural Symbolic Reinforcement Learning framework by introducing symbolic logic into DRL.
1 code implementation • 3 Mar 2021 • Hongyao Tang, Jianye Hao, Guangyong Chen, Pengfei Chen, Chen Chen, Yaodong Yang, Luo Zhang, Wulong Liu, Zhaopeng Meng
Value function is the central notion of Reinforcement Learning (RL).
no code implementations • 3 Mar 2021 • Chen Chen, Hongyao Tang, Jianye Hao, Wulong Liu, Zhaopeng Meng
We propose Nested Policy Iteration as a general training algorithm for PIC-augmented policy which ensures monotonically non-decreasing updates under some mild conditions.
no code implementations • 23 Feb 2021 • Matthieu Zimmer, Xuening Feng, Claire Glanois, Zhaohui Jiang, Jianyi Zhang, Paul Weng, Dong Li, Jianye Hao, Wulong Liu
As a step in this direction, we propose a novel neural-logic architecture, called differentiable logic machine (DLM), that can solve both inductive logic programming (ILP) and reinforcement learning (RL) problems, where the solution can be interpreted as a first-order logic program.
no code implementations • 1 Jan 2021 • Yizheng Hu, Kun Shao, Dong Li, Jianye Hao, Wulong Liu, Yaodong Yang, Jun Wang, Zhanxing Zhu
Therefore, to achieve robust CMARL, we introduce novel strategies to encourage agents to learn correlated equilibrium while maximally preserving the convenience of the decentralized execution.
no code implementations • 1 Jan 2021 • Yao Mu, Yuzheng Zhuang, Bin Wang, Wulong Liu, Shengbo Eben Li, Jianye Hao
The latent dynamics model summarizes an agent’s high dimensional experiences in a compact way.
no code implementations • 1 Jan 2021 • Xiangkun He, Jianye Hao, Dong Li, Bin Wang, Wulong Liu
Thirdly, the agent’s learning process is regarded as a black-box, and the comprehensive metric we proposed is computed after each episode of training, then a Bayesian optimization (BO) algorithm is adopted to guide the agent to evolve towards improving the quality of the approximated Pareto frontier.
Bayesian Optimization Multi-Objective Reinforcement Learning +1
no code implementations • 1 Jan 2021 • Zhihao Ma, Yuzheng Zhuang, Paul Weng, Dong Li, Kun Shao, Wulong Liu, Hankz Hankui Zhuo, Jianye Hao
Recent progress in deep reinforcement learning (DRL) can be largely attributed to the use of neural networks.
Hierarchical Reinforcement Learning reinforcement-learning +2
1 code implementation • NeurIPS 2021 • Hongyao Tang, Zhaopeng Meng, Jianye Hao, Chen Chen, Daniel Graves, Dong Li, Changmin Yu, Hangyu Mao, Wulong Liu, Yaodong Yang, Wenyuan Tao, Li Wang
We study Policy-extended Value Function Approximator (PeVFA) in Reinforcement Learning (RL), which extends conventional value function approximator (VFA) to take as input not only the state (and action) but also an explicit policy representation.
3 code implementations • 19 Oct 2020 • Ming Zhou, Jun Luo, Julian Villella, Yaodong Yang, David Rusu, Jiayu Miao, Weinan Zhang, Montgomery Alban, Iman Fadakar, Zheng Chen, Aurora Chongxi Huang, Ying Wen, Kimia Hassanzadeh, Daniel Graves, Dong Chen, Zhengbang Zhu, Nhat Nguyen, Mohamed Elsayed, Kun Shao, Sanjeevan Ahilan, Baokuan Zhang, Jiannan Wu, Zhengang Fu, Kasra Rezaee, Peyman Yadmellat, Mohsen Rohani, Nicolas Perez Nieves, Yihan Ni, Seyedershad Banijamali, Alexander Cowen Rivers, Zheng Tian, Daniel Palenicek, Haitham Bou Ammar, Hongbo Zhang, Wulong Liu, Jianye Hao, Jun Wang
We open-source the SMARTS platform and the associated benchmark tasks and evaluation metrics to encourage and empower research on multi-agent learning for autonomous driving.
1 code implementation • 29 Sep 2020 • Haotian Fu, Hongyao Tang, Jianye Hao, Chen Chen, Xidong Feng, Dong Li, Wulong Liu
How to collect informative trajectories of which the corresponding context reflects the specification of tasks?
no code implementations • 28 Sep 2020 • Tianpei Yang, Jianye Hao, Weixun Wang, Hongyao Tang, Zhaopeng Meng, Hangyu Mao, Dong Li, Wulong Liu, Yujing Hu, Yingfeng Chen, Changjie Fan
In many cases, each agent's experience is inconsistent with each other which causes the option-value estimation to oscillate and to become inaccurate.
Open-Ended Question Answering Reinforcement Learning (RL) +1
no code implementations • 28 Sep 2020 • Hongyao Tang, Zhaopeng Meng, Jianye Hao, Chen Chen, Daniel Graves, Dong Li, Wulong Liu, Yaodong Yang
The value function lies in the heart of Reinforcement Learning (RL), which defines the long-term evaluation of a policy in a given state.
no code implementations • 19 May 2020 • Cong Fei, Bin Wang, Yuzheng Zhuang, Zongzhang Zhang, Jianye Hao, Hongbo Zhang, Xuewu Ji, Wulong Liu
Generative adversarial imitation learning (GAIL) has shown promising results by taking advantage of generative adversarial nets, especially in the field of robot learning.
1 code implementation • 19 Feb 2020 • Tianpei Yang, Jianye Hao, Zhaopeng Meng, Zongzhang Zhang, Yujing Hu, Yingfeng Cheng, Changjie Fan, Weixun Wang, Wulong Liu, Zhaodong Wang, Jiajie Peng
Transfer Learning (TL) has shown great potential to accelerate Reinforcement Learning (RL) by leveraging prior knowledge from past learned policies of relevant tasks.
1 code implementation • ICLR 2020 • Minghuan Liu, Ming Zhou, Wei-Nan Zhang, Yuzheng Zhuang, Jun Wang, Wulong Liu, Yong Yu
In this paper, we cast the multi-agent interactions modeling problem into a multi-agent imitation learning framework with explicit modeling of correlated policies by approximating opponents' policies, which can recover agents' policies that can regenerate similar interactions.
no code implementations • 3 Dec 2019 • Hangyu Mao, Wulong Liu, Jianye Hao, Jun Luo, Dong Li, Zhengchao Zhang, Jun Wang, Zhen Xiao
Social psychology and real experiences show that cognitive consistency plays an important role to keep human society in order: if people have a more consistent cognition about their environments, they are more likely to achieve better cooperation.
no code implementations • 30 Sep 2019 • Haotian Fu, Hongyao Tang, Jianye Hao, Wulong Liu, Chen Chen
Most meta reinforcement learning (meta-RL) methods learn to adapt to new tasks by directly optimizing the parameters of policies over primitive action space.
no code implementations • 25 Sep 2019 • Yuan Tian, Minghao Han, Lixian Zhang, Wulong Liu, Jun Wang, Wei Pan
In this paper, we combine variational learning and constrained reinforcement learning to simultaneously learn a Conditional Representation Model (CRM) to encode the states into safe and unsafe distributions respectively as well as to learn the corresponding safe policy.
no code implementations • 11 May 2019 • Dong Li, Qichao Zhang, Dongbin Zhao, Yuzheng Zhuang, Bin Wang, Wulong Liu, Rasul Tutunov, Jun Wang
To address the long-term memory issue, this paper proposes a graph attention memory (GAM) architecture consisting of memory construction module, graph attention module and control module.