no code implementations • 18 Jan 2024 • Hui Jiao, Bei Peng, Lu Zong, Xiaojun Zhang, Xinwei Li
ChatGPT, as a language model based on large-scale pre-training, has exerted a profound influence on the domain of machine translation.
no code implementations • 18 Sep 2023 • Benyang Gong, Jiacheng He, Gang Wang, Bei Peng
This brief optimizes the TKF using a Gaussian mixture model (GMM), which generates a reasonable covariance matrix from the measurement noise to replace the one used in the existing algorithm, breaking the adjustment limit of the confidence level.
no code implementations • 15 Sep 2023 • Jiacheng He, Shan Zhong, Bei Peng, Gang Wang, Qizhen Wang
In multi-target tracking (MTT), non-Gaussian measurement noise from sensors can diminish the performance of the Gaussian-assumed Gaussian mixture probability hypothesis density (GM-PHD) filter.
1 code implementation • 12 Sep 2023 • Tianhui Zhang, Danushka Bollegala, Bei Peng
Prior work has shown that the ordering in which concepts are shown to a commonsense generator plays an important role, affecting the quality of the generated sentence.
no code implementations • 4 Jul 2023 • Jiacheng He, Bei Peng, Zhenyu Feng, Xuemei Mao, Song Gao, Gang Wang
In this paper, a generalized packet drop model is proposed to describe the packet loss phenomenon caused by DoS attacks and other factors.
no code implementations • 20 Jun 2023 • Xuemei Mao, Gang Wang, Bei Peng, Jiacheng He, Kun Zhang, Song Gao
A distributed Kalman filter (DKF), called the model fusion DKF (MFDKF), is proposed to counter non-Gaussian noise.
no code implementations • 30 May 2023 • Flora Charbonnier, Bei Peng, Thomas Morstyn, Malcolm McCulloch
This paper investigates how deep multi-agent reinforcement learning can enable the scalable and privacy-preserving coordination of residential energy flexibility.
no code implementations • 14 Jan 2023 • Jiacheng He, Hongwei Wang, Gang Wang, Shan Zhong, Bei Peng
Outliers and impulsive disturbances often produce heavy-tailed distributions in practical applications, which degrade the performance of Gaussian approximation smoothing algorithms.
no code implementations • 14 Jan 2023 • Jiacheng He, Gang Wang, Xuemei Mao, Song Gao, Bei Peng
Distributed Kalman filter approaches based on the maximum correntropy criterion have recently demonstrated superior state estimation performance to that of conventional distributed Kalman filters for wireless sensor networks in the presence of non-Gaussian impulsive noise.
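The core of the maximum correntropy criterion is that a Gaussian kernel assigns near-unit weight to small innovations and near-zero weight to impulsive ones, so outliers barely influence the state update. A minimal sketch (the function name and kernel bandwidth are illustrative assumptions, not from the paper):

```python
import math

def correntropy_weight(residual, sigma=2.0):
    """Gaussian-kernel correntropy weight: close to 1 for small residuals,
    close to 0 for large (impulsive) ones, down-weighting outliers."""
    return math.exp(-residual ** 2 / (2 * sigma ** 2))

# A nominal residual is trusted; an impulsive one is effectively ignored.
w_small = correntropy_weight(0.5)
w_large = correntropy_weight(10.0)
```

In a correntropy-based Kalman update, such a weight rescales the measurement noise covariance, which is how the filter stays robust under impulsive noise.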
no code implementations • 6 Dec 2022 • Lin Shi, Bei Peng
In multi-agent reinforcement learning (MARL), many popular methods, such as VDN and QMIX, are susceptible to a critical multi-agent pathology known as relative overgeneralization (RO), which arises when the optimal joint action's utility falls below that of a sub-optimal joint action in cooperative tasks.
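Relative overgeneralization is easy to see in the classic climbing game (Claus & Boutilier): the joint optimum carries a large miscoordination penalty, so against an exploring partner a safe but suboptimal action looks best on average. A small illustrative sketch (the payoff values are the standard climbing-game numbers, not from this paper):

```python
# Climbing game: the joint optimum is (a, a), but mismatches involving a or b
# are heavily penalised, so a uniformly exploring partner makes c look best.
payoff = {
    ('a', 'a'): 11,  ('a', 'b'): -30, ('a', 'c'): 0,
    ('b', 'a'): -30, ('b', 'b'): 7,   ('b', 'c'): 6,
    ('c', 'a'): 0,   ('c', 'b'): 0,   ('c', 'c'): 5,
}
actions = ['a', 'b', 'c']

def avg_utility(my_action):
    """Expected payoff of my_action against a uniform-random partner."""
    return sum(payoff[(my_action, other)] for other in actions) / len(actions)

best_joint = max(payoff, key=payoff.get)   # the true optimum: ('a', 'a')
best_avg = max(actions, key=avg_utility)   # what averaging favours: 'c'
```

Value factorization methods that average over the partner's exploratory actions inherit exactly this bias, which is the pathology the paper targets.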
no code implementations • 29 Sep 2022 • Gabriella Pizzuto, Hetong Wang, Hatem Fakhruldeen, Bei Peng, Kevin S. Luck, Andrew I. Cooper
Motivated by how human chemists carry out the process of scraping powder from vials, our work proposes a model-free reinforcement learning method for learning a scraping policy, leading to a fully autonomous sample scraping procedure.
1 code implementation • NeurIPS 2021 • Ling Pan, Tabish Rashid, Bei Peng, Longbo Huang, Shimon Whiteson
Tackling overestimation in $Q$-learning is an important problem that has been extensively studied in single-agent reinforcement learning, but has received comparatively little attention in the multi-agent setting.
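The overestimation problem stems from a simple statistical fact: the maximum over several noisy, unbiased value estimates is biased upward, and the bias grows with the number of actions. A minimal Monte Carlo sketch of this effect (parameters are illustrative, not from the paper):

```python
import random

random.seed(0)

def max_of_noisy_estimates(n_actions=10, noise=1.0, trials=10_000):
    """All actions have true value 0, but maximising over noisy Q-estimates
    yields a positive average -- the overestimation bias of Q-learning."""
    total = 0.0
    for _ in range(trials):
        total += max(random.gauss(0.0, noise) for _ in range(n_actions))
    return total / trials

bias = max_of_noisy_estimates()  # clearly positive, though true values are 0
```

In the multi-agent setting this bias compounds across agents' utilities, which is why the single-agent remedies do not transfer directly.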
1 code implementation • 8 Sep 2021 • Jiacheng He, Gang Wang, Bei Peng, Zhenyu Feng, Kun Zhang
In our study, a novel concept called generalized error entropy, which uses the generalized Gaussian density (GGD) function as its kernel function, is proposed.
no code implementations • 1 Jul 2021 • Kai Liu, Yuyang Zhao, Gang Wang, Bei Peng
Cooperative problems under continuous control have long been a central focus of multi-agent reinforcement learning.
no code implementations • 27 Apr 2021 • Bozhidar Vasilev, Tarun Gupta, Bei Peng, Shimon Whiteson
Policy gradient methods are an attractive approach to multi-agent reinforcement learning problems due to their convergence properties and robustness in partially observable scenarios.
no code implementations • 22 Mar 2021 • Ling Pan, Tabish Rashid, Bei Peng, Longbo Huang, Shimon Whiteson
Tackling overestimation in $Q$-learning is an important problem that has been extensively studied in single-agent reinforcement learning, but has received comparatively little attention in the multi-agent setting.
no code implementations • 6 Oct 2020 • Tarun Gupta, Anuj Mahajan, Bei Peng, Wendelin Böhmer, Shimon Whiteson
VDN and QMIX are two popular value-based algorithms for cooperative MARL that learn a centralized action value function as a monotonic mixing of per-agent utilities.
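VDN's factorization is the simplest instance of monotonic mixing: the joint value is the sum of per-agent utilities, so each agent's greedy action is also greedy for the team. A minimal sketch (the per-agent Q-values are made-up numbers for illustration):

```python
# VDN factorisation: Q_tot is the sum of per-agent utilities, which is
# trivially monotonic in each utility. (QMIX generalises this to arbitrary
# monotonic mixing functions produced by a hypernetwork.)
def vdn_q_tot(per_agent_utilities):
    return sum(per_agent_utilities)

# Under this factorisation the greedy joint action decomposes into
# independent per-agent argmaxes -- the key to tractable decentralisation.
q_agent_1 = {'left': 1.0, 'right': 3.0}
q_agent_2 = {'left': 2.0, 'right': 0.5}
greedy_joint = (max(q_agent_1, key=q_agent_1.get),
                max(q_agent_2, key=q_agent_2.get))
q_tot = vdn_q_tot([q_agent_1[greedy_joint[0]], q_agent_2[greedy_joint[1]]])
```

The monotonicity constraint is what makes decentralized execution consistent with the centralized value, at the cost of representational capacity.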
2 code implementations • ICLR 2021 • Tonghan Wang, Tarun Gupta, Anuj Mahajan, Bei Peng, Shimon Whiteson, Chongjie Zhang
Learning a role selector based on action effects makes role discovery much easier because it forms a bi-level learning hierarchy -- the role selector searches in a smaller role space and at a lower temporal resolution, while role policies learn in significantly reduced primitive action-observation spaces.
4 code implementations • NeurIPS 2020 • Tabish Rashid, Gregory Farquhar, Bei Peng, Shimon Whiteson
We show in particular that this projection can fail to recover the optimal policy even with access to $Q^*$, which primarily stems from the equal weighting placed on each joint action.
2 code implementations • 7 Jun 2020 • Shariq Iqbal, Christian A. Schroeder de Witt, Bei Peng, Wendelin Böhmer, Shimon Whiteson, Fei Sha
Multi-agent settings in the real world often involve tasks with varying types and quantities of agents and non-agent entities; however, common patterns of behavior often emerge among these agents/entities.
3 code implementations • NeurIPS 2021 • Bei Peng, Tabish Rashid, Christian A. Schroeder de Witt, Pierre-Alexandre Kamienny, Philip H. S. Torr, Wendelin Böhmer, Shimon Whiteson
We propose FACtored Multi-Agent Centralised policy gradients (FACMAC), a new method for cooperative multi-agent reinforcement learning in both discrete and continuous action spaces.
no code implementations • 10 Mar 2020 • Sanmit Narvekar, Bei Peng, Matteo Leonetti, Jivko Sinapov, Matthew E. Taylor, Peter Stone
Reinforcement learning (RL) is a popular paradigm for addressing sequential decision tasks in which the agent has only limited environmental feedback.
1 code implementation • ICLR 2020 • Tabish Rashid, Bei Peng, Wendelin Böhmer, Shimon Whiteson
We show that this scheme is provably efficient in the tabular setting and extend it to the deep RL setting.
no code implementations • 29 Nov 2019 • Leo Feng, Luisa Zintgraf, Bei Peng, Shimon Whiteson
In few-shot learning, the loss function applied at test time is typically the one we are ultimately interested in minimising, such as the mean-squared-error loss for a regression problem.
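For concreteness, the test-time objective referred to above is ordinary mean-squared error; a minimal sketch (the example inputs are arbitrary):

```python
def mse(predictions, targets):
    """Mean-squared-error loss: the test-time objective for regression."""
    assert len(predictions) == len(targets)
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(predictions)

loss = mse([1.0, 2.0], [1.5, 2.5])  # (0.25 + 0.25) / 2 = 0.25
```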
no code implementations • 19 Jul 2019 • Robert Loftin, Bei Peng, Matthew E. Taylor, Michael L. Littman, David L. Roberts
In order for robots and other artificial agents to efficiently learn to perform useful tasks defined by an end user, they must understand not only the goals of those tasks, but also the structure and dynamics of that user's environment.
1 code implementation • ICLR 2018 • Jiechao Xiong, Qing Wang, Zhuoran Yang, Peng Sun, Yang Zheng, Lei Han, Haobo Fu, Xiangru Lian, Carson Eisenach, Haichuan Yang, Emmanuel Ekwedike, Bei Peng, Haoyue Gao, Tong Zhang, Ji Liu, Han Liu
Most existing deep reinforcement learning (DRL) frameworks consider action spaces that are either discrete or continuous.
no code implementations • ICML 2017 • James MacGlashan, Mark K. Ho, Robert Loftin, Bei Peng, Guan Wang, David Roberts, Matthew E. Taylor, Michael L. Littman
This paper investigates the problem of interactively learning behaviors communicated by a human teacher using positive and negative feedback.