Search Results for author: Bei Peng

Found 27 papers, 9 papers with code

Gradable ChatGPT Translation Evaluation

no code implementations 18 Jan 2024 Hui Jiao, Bei Peng, Lu Zong, Xiaojun Zhang, Xinwei Li

ChatGPT, as a language model based on large-scale pre-training, has exerted a profound influence on the domain of machine translation.

Language Modelling Machine Translation +2

A Covariance Adaptive Student's t Based Kalman Filter

no code implementations 18 Sep 2023 Benyang Gong, Jiacheng He, Gang Wang, Bei Peng

This brief optimizes the Student's t based Kalman filter (TKF) by using a Gaussian mixture model (GMM), which generates a reasonable covariance matrix from the measurement noise to replace the one used in the existing algorithm and removes the adjustment limit on the confidence level.
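The covariance-replacement idea can be illustrated with the law of total variance: the overall covariance of a Gaussian mixture fitted to the measurement noise combines the component covariances with the spread of the component means. A minimal NumPy sketch (the function name and the two-mode noise numbers are illustrative, not taken from the paper):

```python
import numpy as np

def mixture_covariance(weights, means, covs):
    """Overall covariance of a Gaussian mixture via the law of total variance:
    R = sum_k w_k * (Sigma_k + (mu_k - mu)(mu_k - mu)^T)."""
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)
    mu = np.einsum("k,kd->d", weights, means)          # mixture mean
    diffs = means - mu                                 # (K, d) mean deviations
    within = np.einsum("k,kij->ij", weights, covs)     # averaged component covariances
    between = np.einsum("k,ki,kj->ij", weights, diffs, diffs)
    return within + between

# Two noise modes: a nominal one and a rarer heavy-outlier mode.
R = mixture_covariance(
    weights=[0.9, 0.1],
    means=[[0.0], [0.0]],
    covs=[[[1.0]], [[100.0]]],
)
print(R)  # ~[[10.9]], i.e. 0.9*1 + 0.1*100
```

A filter can then use `R` as the measurement-noise covariance instead of a single fixed Gaussian covariance.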

Interactive Model Fusion-Based GM-PHD Filter

no code implementations 15 Sep 2023 Jiacheng He, Shan Zhong, Bei Peng, Gang Wang, Qizhen Wang

In multi-target tracking (MTT), non-Gaussian measurement noise from sensors can diminish the performance of the Gaussian-assumed Gaussian mixture probability hypothesis density (GM-PHD) filter.

Learning to Predict Concept Ordering for Common Sense Generation

1 code implementation 12 Sep 2023 Tianhui Zhang, Danushka Bollegala, Bei Peng

Prior work has shown that the ordering in which concepts are shown to a commonsense generator plays an important role, affecting the quality of the generated sentence.

Common Sense Reasoning Sentence

Distributed fusion filter over lossy wireless sensor networks with the presence of non-Gaussian noise

no code implementations 4 Jul 2023 Jiacheng He, Bei Peng, Zhenyu Feng, Xuemei Mao, Song Gao, Gang Wang

In this paper, a generalized packet drop model is proposed to describe the packet loss phenomenon caused by DoS attacks and other factors.

A Model Fusion Distributed Kalman Filter For Non-Gaussian Observation Noise

no code implementations 20 Jun 2023 Xuemei Mao, Gang Wang, Bei Peng, Jiacheng He, Kun Zhang, Song Gao

A distributed Kalman filter (DKF), called the model fusion DKF (MFDKF), is proposed to handle non-Gaussian noise.

Minimum Error Entropy Rauch-Tung-Striebel Smoother

no code implementations 14 Jan 2023 Jiacheng He, Hongwei Wang, Gang Wang, Shan Zhong, Bei Peng

Outliers and impulsive disturbances often cause heavy-tailed distributions in practical applications, and these will degrade the performance of Gaussian approximation smoothing algorithms.

State Estimation of Wireless Sensor Networks in the Presence of Data Packet Drops and Non-Gaussian Noise

no code implementations 14 Jan 2023 Jiacheng He, Gang Wang, Xuemei Mao, Song Gao, Bei Peng

Distributed Kalman filter approaches based on the maximum correntropy criterion have recently demonstrated superior state estimation performance to that of conventional distributed Kalman filters for wireless sensor networks in the presence of non-Gaussian impulsive noise.
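The maximum correntropy criterion's robustness to impulsive noise comes from its Gaussian kernel, which effectively down-weights measurements with large residuals. A minimal sketch of that kernel (the function name and bandwidth value are illustrative, not from the paper):

```python
import numpy as np

def gaussian_correntropy_weight(residual, sigma=2.0):
    """Gaussian kernel G_sigma(e) = exp(-e^2 / (2 * sigma^2)).
    In MCC-based filters this acts like a per-measurement weight:
    small residuals keep weight ~1, impulsive outliers are down-weighted."""
    e = np.asarray(residual, dtype=float)
    return np.exp(-e ** 2 / (2.0 * sigma ** 2))

print(gaussian_correntropy_weight(0.0))   # 1.0  (inlier, full weight)
print(gaussian_correntropy_weight(10.0))  # ~3.7e-06 (outlier, nearly ignored)
```

The bandwidth `sigma` trades off robustness against efficiency under purely Gaussian noise.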

Curriculum Learning for Relative Overgeneralization

no code implementations 6 Dec 2022 Lin Shi, Bei Peng

In multi-agent reinforcement learning (MARL), many popular methods, such as VDN and QMIX, are susceptible to a critical multi-agent pathology known as relative overgeneralization (RO), which arises when the optimal joint action's utility falls below that of a sub-optimal joint action in cooperative tasks.

Efficient Exploration Multi-agent Reinforcement Learning +3
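Relative overgeneralization is commonly illustrated with the climbing matrix game (a standard example from the cooperative multi-agent literature, not specific to this paper): the optimal joint action pays the most, but because miscoordination around it is punished heavily, its average utility under an exploring partner is the worst.

```python
import numpy as np

# Climbing game payoff (rows: agent 1's action, cols: agent 2's action).
# The optimal joint action (a0, b0) yields 11, but miscoordinating on it
# is heavily penalised.
payoff = np.array([
    [ 11, -30,   0],
    [-30,   7,   6],
    [  0,   0,   5],
])

# Against a uniformly exploring partner, each agent only sees the
# *average* utility of its own action:
avg_utility = payoff.mean(axis=1)
print(avg_utility)           # approx [-6.33, -5.67, 1.67]
print(avg_utility.argmax())  # 2 -> agents drift to the sub-optimal (a2, b2) = 5
```

This is exactly the pathology the abstract describes: the optimal joint action's utility (averaged over the partner's exploration) falls below that of a sub-optimal joint action.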

Accelerating Laboratory Automation Through Robot Skill Learning For Sample Scraping

no code implementations 29 Sep 2022 Gabriella Pizzuto, Hetong Wang, Hatem Fakhruldeen, Bei Peng, Kevin S. Luck, Andrew I. Cooper

Motivated by how human chemists carry out this process of scraping powder from vials, our work proposes a model-free reinforcement learning method for learning a scraping policy, leading to a fully autonomous sample scraping procedure.

Regularized Softmax Deep Multi-Agent Q-Learning

1 code implementation NeurIPS 2021 Ling Pan, Tabish Rashid, Bei Peng, Longbo Huang, Shimon Whiteson

Tackling overestimation in $Q$-learning is an important problem that has been extensively studied in single-agent reinforcement learning, but has received comparatively little attention in the multi-agent setting.

Multi-agent Reinforcement Learning Q-Learning +4
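Overestimation in Q-learning stems from the hard max over noisy value estimates. A Boltzmann softmax operator, as in this line of work, replaces the max with a temperature-smoothed expectation; the sketch below shows only the basic operator (the paper's regularized multi-agent variant is more involved), with illustrative values:

```python
import numpy as np

def softmax_operator(q, beta=5.0):
    """Boltzmann softmax: sm_beta(q) = sum_a softmax(beta * q)_a * q_a.
    Smoothly interpolates between the mean (beta -> 0) and the hard max
    (beta -> inf), which can damp max-induced overestimation."""
    z = beta * (q - q.max())          # shift for numerical stability
    w = np.exp(z) / np.exp(z).sum()   # Boltzmann weights over actions
    return float(w @ q)

q = np.array([1.0, 2.0, 3.0])
print(softmax_operator(q, beta=0.001))  # ~2.0 (near the mean)
print(softmax_operator(q, beta=50.0))   # ~3.0 (near the max)
```

The inverse temperature `beta` controls how closely the operator tracks the greedy max.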

Generalized Minimum Error Entropy for Adaptive Filtering

1 code implementation 8 Sep 2021 Jiacheng He, Gang Wang, Bei Peng, Zhenyu Feng, Kun Zhang

In our study, a novel concept, called generalized error entropy, utilizing the generalized Gaussian density (GGD) function as the kernel function is proposed.
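The generalized Gaussian density (GGD) that serves as the kernel here subsumes several familiar shapes in one family. A sketch of the density (the normalization and parameter names follow the standard GGD form; how the paper embeds it in the entropy criterion is not shown):

```python
import math
import numpy as np

def ggd_density(e, alpha=1.0, beta=2.0):
    """Generalized Gaussian density:
    p(e) = beta / (2 * alpha * Gamma(1/beta)) * exp(-|e / alpha|**beta).
    beta = 2 recovers a Gaussian shape, beta = 1 a Laplacian; other betas
    match lighter- or heavier-tailed error distributions."""
    e = np.asarray(e, dtype=float)
    norm = beta / (2.0 * alpha * math.gamma(1.0 / beta))
    return norm * np.exp(-np.abs(e / alpha) ** beta)

# Sanity check: alpha = sqrt(2) * sigma with beta = 2 matches N(0, sigma^2).
print(ggd_density(0.0, alpha=math.sqrt(2.0), beta=2.0))  # ~0.3989 = 1/sqrt(2*pi)
```

Treating `alpha` and `beta` as free parameters is what gives the generalized error-entropy criterion its extra flexibility over a fixed Gaussian kernel.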

Semi-On-Policy Training for Sample Efficient Multi-Agent Policy Gradients

no code implementations 27 Apr 2021 Bozhidar Vasilev, Tarun Gupta, Bei Peng, Shimon Whiteson

Policy gradient methods are an attractive approach to multi-agent reinforcement learning problems due to their convergence properties and robustness in partially observable scenarios.

Policy Gradient Methods Reinforcement Learning (RL) +2

Regularized Softmax Deep Multi-Agent $Q$-Learning

no code implementations 22 Mar 2021 Ling Pan, Tabish Rashid, Bei Peng, Longbo Huang, Shimon Whiteson

Tackling overestimation in $Q$-learning is an important problem that has been extensively studied in single-agent reinforcement learning, but has received comparatively little attention in the multi-agent setting.

Multi-agent Reinforcement Learning Q-Learning +4

UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning

no code implementations 6 Oct 2020 Tarun Gupta, Anuj Mahajan, Bei Peng, Wendelin Böhmer, Shimon Whiteson

VDN and QMIX are two popular value-based algorithms for cooperative MARL that learn a centralized action value function as a monotonic mixing of per-agent utilities.

Multi-agent Reinforcement Learning reinforcement-learning +3
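The "monotonic mixing of per-agent utilities" that VDN and QMIX share can be sketched in a few lines. VDN simply sums the agents' chosen-action values; QMIX learns a state-dependent mixing whose weights are constrained non-negative so that Q_tot is monotone in each agent's utility. The weights below are illustrative placeholders (in QMIX they come from hypernetworks conditioned on the global state):

```python
import numpy as np

agent_qs = np.array([1.5, -0.5, 2.0])  # per-agent chosen-action utilities

# VDN: Q_tot is the plain sum of per-agent utilities.
q_tot_vdn = float(agent_qs.sum())

# QMIX: Q_tot is a monotonic mixing; non-negative weights guarantee
# dQ_tot / dQ_i >= 0 for every agent i, so per-agent argmaxes recover
# the greedy joint action.
w = np.abs(np.array([0.7, 1.2, 0.4]))  # non-negative mixing weights
b = 0.1                                # state-dependent bias
q_tot_qmix = float(w @ agent_qs + b)

print(q_tot_vdn, q_tot_qmix)
```

The monotonicity constraint is what enables decentralized execution, but it is also the representational limit that motivates extensions such as Weighted QMIX below.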

RODE: Learning Roles to Decompose Multi-Agent Tasks

2 code implementations ICLR 2021 Tonghan Wang, Tarun Gupta, Anuj Mahajan, Bei Peng, Shimon Whiteson, Chongjie Zhang

Learning a role selector based on action effects makes role discovery much easier because it forms a bi-level learning hierarchy -- the role selector searches in a smaller role space and at a lower temporal resolution, while role policies learn in significantly reduced primitive action-observation spaces.

Clustering Starcraft +1

Weighted QMIX: Expanding Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning

4 code implementations NeurIPS 2020 Tabish Rashid, Gregory Farquhar, Bei Peng, Shimon Whiteson

We show in particular that this projection can fail to recover the optimal policy even with access to $Q^*$, which primarily stems from the equal weighting placed on each joint action.

Multi-agent Reinforcement Learning Q-Learning +3

Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning

2 code implementations 7 Jun 2020 Shariq Iqbal, Christian A. Schroeder de Witt, Bei Peng, Wendelin Böhmer, Shimon Whiteson, Fei Sha

Multi-agent settings in the real world often involve tasks with varying types and quantities of agents and non-agent entities; however, common patterns of behavior often emerge among these agents/entities.

counterfactual Multi-agent Reinforcement Learning +3

FACMAC: Factored Multi-Agent Centralised Policy Gradients

3 code implementations NeurIPS 2021 Bei Peng, Tabish Rashid, Christian A. Schroeder de Witt, Pierre-Alexandre Kamienny, Philip H. S. Torr, Wendelin Böhmer, Shimon Whiteson

We propose FACtored Multi-Agent Centralised policy gradients (FACMAC), a new method for cooperative multi-agent reinforcement learning in both discrete and continuous action spaces.

Q-Learning SMAC +2

Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey

no code implementations 10 Mar 2020 Sanmit Narvekar, Bei Peng, Matteo Leonetti, Jivko Sinapov, Matthew E. Taylor, Peter Stone

Reinforcement learning (RL) is a popular paradigm for addressing sequential decision tasks in which the agent has only limited environmental feedback.

reinforcement-learning Reinforcement Learning (RL) +1

VIABLE: Fast Adaptation via Backpropagating Learned Loss

no code implementations 29 Nov 2019 Leo Feng, Luisa Zintgraf, Bei Peng, Shimon Whiteson

In few-shot learning, typically, the loss function which is applied at test time is the one we are ultimately interested in minimising, such as the mean-squared-error loss for a regression problem.

Few-Shot Learning regression

Interactive Learning of Environment Dynamics for Sequential Tasks

no code implementations 19 Jul 2019 Robert Loftin, Bei Peng, Matthew E. Taylor, Michael L. Littman, David L. Roberts

In order for robots and other artificial agents to efficiently learn to perform useful tasks defined by an end user, they must understand not only the goals of those tasks, but also the structure and dynamics of that user's environment.

Interactive Learning from Policy-Dependent Human Feedback

no code implementations ICML 2017 James MacGlashan, Mark K. Ho, Robert Loftin, Bei Peng, Guan Wang, David Roberts, Matthew E. Taylor, Michael L. Littman

This paper investigates the problem of interactively learning behaviors communicated by a human teacher using positive and negative feedback.
