Search Results for author: Bei Peng

Found 27 papers, 9 papers with code

Gradable ChatGPT Translation Evaluation

no code implementations 18 Jan 2024 Hui Jiao, Bei Peng, Lu Zong, Xiaojun Zhang, Xinwei Li

ChatGPT, as a language model based on large-scale pre-training, has exerted a profound influence on the domain of machine translation.

Language Modelling Machine Translation +2

A Covariance Adaptive Student's t Based Kalman Filter

no code implementations 18 Sep 2023 Benyang Gong, Jiacheng He, Gang Wang, Bei Peng

This brief optimizes the Student's t based Kalman filter (TKF) by using a Gaussian mixture model (GMM), which generates a reasonable covariance matrix from the measurement noise to replace the one used in the existing algorithm and removes the adjustment limit on the confidence level.
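The covariance-replacement idea can be illustrated with the law of total variance: the overall covariance of a Gaussian mixture fitted to the measurement noise combines the component covariances with the spread of the component means. A minimal NumPy sketch (the function name and the two-mode noise numbers are illustrative, not taken from the paper):

```python
import numpy as np

def mixture_covariance(weights, means, covs):
    """Overall covariance of a Gaussian mixture via the law of total variance:
    R = sum_k w_k * (Sigma_k + (mu_k - mu)(mu_k - mu)^T)."""
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)
    mu = np.einsum("k,kd->d", weights, means)          # mixture mean
    diffs = means - mu                                 # (K, d) mean deviations
    within = np.einsum("k,kij->ij", weights, covs)     # averaged component covariances
    between = np.einsum("k,ki,kj->ij", weights, diffs, diffs)
    return within + between

# Two noise modes: a nominal one and a rarer heavy-outlier mode.
R = mixture_covariance(
    weights=[0.9, 0.1],
    means=[[0.0], [0.0]],
    covs=[[[1.0]], [[100.0]]],
)
print(R)  # ~[[10.9]], i.e. 0.9*1 + 0.1*100
```

A filter can then use `R` as the measurement-noise covariance instead of a single fixed Gaussian covariance.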

Interactive Model Fusion-Based GM-PHD Filter

no code implementations 15 Sep 2023 Jiacheng He, Shan Zhong, Bei Peng, Gang Wang, Qizhen Wang

In multi-target tracking (MTT), non-Gaussian measurement noise from sensors can diminish the performance of the Gaussian-assumed Gaussian mixture probability hypothesis density (GM-PHD) filter.

Learning to Predict Concept Ordering for Common Sense Generation

1 code implementation 12 Sep 2023 Tianhui Zhang, Danushka Bollegala, Bei Peng

Prior work has shown that the ordering in which concepts are shown to a commonsense generator plays an important role, affecting the quality of the generated sentence.

Common Sense Reasoning Sentence

Distributed fusion filter over lossy wireless sensor networks with the presence of non-Gaussian noise

no code implementations 4 Jul 2023 Jiacheng He, Bei Peng, Zhenyu Feng, Xuemei Mao, Song Gao, Gang Wang

In this paper, a generalized packet drop model is proposed to describe the packet loss phenomenon caused by DoS attacks and other factors.

A Model Fusion Distributed Kalman Filter For Non-Gaussian Observation Noise

no code implementations 20 Jun 2023 Xuemei Mao, Gang Wang, Bei Peng, Jiacheng He, Kun Zhang, Song Gao

A distributed Kalman filter (DKF), called the model fusion DKF (MFDKF), is proposed to handle non-Gaussian noise.

Minimum Error Entropy Rauch-Tung-Striebel Smoother

no code implementations 14 Jan 2023 Jiacheng He, Hongwei Wang, Gang Wang, Shan Zhong, Bei Peng

Outliers and impulsive disturbances often cause heavy-tailed distributions in practical applications, and these will degrade the performance of Gaussian approximation smoothing algorithms.

State Estimation of Wireless Sensor Networks in the Presence of Data Packet Drops and Non-Gaussian Noise

no code implementations 14 Jan 2023 Jiacheng He, Gang Wang, Xuemei Mao, Song Gao, Bei Peng

Distributed Kalman filter approaches based on the maximum correntropy criterion have recently demonstrated superior state estimation performance to that of conventional distributed Kalman filters for wireless sensor networks in the presence of non-Gaussian impulsive noise.
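The maximum correntropy criterion's robustness to impulsive noise comes from its Gaussian kernel, which effectively down-weights measurements with large residuals. A minimal sketch of that kernel (the function name and bandwidth value are illustrative, not from the paper):

```python
import numpy as np

def gaussian_correntropy_weight(residual, sigma=2.0):
    """Gaussian kernel G_sigma(e) = exp(-e^2 / (2 * sigma^2)).
    In MCC-based filters this acts like a per-measurement weight:
    small residuals keep weight ~1, impulsive outliers are down-weighted."""
    e = np.asarray(residual, dtype=float)
    return np.exp(-e ** 2 / (2.0 * sigma ** 2))

print(gaussian_correntropy_weight(0.0))   # 1.0  (inlier, full weight)
print(gaussian_correntropy_weight(10.0))  # ~3.7e-06 (outlier, nearly ignored)
```

The bandwidth `sigma` trades off robustness against efficiency under purely Gaussian noise.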

Curriculum Learning for Relative Overgeneralization

no code implementations 6 Dec 2022 Lin Shi, Bei Peng

In multi-agent reinforcement learning (MARL), many popular methods, such as VDN and QMIX, are susceptible to a critical multi-agent pathology known as relative overgeneralization (RO), which arises when the optimal joint action's utility falls below that of a sub-optimal joint action in cooperative tasks.

Efficient Exploration Multi-agent Reinforcement Learning +3
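Relative overgeneralization is commonly illustrated with the climbing matrix game (a standard example from the cooperative multi-agent literature, not specific to this paper): the optimal joint action pays the most, but because miscoordination around it is punished heavily, its average utility under an exploring partner is the worst.

```python
import numpy as np

# Climbing game payoff (rows: agent 1's action, cols: agent 2's action).
# The optimal joint action (a0, b0) yields 11, but miscoordinating on it
# is heavily penalised.
payoff = np.array([
    [ 11, -30,   0],
    [-30,   7,   6],
    [  0,   0,   5],
])

# Against a uniformly exploring partner, each agent only sees the
# *average* utility of its own action:
avg_utility = payoff.mean(axis=1)
print(avg_utility)           # approx [-6.33, -5.67, 1.67]
print(avg_utility.argmax())  # 2 -> agents drift to the sub-optimal (a2, b2) = 5
```

This is exactly the pathology the abstract describes: the optimal joint action's utility (averaged over the partner's exploration) falls below that of a sub-optimal joint action.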

Accelerating Laboratory Automation Through Robot Skill Learning For Sample Scraping

no code implementations 29 Sep 2022 Gabriella Pizzuto, Hetong Wang, Hatem Fakhruldeen, Bei Peng, Kevin S. Luck, Andrew I. Cooper

Motivated by how human chemists carry out this process of scraping powder from vials, our work proposes a model-free reinforcement learning method for learning a scraping policy, leading to a fully autonomous sample scraping procedure.

Regularized Softmax Deep Multi-Agent Q-Learning

1 code implementation NeurIPS 2021 Ling Pan, Tabish Rashid, Bei Peng, Longbo Huang, Shimon Whiteson

Tackling overestimation in $Q$-learning is an important problem that has been extensively studied in single-agent reinforcement learning, but has received comparatively little attention in the multi-agent setting.

Multi-agent Reinforcement Learning Q-Learning +4
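Overestimation in Q-learning stems from the hard max over noisy value estimates. A Boltzmann softmax operator, as in this line of work, replaces the max with a temperature-smoothed expectation; the sketch below shows only the basic operator (the paper's regularized multi-agent variant is more involved), with illustrative values:

```python
import numpy as np

def softmax_operator(q, beta=5.0):
    """Boltzmann softmax: sm_beta(q) = sum_a softmax(beta * q)_a * q_a.
    Smoothly interpolates between the mean (beta -> 0) and the hard max
    (beta -> inf), which can damp max-induced overestimation."""
    z = beta * (q - q.max())          # shift for numerical stability
    w = np.exp(z) / np.exp(z).sum()   # Boltzmann weights over actions
    return float(w @ q)

q = np.array([1.0, 2.0, 3.0])
print(softmax_operator(q, beta=0.001))  # ~2.0 (near the mean)
print(softmax_operator(q, beta=50.0))   # ~3.0 (near the max)
```

The inverse temperature `beta` controls how closely the operator tracks the greedy max.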

Generalized Minimum Error Entropy for Adaptive Filtering

1 code implementation 8 Sep 2021 Jiacheng He, Gang Wang, Bei Peng, Zhenyu Feng, Kun Zhang

In our study, a novel concept, called generalized error entropy, utilizing the generalized Gaussian density (GGD) function as the kernel function is proposed.
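The generalized Gaussian density (GGD) that serves as the kernel here subsumes several familiar shapes in one family. A sketch of the density (the normalization and parameter names follow the standard GGD form; how the paper embeds it in the entropy criterion is not shown):

```python
import math
import numpy as np

def ggd_density(e, alpha=1.0, beta=2.0):
    """Generalized Gaussian density:
    p(e) = beta / (2 * alpha * Gamma(1/beta)) * exp(-|e / alpha|**beta).
    beta = 2 recovers a Gaussian shape, beta = 1 a Laplacian; other betas
    match lighter- or heavier-tailed error distributions."""
    e = np.asarray(e, dtype=float)
    norm = beta / (2.0 * alpha * math.gamma(1.0 / beta))
    return norm * np.exp(-np.abs(e / alpha) ** beta)

# Sanity check: alpha = sqrt(2) * sigma with beta = 2 matches N(0, sigma^2).
print(ggd_density(0.0, alpha=math.sqrt(2.0), beta=2.0))  # ~0.3989 = 1/sqrt(2*pi)
```

Treating `alpha` and `beta` as free parameters is what gives the generalized error-entropy criterion its extra flexibility over a fixed Gaussian kernel.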

Semi-On-Policy Training for Sample Efficient Multi-Agent Policy Gradients

no code implementations 27 Apr 2021 Bozhidar Vasilev, Tarun Gupta, Bei Peng, Shimon Whiteson

Policy gradient methods are an attractive approach to multi-agent reinforcement learning problems due to their convergence properties and robustness in partially observable scenarios.

Policy Gradient Methods Reinforcement Learning (RL) +2

Regularized Softmax Deep Multi-Agent $Q$-Learning

no code implementations 22 Mar 2021 Ling Pan, Tabish Rashid, Bei Peng, Longbo Huang, Shimon Whiteson

Tackling overestimation in $Q$-learning is an important problem that has been extensively studied in single-agent reinforcement learning, but has received comparatively little attention in the multi-agent setting.

Multi-agent Reinforcement Learning Q-Learning +4

UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning

no code implementations 6 Oct 2020 Tarun Gupta, Anuj Mahajan, Bei Peng, Wendelin Böhmer, Shimon Whiteson

VDN and QMIX are two popular value-based algorithms for cooperative MARL that learn a centralized action value function as a monotonic mixing of per-agent utilities.

Multi-agent Reinforcement Learning reinforcement-learning +3
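The "monotonic mixing of per-agent utilities" that VDN and QMIX share can be sketched in a few lines. VDN simply sums the agents' chosen-action values; QMIX learns a state-dependent mixing whose weights are constrained non-negative so that Q_tot is monotone in each agent's utility. The weights below are illustrative placeholders (in QMIX they come from hypernetworks conditioned on the global state):

```python
import numpy as np

agent_qs = np.array([1.5, -0.5, 2.0])  # per-agent chosen-action utilities

# VDN: Q_tot is the plain sum of per-agent utilities.
q_tot_vdn = float(agent_qs.sum())

# QMIX: Q_tot is a monotonic mixing; non-negative weights guarantee
# dQ_tot / dQ_i >= 0 for every agent i, so per-agent argmaxes recover
# the greedy joint action.
w = np.abs(np.array([0.7, 1.2, 0.4]))  # non-negative mixing weights
b = 0.1                                # state-dependent bias
q_tot_qmix = float(w @ agent_qs + b)

print(q_tot_vdn, q_tot_qmix)
```

The monotonicity constraint is what enables decentralized execution, but it is also the representational limit that motivates extensions such as Weighted QMIX below.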

RODE: Learning Roles to Decompose Multi-Agent Tasks

2 code implementations ICLR 2021 Tonghan Wang, Tarun Gupta, Anuj Mahajan, Bei Peng, Shimon Whiteson, Chongjie Zhang

Learning a role selector based on action effects makes role discovery much easier because it forms a bi-level learning hierarchy -- the role selector searches in a smaller role space and at a lower temporal resolution, while role policies learn in significantly reduced primitive action-observation spaces.

Clustering Starcraft +1

Weighted QMIX: Expanding Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning

4 code implementations NeurIPS 2020 Tabish Rashid, Gregory Farquhar, Bei Peng, Shimon Whiteson

We show in particular that this projection can fail to recover the optimal policy even with access to $Q^*$, which primarily stems from the equal weighting placed on each joint action.

Multi-agent Reinforcement Learning Q-Learning +3

Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning

2 code implementations 7 Jun 2020 Shariq Iqbal, Christian A. Schroeder de Witt, Bei Peng, Wendelin Böhmer, Shimon Whiteson, Fei Sha

Multi-agent settings in the real world often involve tasks with varying types and quantities of agents and non-agent entities; however, common patterns of behavior often emerge among these agents/entities.

counterfactual Multi-agent Reinforcement Learning +3

FACMAC: Factored Multi-Agent Centralised Policy Gradients

3 code implementations NeurIPS 2021 Bei Peng, Tabish Rashid, Christian A. Schroeder de Witt, Pierre-Alexandre Kamienny, Philip H. S. Torr, Wendelin Böhmer, Shimon Whiteson

We propose FACtored Multi-Agent Centralised policy gradients (FACMAC), a new method for cooperative multi-agent reinforcement learning in both discrete and continuous action spaces.

Q-Learning SMAC +2

Curriculum Learning for Reinforcement Learning Domains: A Framework and Survey

no code implementations 10 Mar 2020 Sanmit Narvekar, Bei Peng, Matteo Leonetti, Jivko Sinapov, Matthew E. Taylor, Peter Stone

Reinforcement learning (RL) is a popular paradigm for addressing sequential decision tasks in which the agent has only limited environmental feedback.

reinforcement-learning Reinforcement Learning (RL) +1

VIABLE: Fast Adaptation via Backpropagating Learned Loss

no code implementations 29 Nov 2019 Leo Feng, Luisa Zintgraf, Bei Peng, Shimon Whiteson

In few-shot learning, typically, the loss function which is applied at test time is the one we are ultimately interested in minimising, such as the mean-squared-error loss for a regression problem.

Few-Shot Learning regression

Interactive Learning of Environment Dynamics for Sequential Tasks

no code implementations 19 Jul 2019 Robert Loftin, Bei Peng, Matthew E. Taylor, Michael L. Littman, David L. Roberts

In order for robots and other artificial agents to efficiently learn to perform useful tasks defined by an end user, they must understand not only the goals of those tasks, but also the structure and dynamics of that user's environment.

Interactive Learning from Policy-Dependent Human Feedback

no code implementations ICML 2017 James MacGlashan, Mark K. Ho, Robert Loftin, Bei Peng, Guan Wang, David Roberts, Matthew E. Taylor, Michael L. Littman

This paper investigates the problem of interactively learning behaviors communicated by a human teacher using positive and negative feedback.
