Search Results for author: Mingfei Sun

Found 26 papers, 10 papers with code

Low-Rank Agent-Specific Adaptation (LoRASA) for Multi-Agent Policy Learning

no code implementations • 8 Feb 2025 • Beining Zhang, Aditya Kapoor, Mingfei Sun

We propose Low-Rank Agent-Specific Adaptation (LoRASA), a novel approach that treats each agent's policy as a specialized "task" fine-tuned from a shared backbone.

MuJoCo SMAC+ +1

Agent-Temporal Credit Assignment for Optimal Policy Preservation in Sparse Multi-Agent Reinforcement Learning

no code implementations • 19 Dec 2024 • Aditya Kapoor, Sushant Swamy, Kale-ab Tessera, Mayank Baranwal, Mingfei Sun, Harshad Khadilkar, Stefano V. Albrecht

In multi-agent environments, agents often struggle to learn optimal policies due to sparse or delayed global rewards, particularly in long-horizon tasks where it is challenging to evaluate actions at intermediate time steps.

Multi-agent Reinforcement Learning reinforcement-learning +2

LoKO: Low-Rank Kalman Optimizer for Online Fine-Tuning of Large Models

no code implementations • 15 Oct 2024 • Hossein Abdi, Mingfei Sun, Andi Zhang, Samuel Kaski, Wei Pan

Training large models with millions or even billions of parameters from scratch incurs substantial computational costs.

parameter-efficient fine-tuning

Effective Generation of Feasible Solutions for Integer Programming via Guided Diffusion

1 code implementation • 18 Jun 2024 • Hao Zeng, Jiaqi Wang, Avirup Das, Junying He, Kunpeng Han, Haoyuan Hu, Mingfei Sun

We empirically evaluate our framework on four typical datasets of IP problems, and show that it generates complete feasible solutions with high probability (> 89.7%) without relying on solvers, and that the quality of its solutions is comparable to the best heuristic solutions from Gurobi.

Contrastive Learning

Comparing the Efficacy of Fine-Tuning and Meta-Learning for Few-Shot Policy Imitation

4 code implementations • 23 Jun 2023 • Massimiliano Patacchiola, Mingfei Sun, Katja Hofmann, Richard E. Turner

Despite its simplicity, this baseline is competitive with meta-learning methods under a variety of conditions and is able to imitate target policies trained on unseen variations of the original environment.

Few-Shot Image Classification Few-Shot Imitation Learning +4

Trust-Region-Free Policy Optimization for Stochastic Policies

no code implementations • 15 Feb 2023 • Mingfei Sun, Benjamin Ellis, Anuj Mahajan, Sam Devlin, Katja Hofmann, Shimon Whiteson

In this paper, we show that the trust region constraint over policies can be safely substituted by a trust-region-free constraint without compromising the underlying monotonic improvement guarantee.

Sample Dropout: A Simple yet Effective Variance Reduction Technique in Deep Policy Optimization

1 code implementation • 5 Feb 2023 • Zichuan Lin, Xiapeng Wu, Mingfei Sun, Deheng Ye, Qiang Fu, Wei Yang, Wei Liu

Recent success in Deep Reinforcement Learning (DRL) methods has shown that policy optimization with respect to an off-policy distribution via importance sampling is effective for sample reuse.

Deep Reinforcement Learning MuJoCo

Imitating Human Behaviour with Diffusion Models

1 code implementation • 25 Jan 2023 • Tim Pearce, Tabish Rashid, Anssi Kanervisto, Dave Bignell, Mingfei Sun, Raluca Georgescu, Sergio Valcarcel Macua, Shan Zheng Tan, Ida Momennejad, Katja Hofmann, Sam Devlin

This paper studies their application as observation-to-action models for imitating human behaviour in sequential environments.

Revisiting Estimation Bias in Policy Gradients for Deep Reinforcement Learning

no code implementations • 20 Jan 2023 • Haoxuan Pan, Deheng Ye, Xiaoming Duan, Qiang Fu, Wei Yang, Jianping He, Mingfei Sun

We show that, despite such state distribution shift, the policy gradient estimation bias can be reduced in the following three ways: 1) a small learning rate; 2) an adaptive-learning-rate-based optimizer; and 3) KL regularization.

Continuous Control +3

Trust Region Bounds for Decentralized PPO Under Non-stationarity

no code implementations • 31 Jan 2022 • Mingfei Sun, Sam Devlin, Jacob Beck, Katja Hofmann, Shimon Whiteson

We present trust region bounds for optimizing decentralized policies in cooperative Multi-Agent Reinforcement Learning (MARL), which hold even when the transition dynamics are non-stationary.

Multi-agent Reinforcement Learning

Generalization in Cooperative Multi-Agent Systems

no code implementations • 31 Jan 2022 • Anuj Mahajan, Mikayel Samvelyan, Tarun Gupta, Benjamin Ellis, Mingfei Sun, Tim Rocktäschel, Shimon Whiteson

Specifically, we study generalization bounds under a linear dependence of the underlying dynamics on the agent capabilities, which can be seen as a generalization of Successor Features to MAS.

Generalization Bounds Multi-agent Reinforcement Learning

You May Not Need Ratio Clipping in PPO

no code implementations • 31 Jan 2022 • Mingfei Sun, Vitaly Kurin, Guoqing Liu, Sam Devlin, Tao Qin, Katja Hofmann, Shimon Whiteson

Furthermore, we show that ESPO can be easily scaled up to distributed training with many workers, delivering strong performance as well.

Continuous Control

Birds Eye View Social Distancing Analysis System

no code implementations • 14 Dec 2021 • Zhengye Yang, Mingfei Sun, Hongzhe Ye, Zihao Xiong, Gil Zussman, Zoran Kostic

We propose and evaluate a privacy-preserving social distancing analysis system (B-SDA), which uses bird's-eye view video recordings of pedestrians who cross traffic intersections.

Object Detection +2

Supervised Learning Achieves Human-Level Performance in MOBA Games: A Case Study of Honor of Kings

no code implementations • 25 Nov 2020 • Deheng Ye, Guibin Chen, Peilin Zhao, Fuhao Qiu, Bo Yuan, Wen Zhang, Sheng Chen, Mingfei Sun, Xiaoqian Li, Siqin Li, Jing Liang, Zhenjie Lian, Bei Shi, Liang Wang, Tengfei Shi, Qiang Fu, Wei Yang, Lanxiao Huang

Unlike prior attempts, we integrate the macro-strategy and the micromanagement of MOBA-game-playing into neural networks in a supervised and end-to-end manner.

Is Independent Learning All You Need in the StarCraft Multi-Agent Challenge?

7 code implementations • 18 Nov 2020 • Christian Schroeder de Witt, Tarun Gupta, Denys Makoviichuk, Viktor Makoviychuk, Philip H. S. Torr, Mingfei Sun, Shimon Whiteson

Most recently developed approaches to cooperative multi-agent reinforcement learning in the centralized training with decentralized execution setting involve estimating a centralized, joint value function.

All reinforcement-learning +3

Adversarial Imitation Learning from Incomplete Demonstrations

1 code implementation • 29 May 2019 • Mingfei Sun, Xiaojuan Ma

In this paper, we propose a novel algorithm called Action-Guided Adversarial Imitation Learning (AGAIL) that learns a policy from demonstrations with incomplete action sequences, i.e., incomplete demonstrations.

Imitation Learning
