Search Results for author: Mingfei Sun

Found 21 papers, 9 papers with code

Meta-Reinforcement Learning for Mastering Multiple Skills and Generalizing across Environments in Text-based Games

no code implementations • ACL (MetaNLP) 2021 • Zhenjie Zhao, Mingfei Sun, Xiaojuan Ma

In this paper, we propose a meta reinforcement learning based method to train text agents through learning-to-explore.

Imitation Learning • Meta Reinforcement Learning • +3

FARPLS: A Feature-Augmented Robot Trajectory Preference Labeling System to Assist Human Labelers' Preference Elicitation

no code implementations • 10 Mar 2024 • Hanfang Lyu, Yuanchen Bai, Xin Liang, Ujaan Das, Chuhan Shi, Leiliang Gong, Yingchi Li, Mingfei Sun, Ming Ge, Xiaojuan Ma

Preference-based learning aims to align robot task objectives with human values.

TTA-Nav: Test-time Adaptive Reconstruction for Point-Goal Navigation under Visual Corruptions

1 code implementation • 4 Mar 2024 • Maytus Piriyajitakonkij, Mingfei Sun, Mengmi Zhang, Wei Pan

Our "plug-and-play" method incorporates a top-down decoder into a pre-trained navigation model.

Robot Navigation • Test-time Adaptation • +1

Comparing the Efficacy of Fine-Tuning and Meta-Learning for Few-Shot Policy Imitation

1 code implementation • 23 Jun 2023 • Massimiliano Patacchiola, Mingfei Sun, Katja Hofmann, Richard E. Turner

Despite its simplicity, this baseline is competitive with meta-learning methods on a variety of conditions and is able to imitate target policies trained on unseen variations of the original environment.

Few-Shot Image Classification • Few-Shot Imitation Learning • +3

Trust-Region-Free Policy Optimization for Stochastic Policies

no code implementations • 15 Feb 2023 • Mingfei Sun, Benjamin Ellis, Anuj Mahajan, Sam Devlin, Katja Hofmann, Shimon Whiteson

In this paper, we show that the trust region constraint over policies can be safely substituted by a trust-region-free constraint without compromising the underlying monotonic improvement guarantee.

Sample Dropout: A Simple yet Effective Variance Reduction Technique in Deep Policy Optimization

1 code implementation • 5 Feb 2023 • Zichuan Lin, Xiapeng Wu, Mingfei Sun, Deheng Ye, Qiang Fu, Wei Yang, Wei Liu

Recent success in Deep Reinforcement Learning (DRL) methods has shown that policy optimization with respect to an off-policy distribution via importance sampling is effective for sample reuse.

Imitating Human Behaviour with Diffusion Models

1 code implementation • 25 Jan 2023 • Tim Pearce, Tabish Rashid, Anssi Kanervisto, Dave Bignell, Mingfei Sun, Raluca Georgescu, Sergio Valcarcel Macua, Shan Zheng Tan, Ida Momennejad, Katja Hofmann, Sam Devlin

This paper studies the application of diffusion models as observation-to-action models for imitating human behaviour in sequential environments.

Revisiting Estimation Bias in Policy Gradients for Deep Reinforcement Learning

no code implementations • 20 Jan 2023 • Haoxuan Pan, Deheng Ye, Xiaoming Duan, Qiang Fu, Wei Yang, Jianping He, Mingfei Sun

We show that, despite such state distribution shift, the policy gradient estimation bias can be reduced in the following three ways: 1) a small learning rate; 2) an adaptive-learning-rate-based optimizer; and 3) KL regularization.

Continuous Control • reinforcement-learning • +1

SMACv2: An Improved Benchmark for Cooperative Multi-Agent Reinforcement Learning

1 code implementation • NeurIPS 2023 • Benjamin Ellis, Jonathan Cook, Skander Moalla, Mikayel Samvelyan, Mingfei Sun, Anuj Mahajan, Jakob N. Foerster, Shimon Whiteson

In this work, we conduct new analysis demonstrating that SMAC lacks the stochasticity and partial observability to require complex closed-loop policies.

reinforcement-learning • SMAC+ • +1

UniMASK: Unified Inference in Sequential Decision Problems

1 code implementation • 20 Nov 2022 • Micah Carroll, Orr Paradise, Jessy Lin, Raluca Georgescu, Mingfei Sun, David Bignell, Stephanie Milani, Katja Hofmann, Matthew Hausknecht, Anca Dragan, Sam Devlin

Randomly masking and predicting word tokens has been a successful approach in pre-training language models for a variety of downstream tasks.

Decision Making

Towards Flexible Inference in Sequential Decision Problems via Bidirectional Transformers

no code implementations • 28 Apr 2022 • Micah Carroll, Jessy Lin, Orr Paradise, Raluca Georgescu, Mingfei Sun, David Bignell, Stephanie Milani, Katja Hofmann, Matthew Hausknecht, Anca Dragan, Sam Devlin

Randomly masking and predicting word tokens has been a successful approach in pre-training language models for a variety of downstream tasks.

Decision Making • Offline RL

Trust Region Bounds for Decentralized PPO Under Non-stationarity

no code implementations • 31 Jan 2022 • Mingfei Sun, Sam Devlin, Jacob Beck, Katja Hofmann, Shimon Whiteson

We present trust region bounds for optimizing decentralized policies in cooperative Multi-Agent Reinforcement Learning (MARL), which hold even when the transition dynamics are non-stationary.

Multi-agent Reinforcement Learning

Generalization in Cooperative Multi-Agent Systems

no code implementations • 31 Jan 2022 • Anuj Mahajan, Mikayel Samvelyan, Tarun Gupta, Benjamin Ellis, Mingfei Sun, Tim Rocktäschel, Shimon Whiteson

Specifically, we study generalization bounds under a linear dependence of the underlying dynamics on the agent capabilities, which can be seen as a generalization of Successor Features to MAS.

Generalization Bounds • Multi-agent Reinforcement Learning

You May Not Need Ratio Clipping in PPO

no code implementations • 31 Jan 2022 • Mingfei Sun, Vitaly Kurin, Guoqing Liu, Sam Devlin, Tao Qin, Katja Hofmann, Shimon Whiteson

Furthermore, we show that ESPO can be easily scaled up to distributed training with many workers, delivering strong performance as well.

Continuous Control

Birds Eye View Social Distancing Analysis System

no code implementations • 14 Dec 2021 • Zhengye Yang, Mingfei Sun, Hongzhe Ye, Zihao Xiong, Gil Zussman, Zoran Kostic

We propose and evaluate a privacy-preserving social distancing analysis system (B-SDA), which uses bird's-eye view video recordings of pedestrians who cross traffic intersections.

object-detection • Object Detection • +2

Deterministic and Discriminative Imitation (D2-Imitation): Revisiting Adversarial Imitation for Sample Efficiency

1 code implementation • 11 Dec 2021 • Mingfei Sun, Sam Devlin, Katja Hofmann, Shimon Whiteson

Sample efficiency is crucial for imitation learning methods to be applicable in real-world applications.

Imitation Learning

SoftDICE for Imitation Learning: Rethinking Off-policy Distribution Matching

no code implementations • 6 Jun 2021 • Mingfei Sun, Anuj Mahajan, Katja Hofmann, Shimon Whiteson

We present SoftDICE, which achieves state-of-the-art performance for imitation learning.

Imitation Learning

Supervised Learning Achieves Human-Level Performance in MOBA Games: A Case Study of Honor of Kings

no code implementations • 25 Nov 2020 • Deheng Ye, Guibin Chen, Peilin Zhao, Fuhao Qiu, Bo Yuan, Wen Zhang, Sheng Chen, Mingfei Sun, Xiaoqian Li, Siqin Li, Jing Liang, Zhenjie Lian, Bei Shi, Liang Wang, Tengfei Shi, Qiang Fu, Wei Yang, Lanxiao Huang

Unlike prior attempts, we integrate the macro-strategy and the micromanagement of MOBA-game-playing into neural networks in a supervised and end-to-end manner.

Is Independent Learning All You Need in the StarCraft Multi-Agent Challenge?

6 code implementations • 18 Nov 2020 • Christian Schroeder de Witt, Tarun Gupta, Denys Makoviichuk, Viktor Makoviychuk, Philip H. S. Torr, Mingfei Sun, Shimon Whiteson

Most recently developed approaches to cooperative multi-agent reinforcement learning in the centralized training with decentralized execution setting involve estimating a centralized, joint value function.

reinforcement-learning • Reinforcement Learning (RL) • +2

Mastering Complex Control in MOBA Games with Deep Reinforcement Learning

no code implementations • 20 Dec 2019 • Deheng Ye, Zhao Liu, Mingfei Sun, Bei Shi, Peilin Zhao, Hao Wu, Hongsheng Yu, Shaojie Yang, Xipeng Wu, Qingwei Guo, Qiaobo Chen, Yinyuting Yin, Hao Zhang, Tengfei Shi, Liang Wang, Qiang Fu, Wei Yang, Lanxiao Huang

We study the reinforcement learning problem of complex action control in Multi-player Online Battle Arena (MOBA) 1v1 games.

reinforcement-learning • Reinforcement Learning (RL)

Adversarial Imitation Learning from Incomplete Demonstrations

1 code implementation • 29 May 2019 • Mingfei Sun, Xiaojuan Ma

In this paper, we propose a novel algorithm called Action-Guided Adversarial Imitation Learning (AGAIL) that learns a policy from demonstrations with incomplete action sequences, i.e., incomplete demonstrations.

Imitation Learning
