Search Results for author: Deheng Ye

Found 15 papers, 4 papers with code

Quantized Adaptive Subgradient Algorithms and Their Applications

no code implementations 11 Aug 2022 Ke Xu, Jianqiao Wangni, Yifan Zhang, Deheng Ye, Jiaxiang Wu, Peilin Zhao

Therefore, a threshold quantization strategy with a relatively small error is adopted in QCMD adagrad and QRDA adagrad to improve the signal-to-noise ratio and preserve the sparsity of the model.

Quantization

GPN: A Joint Structural Learning Framework for Graph Neural Networks

no code implementations 12 May 2022 Qianggang Ding, Deheng Ye, Tingyang Xu, Peilin Zhao

To the best of our knowledge, our method is the first GNN-based bilevel optimization framework for resolving this task.

Bilevel Optimization

MineRL Diamond 2021 Competition: Overview, Results, and Lessons Learned

no code implementations 17 Feb 2022 Anssi Kanervisto, Stephanie Milani, Karolis Ramanauskas, Nicholay Topin, Zichuan Lin, Junyou Li, Jianing Shi, Deheng Ye, Qiang Fu, Wei Yang, Weijun Hong, Zhongyue Huang, Haicheng Chen, Guangjun Zeng, Yue Lin, Vincent Micheli, Eloi Alonso, François Fleuret, Alexander Nikulin, Yury Belousov, Oleg Svidchenko, Aleksei Shpilman

With this in mind, we hosted the third edition of the MineRL ObtainDiamond competition, MineRL Diamond 2021, with a separate track in which we permitted any solution to promote the participation of newcomers.

JueWu-MC: Playing Minecraft with Sample-efficient Hierarchical Reinforcement Learning

no code implementations 7 Dec 2021 Zichuan Lin, Junyou Li, Jianing Shi, Deheng Ye, Qiang Fu, Wei Yang

To address this, we propose JueWu-MC, a sample-efficient hierarchical RL approach equipped with representation learning and imitation learning to deal with perception and exploration.

Efficient Exploration Hierarchical Reinforcement Learning +3

Coordinated Proximal Policy Optimization

1 code implementation NeurIPS 2021 Zifan Wu, Chao Yu, Deheng Ye, Junge Zhang, Haiyin Piao, Hankz Hankui Zhuo

We present Coordinated Proximal Policy Optimization (CoPPO), an algorithm that extends the original Proximal Policy Optimization (PPO) to the multi-agent setting.

Starcraft Starcraft II

Learning Diverse Policies in MOBA Games via Macro-Goals

no code implementations NeurIPS 2021 Yiming Gao, Bei Shi, Xueying Du, Liang Wang, Guangwei Chen, Zhenjie Lian, Fuhao Qiu, Guoan Han, Weixuan Wang, Deheng Ye, Qiang Fu, Wei Yang, Lanxiao Huang

Recently, many researchers have made successful progress in building the AI systems for MOBA-game-playing with deep reinforcement learning, such as on Dota 2 and Honor of Kings.

Dota 2

TiKick: Towards Playing Multi-agent Football Full Games from Single-agent Demonstrations

1 code implementation 9 Oct 2021 Shiyu Huang, Wenze Chen, Longfei Zhang, Shizhen Xu, Ziyang Li, Fengming Zhu, Deheng Ye, Ting Chen, Jun Zhu

To the best of our knowledge, TiKick is the first learning-based AI system that can take over the multi-agent Google Research Football full game, while previous work could either control a single agent or experiment on toy academic scenarios.

Starcraft Starcraft II

Boosting Offline Reinforcement Learning with Residual Generative Modeling

no code implementations 19 Jun 2021 Hua Wei, Deheng Ye, Zhao Liu, Hao Wu, Bo Yuan, Qiang Fu, Wei Yang, Zhenhui Li

While most research focuses on the state-action function part through reducing the bootstrapping error in value function approximation induced by the distribution shift of training data, the effects of error propagation in generative modeling have been neglected.

Offline RL Q-Learning +1

MapGo: Model-Assisted Policy Optimization for Goal-Oriented Tasks

1 code implementation 13 May 2021 Menghui Zhu, Minghuan Liu, Jian Shen, Zhicheng Zhang, Sheng Chen, Weinan Zhang, Deheng Ye, Yong Yu, Qiang Fu, Wei Yang

In Goal-oriented Reinforcement learning, relabeling the raw goals in past experience to provide agents with hindsight ability is a major solution to the reward sparsity problem.

reinforcement-learning

Generating Informative CVE Description From ExploitDB Posts by Extractive Summarization

no code implementations 5 Jan 2021 Jiamou Sun, Zhenchang Xing, Hao Guo, Deheng Ye, Xiaohong Li, Xiwei Xu, Liming Zhu

The extracted aspects from an ExploitDB post are then composed into a CVE description according to the suggested CVE description templates, which is must-provided information for requesting new CVEs.

Extractive Summarization Text Summarization

Which Heroes to Pick? Learning to Draft in MOBA Games with Neural Networks and Tree Search

no code implementations 18 Dec 2020 Sheng Chen, Menghui Zhu, Deheng Ye, Weinan Zhang, Qiang Fu, Wei Yang

Hero drafting is essential in MOBA game playing as it builds the team of each side and directly affects the match outcome.

Towards Playing Full MOBA Games with Deep Reinforcement Learning

no code implementations NeurIPS 2020 Deheng Ye, Guibin Chen, Wen Zhang, Sheng Chen, Bo Yuan, Bo Liu, Jia Chen, Zhao Liu, Fuhao Qiu, Hongsheng Yu, Yinyuting Yin, Bei Shi, Liang Wang, Tengfei Shi, Qiang Fu, Wei Yang, Lanxiao Huang, Wei Liu

However, existing work falls short in handling the raw game complexity caused by the explosion of agent combinations, i.e., lineups, when expanding the hero pool in case that OpenAI's Dota AI limits the play to a pool of only 17 heroes.

Dota 2 League of Legends +1

Supervised Learning Achieves Human-Level Performance in MOBA Games: A Case Study of Honor of Kings

no code implementations 25 Nov 2020 Deheng Ye, Guibin Chen, Peilin Zhao, Fuhao Qiu, Bo Yuan, Wen Zhang, Sheng Chen, Mingfei Sun, Xiaoqian Li, Siqin Li, Jing Liang, Zhenjie Lian, Bei Shi, Liang Wang, Tengfei Shi, Qiang Fu, Wei Yang, Lanxiao Huang

Unlike prior attempts, we integrate the macro-strategy and the micromanagement of MOBA-game-playing into neural networks in a supervised and end-to-end manner.

Relation-Aware Transformer for Portfolio Policy Learning

2 code implementations IJCAI 2020 Ke Xu, Yifan Zhang, Deheng Ye, Peilin Zhao, Mingkui Tan

One of the key issues is how to represent the non-stationary price series of assets in a portfolio, which is important for portfolio decisions.
