Search Results for author: Rundong Wang

Found 19 papers, 1 paper with code

Inducing Cooperation via Team Regret Minimization based Multi-Agent Deep Reinforcement Learning

no code implementations 18 Nov 2019 Runsheng Yu, Zhenyu Shi, Xinrun Wang, Rundong Wang, Buhong Liu, Xinwen Hou, Hanjiang Lai, Bo An

Existing value-factorized Multi-Agent Deep Reinforcement Learning (MARL) approaches perform well in various multi-agent cooperative environments under the centralized training and decentralized execution (CTDE) scheme, where all agents are trained together by the centralized value network and each agent executes its policy independently.

reinforcement-learning · Reinforcement Learning (RL)
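To make the CTDE value-factorization scheme described above concrete, here is a minimal PyTorch sketch of an additive (VDN-style) factorization, one common member of that family. It is not the paper's team-regret method, and the network sizes, `AgentQNet`, and `joint_q` are illustrative assumptions.

```python
import torch
import torch.nn as nn

N_AGENTS, OBS_DIM, N_ACTIONS = 3, 16, 5

class AgentQNet(nn.Module):
    """Per-agent utility network; used independently at execution time."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(OBS_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, N_ACTIONS))

    def forward(self, obs):
        return self.net(obs)  # (batch, N_ACTIONS)

agents = [AgentQNet() for _ in range(N_AGENTS)]

def joint_q(observations, actions):
    # Centralized training: the joint value is the sum of per-agent
    # Q-values, so a TD loss on Q_tot trains all agents together.
    qs = [agents[i](observations[i]).gather(1, actions[i].unsqueeze(1))
          for i in range(N_AGENTS)]
    return torch.stack(qs).sum(dim=0)  # Q_tot

def act(i, obs):
    # Decentralized execution: each agent greedily follows its own Q-net.
    return agents[i](obs).argmax(dim=-1)
```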

Learning Expensive Coordination: An Event-Based Deep RL Approach

no code implementations ICLR 2020 Zhenyu Shi*, Runsheng Yu*, Xinrun Wang*, Rundong Wang, Youzhi Zhang, Hanjiang Lai, Bo An

The main difficulties of expensive coordination are that i) the leader has to consider the long-term effect and predict the followers' behaviors when assigning bonuses and ii) the complex interactions between followers make the training process hard to converge, especially when the leader's policy changes with time.

Decision Making · Multi-agent Reinforcement Learning
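As a toy illustration of the expensive-coordination setup the excerpt describes (with assumed payoff definitions, not the paper's event-based model): the leader pays bonuses to steer self-interested followers, so its payoff is the coordination value minus what it spends.

```python
# Illustrative leader-follower payoffs; all quantities are assumptions.

def leader_payoff(coordination_value, bonuses):
    # The leader's long-term objective trades coordination gains
    # against the total bonus it pays out.
    return coordination_value - sum(bonuses.values())

def follower_payoff(own_reward, bonus):
    # A follower only deviates from its selfish policy if the bonus
    # compensates it, which is what the leader must learn to predict.
    return own_reward + bonus
```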

MetaInfoNet: Learning Task-Guided Information for Sample Reweighting

no code implementations 9 Dec 2020 Hongxin Wei, Lei Feng, Rundong Wang, Bo An

Deep neural networks have been shown to easily overfit to biased training data with label noise or class imbalance.

Meta-Learning
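A minimal sketch of what "sample reweighting" means in this setting: per-example weights scale the training loss so that noisy or imbalanced examples contribute less. In MetaInfoNet the weights come from a task-guided meta-learned network; below they are simply a given tensor, an assumption made for brevity.

```python
import torch
import torch.nn.functional as F

def reweighted_loss(logits, labels, sample_weights):
    # Per-example cross-entropy, scaled by a weight per sample so that
    # suspected noisy/imbalanced examples contribute less to the update.
    per_example = F.cross_entropy(logits, labels, reduction="none")
    return (sample_weights * per_example).mean()
```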

Deep Stock Trading: A Hierarchical Reinforcement Learning Framework for Portfolio Optimization and Order Execution

no code implementations 23 Dec 2020 Rundong Wang, Hongxin Wei, Bo An, Zhouyan Feng, Jun Yao

Portfolio management via reinforcement learning is at the forefront of fintech research; it explores how to optimally reallocate a fund across different financial assets over the long term by trial and error.

Hierarchical Reinforcement Learning · Management +2
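A toy sketch of the reallocation loop described above, assuming a simple proportional transaction cost: the agent's action is a weight vector over assets and the reward is the resulting portfolio return. The paper's framework is hierarchical, with a lower-level policy additionally handling order execution; this flat loop is an illustration only.

```python
import numpy as np

def step(weights, asset_returns, prev_weights=None, cost_rate=0.001):
    weights = weights / weights.sum()              # valid allocation
    turnover = (np.abs(weights - prev_weights).sum()
                if prev_weights is not None else 0.0)
    reward = weights @ asset_returns - cost_rate * turnover
    return reward, weights

rng = np.random.default_rng(1)
w, total = None, 0.0
for _ in range(100):
    action = rng.random(4)                         # policy output (stub)
    r, w = step(action, rng.normal(0.001, 0.02, 4), prev_weights=w)
    total += r                                     # accumulated return
```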

RMIX: Risk-Sensitive Multi-Agent Reinforcement Learning

no code implementations 1 Jan 2021 Wei Qiu, Xinrun Wang, Runsheng Yu, Xu He, Rundong Wang, Bo An, Svetlana Obraztsova, Zinovi Rabinovich

Centralized training with decentralized execution (CTDE) has become an important paradigm in multi-agent reinforcement learning (MARL).

Multi-agent Reinforcement Learning · reinforcement-learning +3

Safe Coupled Deep Q-Learning for Recommendation Systems

no code implementations 8 Jan 2021 Runsheng Yu, Yu Gong, Rundong Wang, Bo An, Qingwen Liu, Wenwu Ou

Firstly, we introduce a novel training scheme with two value functions to maximize the accumulated long-term reward under the safety constraint.

Q-Learning · Recommendation Systems +1
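A minimal sketch of the two-value-function idea in the excerpt, under assumed definitions: one Q-function estimates long-term reward, another estimates safety cost, and actions are filtered by a safety budget before reward maximization. The callables, the threshold, and this particular action rule are illustrative, not the paper's exact scheme.

```python
import torch

def safe_greedy_action(q_reward, q_cost, state, cost_budget=1.0):
    r = q_reward(state)            # (n_actions,) estimated returns
    c = q_cost(state)              # (n_actions,) estimated safety costs
    safe = c <= cost_budget        # actions within the safety budget
    if safe.any():                 # restrict to safe actions when any exist
        r = r.masked_fill(~safe, float("-inf"))
    return int(r.argmax())         # best-reward action among the safe set
```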

RMIX: Learning Risk-Sensitive Policies for Cooperative Reinforcement Learning Agents

no code implementations 16 Feb 2021 Wei Qiu, Xinrun Wang, Runsheng Yu, Xu He, Rundong Wang, Bo An, Svetlana Obraztsova, Zinovi Rabinovich

Current value-based multi-agent reinforcement learning methods optimize individual Q values to guide individuals' behaviours via centralized training with decentralized execution (CTDE).

Multi-agent Reinforcement Learning · reinforcement-learning +3
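One way to make such policies risk-sensitive, and the direction RMIX takes, is to act on a risk measure such as CVaR over a return distribution rather than on the expected Q-value. The sketch below computes an empirical CVaR; the sampled distributions and the alpha level are illustrative assumptions.

```python
import numpy as np

def cvar(return_samples, alpha=0.1):
    """Conditional Value at Risk: mean of the worst alpha-fraction of returns."""
    sorted_r = np.sort(return_samples)
    k = max(1, int(np.ceil(alpha * len(sorted_r))))
    return sorted_r[:k].mean()

# Per-action return samples for one agent (e.g., from a distributional
# critic); shape (n_actions, n_samples). Values here are made up.
rng = np.random.default_rng(0)
dist = rng.normal(loc=[[1.0], [1.2]], scale=[[0.1], [2.0]], size=(2, 512))

# A risk-neutral agent picks action 1 (higher mean); a CVaR agent
# prefers action 0, whose worst-case outcomes are far less severe.
print([d.mean() for d in dist])   # expected values
print([cvar(d) for d in dist])    # risk-sensitive values
```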

Reinforcement Learning for Quantitative Trading

no code implementations 28 Sep 2021 Shuo Sun, Rundong Wang, Bo An

RL's impact is pervasive, and it has recently demonstrated the ability to conquer many challenging quantitative trading (QT) tasks.

Decision Making · reinforcement-learning +1

RMIX: Learning Risk-Sensitive Policies for Cooperative Reinforcement Learning Agents

no code implementations NeurIPS 2021 Wei Qiu, Xinrun Wang, Runsheng Yu, Rundong Wang, Xu He, Bo An, Svetlana Obraztsova, Zinovi Rabinovich

Current value-based multi-agent reinforcement learning methods optimize individual Q values to guide individuals' behaviours via centralized training with decentralized execution (CTDE).

Multi-agent Reinforcement Learning · reinforcement-learning +3

DeepScalper: A Risk-Aware Reinforcement Learning Framework to Capture Fleeting Intraday Trading Opportunities

no code implementations 15 Dec 2021 Shuo Sun, Wanqi Xue, Rundong Wang, Xu He, Junlei Zhu, Jian Li, Bo An

Reinforcement learning (RL) techniques have shown great success in many challenging quantitative trading tasks, such as portfolio management and algorithmic trading.

Algorithmic Trading · Decision Making +3

Attention over Self-attention: Intention-aware Re-ranking with Dynamic Transformer Encoders for Recommendation

no code implementations 14 Jan 2022 Zhuoyi Lin, Sheng Zang, Rundong Wang, Zhu Sun, J. Senthilnath, Chi Xu, Chee-Keong Kwoh

We then introduce a dynamic transformer encoder (DTE) to capture user-specific inter-item relationships among item candidates by seamlessly accommodating the learned latent user intentions via IDM.

Re-Ranking
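An illustrative sketch of the idea in the excerpt, with assumed dimensions and an assumed additive injection of the intention vector (not the paper's exact DTE/IDM design): a transformer encoder attends over the item-candidate list after each item embedding is conditioned on a learned user intention, so the inter-item attention becomes user-specific.

```python
import torch
import torch.nn as nn

D_MODEL, N_ITEMS = 32, 10

encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=D_MODEL, nhead=4, batch_first=True),
    num_layers=2,
)
score_head = nn.Linear(D_MODEL, 1)

def rerank_scores(item_embs, user_intention):
    # item_embs: (batch, N_ITEMS, D_MODEL); user_intention: (batch, D_MODEL)
    x = item_embs + user_intention.unsqueeze(1)   # condition items on intent
    h = encoder(x)                                # user-specific inter-item attention
    return score_head(h).squeeze(-1)              # (batch, N_ITEMS) re-ranking scores

scores = rerank_scores(torch.randn(2, N_ITEMS, D_MODEL),
                       torch.randn(2, D_MODEL))
```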

Towards Skilled Population Curriculum for Multi-Agent Reinforcement Learning

no code implementations 7 Feb 2023 Rundong Wang, Longtao Zheng, Wei Qiu, Bowei He, Bo An, Zinovi Rabinovich, Yujing Hu, Yingfeng Chen, Tangjie Lv, Changjie Fan

Despite its success, ACL's applicability is limited by (1) the lack of a general student framework for dealing with the varying number of agents across tasks and the sparse reward problem, and (2) the non-stationarity of the teacher's task due to ever-changing student strategies.

Multi-agent Reinforcement Learning · reinforcement-learning +1

Towards Effective and Interpretable Human-Agent Collaboration in MOBA Games: A Communication Perspective

no code implementations 23 Apr 2023 Yiming Gao, Feiyu Liu, Liang Wang, Zhenjie Lian, Weixuan Wang, Siqin Li, Xianliang Wang, Xianhan Zeng, Rundong Wang, Jiawei Wang, Qiang Fu, Wei Yang, Lanxiao Huang, Wei Liu

MOBA games, e.g., Dota2 and Honor of Kings, have been actively used as testbeds for recent AI research on games, and various human-level AI systems have been developed so far.

Synapse: Trajectory-as-Exemplar Prompting with Memory for Computer Control

1 code implementation 13 Jun 2023 Longtao Zheng, Rundong Wang, Xinrun Wang, Bo An

To address these challenges, we introduce Synapse, a computer agent featuring three key components: i) state abstraction, which filters out task-irrelevant information from raw states, allowing more exemplars within the limited context, ii) trajectory-as-exemplar prompting, which prompts the LLM with complete trajectories of the abstracted states and actions to improve multi-step decision-making, and iii) exemplar memory, which stores the embeddings of exemplars and retrieves them via similarity search for generalization to novel tasks.

Decision Making · In-Context Learning +1
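The exemplar-memory component lends itself to a short sketch: store one embedding per past trajectory and retrieve the most similar ones for a new task via cosine similarity. The `ExemplarMemory` class and embedding handling below are illustrative assumptions, not Synapse's actual code.

```python
import numpy as np

class ExemplarMemory:
    def __init__(self):
        self.embeddings = []    # one unit-norm vector per stored exemplar
        self.trajectories = []  # the abstracted state/action trajectories

    def add(self, embedding, trajectory):
        self.embeddings.append(embedding / np.linalg.norm(embedding))
        self.trajectories.append(trajectory)

    def retrieve(self, query, k=3):
        q = query / np.linalg.norm(query)
        sims = np.stack(self.embeddings) @ q   # cosine similarities
        top = np.argsort(-sims)[:k]            # indices of k nearest exemplars
        return [self.trajectories[i] for i in top]
```

The retrieved trajectories would then be placed into the LLM prompt as exemplars, per the trajectory-as-exemplar prompting described above.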
