Search Results for author: Yali Du

Found 41 papers, 20 papers with code

All Language Models Large and Small

no code implementations • 19 Feb 2024 • Zhixun Chen, Yali Du, David Mguni

Theoretically, we prove LONDI learns the subset of system states to activate the LLM required to solve the task.

Decision Making

Aligning Individual and Collective Objectives in Multi-Agent Cooperation

no code implementations • 19 Feb 2024 • Yang Li, WenHao Zhang, Jianhong Wang, Shao Zhang, Yali Du, Ying Wen, Wei Pan

The visualization of learning dynamics effectively demonstrates that AgA successfully achieves alignment between individual and collective objectives.

SMAC+

Natural Language Reinforcement Learning

no code implementations • 11 Feb 2024 • Xidong Feng, Ziyu Wan, Mengyue Yang, Ziyan Wang, Girish A. Koushik, Yali Du, Ying Wen, Jun Wang

Reinforcement Learning (RL) has shown remarkable abilities in learning policies for decision-making tasks.

Decision Making, reinforcement-learning, +1 more

Learning the Expected Core of Strictly Convex Stochastic Cooperative Games

no code implementations • 10 Feb 2024 • Nam Phuong Tran, The Anh Ta, Shuqing Shi, Debmalya Mandal, Yali Du, Long Tran-Thanh

Reward allocation, also known as the credit assignment problem, has been an important topic in economics, engineering, and machine learning.

Safe Reinforcement Learning with Free-form Natural Language Constraints and Pre-Trained Language Models

no code implementations • 15 Jan 2024 • Xingzhou Lou, Junge Zhang, Ziyan Wang, Kaiqi Huang, Yali Du

Through the use of pre-trained LMs and the elimination of the need for a ground-truth cost, our method enhances safe policy learning under a diverse set of human-derived free-form natural language constraints.

Reinforcement Learning (RL), Safe Reinforcement Learning

Cooperation on the Fly: Exploring Language Agents for Ad Hoc Teamwork in the Avalon Game

no code implementations • 29 Dec 2023 • Zijing Shi, Meng Fang, Shunfeng Zheng, Shilong Deng, Ling Chen, Yali Du

This problem motivates the area of ad hoc teamwork, in which an agent may potentially cooperate with a variety of teammates to achieve a shared goal.

TAPE: Leveraging Agent Topology for Cooperative Multi-Agent Policy Gradient

1 code implementation • 25 Dec 2023 • Xingzhou Lou, Junge Zhang, Timothy J. Norman, Kaiqi Huang, Yali Du

We propose Topology-based multi-Agent Policy gradiEnt (TAPE) for both stochastic and deterministic MAPG methods.

A Review of Cooperation in Multi-agent Learning

no code implementations • 8 Dec 2023 • Yali Du, Joel Z. Leibo, Usman Islam, Richard Willis, Peter Sunehag

Cooperation in multi-agent learning (MAL) is a topic at the intersection of numerous disciplines, including game theory, economics, social sciences, and evolutionary biology.

Decision Making

MACCA: Offline Multi-agent Reinforcement Learning with Causal Credit Assignment

no code implementations • 6 Dec 2023 • Ziyan Wang, Yali Du, Yudi Zhang, Meng Fang, Biwei Huang

Offline Multi-agent Reinforcement Learning (MARL) is valuable in scenarios where online interaction is impractical or risky.

Multi-agent Reinforcement Learning, reinforcement-learning

Reduced Policy Optimization for Continuous Control with Hard Constraints

1 code implementation • NeurIPS 2023 • Shutong Ding, Jingya Wang, Yali Du, Ye Shi

To the best of our knowledge, RPO is the first attempt that introduces GRG to RL as a way of efficiently handling both equality and inequality hard constraints.

Continuous Control, Reinforcement Learning (RL)

Invariant Learning via Probability of Sufficient and Necessary Causes

1 code implementation • NeurIPS 2023 • Mengyue Yang, Zhen Fang, Yonggang Zhang, Yali Du, Furui Liu, Jean-Francois Ton, Jianhong Wang, Jun Wang

To capture the information of sufficient and necessary causes, we employ a classical concept, the probability of sufficient and necessary causes (PNS), which indicates the probability that one is the necessary and sufficient cause.

Replace Scoring with Arrangement: A Contextual Set-to-Arrangement Framework for Learning-to-Rank

no code implementations • 5 Aug 2023 • Jiarui Jin, Xianyu Chen, Weinan Zhang, Mengyue Yang, Yang Wang, Yali Du, Yong Yu, Jun Wang

Noticing that these ranking metrics do not consider the effects of the contextual dependence among the items in the list, we design a new family of simulation-based ranking metrics, of which existing metrics can be regarded as special cases.

Learning-To-Rank

ChessGPT: Bridging Policy Learning and Language Modeling

1 code implementation • NeurIPS 2023 • Xidong Feng, Yicheng Luo, Ziyan Wang, Hongrui Tang, Mengyue Yang, Kun Shao, David Mguni, Yali Du, Jun Wang

Thus, we propose ChessGPT, a GPT model bridging policy learning and language modeling by integrating data from these two sources in Chess games.

Decision Making, Language Modelling

Zero-shot Preference Learning for Offline RL via Optimal Transport

no code implementations • 6 Jun 2023 • Runze Liu, Yali Du, Fengshuo Bai, Jiafei Lyu, Xiu Li

In this paper, we propose a novel zero-shot preference-based RL algorithm that leverages labeled preference data from source tasks to infer labels for target tasks, eliminating the requirement for human queries.

Offline RL

Tackling Cooperative Incompatibility for Zero-Shot Human-AI Coordination

1 code implementation • 5 Jun 2023 • Yang Li, Shao Zhang, Jichen Sun, WenHao Zhang, Yali Du, Ying Wen, Xinbing Wang, Wei Pan

In order to solve cooperative incompatibility in learning and effectively address the problem in the context of ZSC, we introduce the Cooperative Open-ended LEarning (COLE) framework, which formulates open-ended objectives in cooperative games with two players using perspectives of graph theory to evaluate and pinpoint the cooperative capacity of each strategy.

Interpretable Reward Redistribution in Reinforcement Learning: A Causal Approach

no code implementations • NeurIPS 2023 • Yudi Zhang, Yali Du, Biwei Huang, Ziyan Wang, Jun Wang, Meng Fang, Mykola Pechenizkiy

While the majority of current approaches construct the reward redistribution in an uninterpretable manner, we propose to explicitly model the contributions of state and action from a causal perspective, resulting in an interpretable reward redistribution and preserving policy invariance.

reinforcement-learning

Introspective Tips: Large Language Model for In-Context Decision Making

no code implementations • 19 May 2023 • Liting Chen, Lu Wang, Hang Dong, Yali Du, Jie Yan, Fangkai Yang, Shuang Li, Pu Zhao, Si Qin, Saravan Rajmohan, Qingwei Lin, Dongmei Zhang

The emergence of large language models (LLMs) has substantially influenced natural language processing, demonstrating exceptional results across various tasks.

Decision Making, Language Modelling, +2 more

STAS: Spatial-Temporal Return Decomposition for Multi-agent Reinforcement Learning

1 code implementation • 15 Apr 2023 • Sirui Chen, Zhaowei Zhang, Yaodong Yang, Yali Du

It first decomposes the global return back to each time step, then utilizes the Shapley Value to redistribute the individual payoff from the decomposed global reward.

Multi-agent Reinforcement Learning, reinforcement-learning
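
The Shapley Value step mentioned in the STAS summary can be illustrated with a small sketch. This is not the STAS implementation; it only shows how exact Shapley values split a shared reward among agents by averaging each agent's marginal contribution over all join orders. The function names and the toy reward `toy_reward` are hypothetical, chosen for illustration.

```python
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining the current coalition.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: t / len(orderings) for p, t in totals.items()}


def toy_reward(coalition):
    # Hypothetical reward: every agent adds 1, and agents "a" and "b"
    # earn an extra 2 only when both are present.
    bonus = 2.0 if {"a", "b"} <= coalition else 0.0
    return len(coalition) + bonus


phi = shapley_values(["a", "b", "c"], toy_reward)
```

In this toy setting the synergy bonus is shared equally between "a" and "b", and the individual payoffs sum to the reward of the full team. Exact enumeration grows factorially with the number of agents, so it is only practical for very small teams.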

A Human-Centered Safe Robot Reinforcement Learning Framework with Interactive Behaviors

no code implementations • 25 Feb 2023 • Shangding Gu, Alap Kshirsagar, Yali Du, Guang Chen, Jan Peters, Alois Knoll

Deployment of Reinforcement Learning (RL) algorithms for robotics applications in the real world requires ensuring the safety of the robot and its environment.

reinforcement-learning, Reinforcement Learning (RL), +1 more

Cooperative Open-ended Learning Framework for Zero-shot Coordination

1 code implementation • 9 Feb 2023 • Yang Li, Shao Zhang, Jichen Sun, Yali Du, Ying Wen, Xinbing Wang, Wei Pan

However, these approaches can result in a loss of learning and an inability to cooperate with certain strategies within the population, known as cooperative incompatibility.

Ensemble Value Functions for Efficient Exploration in Multi-Agent Reinforcement Learning

no code implementations • 7 Feb 2023 • Lukas Schäfer, Oliver Slumbers, Stephen McAleer, Yali Du, Stefano V. Albrecht, David Mguni

EMAX trains ensembles of value functions for each agent to address the key challenges of exploration and non-stationarity: (1) The uncertainty of value estimates across the ensemble is used in a UCB policy to guide the exploration of agents to parts of the environment which require cooperation.

Efficient Exploration, Multi-agent Reinforcement Learning, +2 more
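
The idea of using ensemble uncertainty in a UCB policy, as described in the EMAX summary, can be sketched as follows. The exact scoring rule used by EMAX is not given here, so this sketch assumes a common form: the ensemble mean plus a coefficient `beta` times the ensemble standard deviation. The function name `ucb_action` and the example values are hypothetical.

```python
from statistics import mean, pstdev


def ucb_action(ensemble_q, beta=1.0):
    """Pick the action maximizing mean + beta * std across ensemble members.

    ensemble_q[k][a] is member k's value estimate for action a.
    Assumed scoring rule; not taken from the paper.
    """
    n_actions = len(ensemble_q[0])
    scores = []
    for a in range(n_actions):
        estimates = [member[a] for member in ensemble_q]
        # Disagreement across members acts as the exploration bonus.
        scores.append(mean(estimates) + beta * pstdev(estimates))
    best = max(range(n_actions), key=scores.__getitem__)
    return best, scores


# Four ensemble members, two actions: all agree on action 0,
# but disagree strongly about action 1.
ensemble_q = [[1.0, 0.0], [1.0, 2.0], [1.0, 0.0], [1.0, 2.0]]
action, scores = ucb_action(ensemble_q, beta=1.0)
greedy_action, _ = ucb_action(ensemble_q, beta=0.0)
```

Both actions have the same mean estimate in this example, so with `beta=1.0` the action the ensemble disagrees about is preferred, while `beta=0.0` reduces to a greedy choice on the mean.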

PECAN: Leveraging Policy Ensemble for Context-Aware Zero-Shot Human-AI Coordination

1 code implementation • 16 Jan 2023 • Xingzhou Lou, Jiaxian Guo, Junge Zhang, Jun Wang, Kaiqi Huang, Yali Du

We conduct experiments on the Overcooked environment, and evaluate the zero-shot human-AI coordination performance of our method with both behavior-cloned human proxies and real humans.

Multi-queue Momentum Contrast for Microvideo-Product Retrieval

1 code implementation • 22 Dec 2022 • Yali Du, Yinwei Wei, Wei Ji, Fan Liu, Xin Luo, Liqiang Nie

The booming development and huge market of micro-videos bring new e-commerce channels for merchants.

Representation Learning, Retrieval

Contextual Transformer for Offline Meta Reinforcement Learning

no code implementations • 15 Nov 2022 • Runji Lin, Ye Li, Xidong Feng, Zhaowei Zhang, Xian Hong Wu Fung, Haifeng Zhang, Jun Wang, Yali Du, Yaodong Yang

Firstly, we propose prompt tuning for offline RL, where a context vector sequence is concatenated with the input to guide the conditional policy generation.

D4RL, Meta Reinforcement Learning, +4 more

Scalable Model-based Policy Optimization for Decentralized Networked Systems

2 code implementations • 13 Jul 2022 • Yali Du, Chengdong Ma, Yuchen Liu, Runji Lin, Hao Dong, Jun Wang, Yaodong Yang

Reinforcement learning algorithms require a large number of samples; this often limits their real-world applications, even on simple tasks.

A Review of Safe Reinforcement Learning: Methods, Theory and Applications

1 code implementation • 20 May 2022 • Shangding Gu, Long Yang, Yali Du, Guang Chen, Florian Walter, Jun Wang, Yaodong Yang, Alois Knoll

To establish a good foundation for future research in this thread, in this paper, we provide a review of safe RL from the perspectives of methods, theory and applications.

Autonomous Driving, Decision Making, +3 more

Rethinking Goal-conditioned Supervised Learning and Its Connection to Offline RL

1 code implementation • ICLR 2022 • Rui Yang, Yiming Lu, Wenzhe Li, Hao Sun, Meng Fang, Yali Du, Xiu Li, Lei Han, Chongjie Zhang

In this paper, we revisit the theoretical property of GCSL -- optimizing a lower bound of the goal reaching objective, and extend GCSL as a novel offline goal-conditioned RL algorithm.

Offline RL, Reinforcement Learning (RL), +1 more

Learning to Identify Top Elo Ratings: A Dueling Bandits Approach

1 code implementation • 12 Jan 2022 • Xue Yan, Yali Du, Binxin Ru, Jun Wang, Haifeng Zhang, Xu Chen

The Elo rating system is widely adopted to evaluate the skills of (chess) game and sports players.

Scheduling
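
For readers unfamiliar with the Elo rating system mentioned in the summary above, the standard update can be sketched as below. This shows only the textbook Elo formulas, not the dueling-bandits method of the paper; the function names and the K-factor of 32 are illustrative assumptions.

```python
def expected_score(r_a, r_b):
    """Standard Elo expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def elo_update(r_a, r_b, score_a, k=32.0):
    """Return updated ratings after one game.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    The K-factor k=32 is an illustrative choice.
    """
    e_a = expected_score(r_a, r_b)
    new_a = r_a + k * (score_a - e_a)
    # B's actual and expected scores are the complements of A's.
    new_b = r_b + k * ((1.0 - score_a) - (1.0 - e_a))
    return new_a, new_b


# Two equally rated players; A wins.
new_a, new_b = elo_update(1500.0, 1500.0, score_a=1.0)
```

With equal ratings the expected score is 0.5, so the winner gains half the K-factor and the loser gives up the same amount, which keeps the rating total unchanged.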

Goal Randomization for Playing Text-based Games without a Reward Function

no code implementations • 29 Sep 2021 • Meng Fang, Yunqiu Xu, Yali Du, Ling Chen, Chengqi Zhang

In a variety of text-based games, we show that this simple method results in competitive performance for agents.

Decision Making, text-based games

Generalization in Text-based Games via Hierarchical Reinforcement Learning

1 code implementation • Findings (EMNLP) 2021 • Yunqiu Xu, Meng Fang, Ling Chen, Yali Du, Chengqi Zhang

Deep reinforcement learning provides a promising approach for text-based games in studying natural language communication between humans and artificial agents.

Hierarchical Reinforcement Learning, reinforcement-learning, +2 more

Is Nash Equilibrium Approximator Learnable?

no code implementations • 17 Aug 2021 • Zhijian Duan, Wenhan Huang, Dinghuai Zhang, Yali Du, Jun Wang, Yaodong Yang, Xiaotie Deng

In this paper, we investigate the learnability of the function approximator that approximates Nash equilibrium (NE) for games generated from a distribution.

BIG-bench Machine Learning, Meta-Learning, +1 more

MHER: Model-based Hindsight Experience Replay

no code implementations • 1 Jul 2021 • Rui Yang, Meng Fang, Lei Han, Yali Du, Feng Luo, Xiu Li

Replacing original goals with virtual goals generated from interaction with a trained dynamics model leads to a novel relabeling method, model-based relabeling (MBR).

Multi-Goal Reinforcement Learning, reinforcement-learning, +1 more

Ordering-Based Causal Discovery with Reinforcement Learning

1 code implementation • 14 May 2021 • Xiaoqiang Wang, Yali Du, Shengyu Zhu, Liangjun Ke, Zhitang Chen, Jianye Hao, Jun Wang

Discovering causal relations among a set of variables is a long-standing question in many empirical sciences.

Causal Discovery, reinforcement-learning, +2 more

Learning Predictive Communication by Imagination in Networked System Control

no code implementations • 1 Jan 2021 • Yali Du, Yifan Zhao, Meng Fang, Jun Wang, Gangyan Xu, Haifeng Zhang

Dealing with multi-agent control in networked systems is one of the biggest challenges in Reinforcement Learning (RL), and limited success has been presented compared to recent deep reinforcement learning in the single-agent domain.

reinforcement-learning, Reinforcement Learning (RL)

Curriculum-guided Hindsight Experience Replay

1 code implementation • NeurIPS 2019 • Meng Fang, Tianyi Zhou, Yali Du, Lei Han, Zhengyou Zhang

This "Goal-and-Curiosity-driven Curriculum Learning" leads to "Curriculum-guided HER (CHER)", which adaptively and dynamically controls the exploration-exploitation trade-off during the learning process via hindsight experience selection.

LIIR: Learning Individual Intrinsic Reward in Multi-Agent Reinforcement Learning

1 code implementation • NeurIPS 2019 • Yali Du, Lei Han, Meng Fang, Ji Liu, Tianhong Dai, DaCheng Tao

A great challenge in cooperative decentralized multi-agent reinforcement learning (MARL) is generating diversified behaviors for each individual agent when receiving only a team reward.

Multi-agent Reinforcement Learning, reinforcement-learning, +3 more

Signal Instructed Coordination in Cooperative Multi-agent Reinforcement Learning

no code implementations • 10 Sep 2019 • Liheng Chen, Hongyi Guo, Yali Du, Fei Fang, Haifeng Zhang, Yaoming Zhu, Ming Zhou, Wei-Nan Zhang, Qing Wang, Yong Yu

Although existing works formulate this problem into a centralized learning with decentralized execution framework, which avoids the non-stationary problem in training, their decentralized execution paradigm limits the agents' capability to coordinate.

Multi-agent Reinforcement Learning, reinforcement-learning, +1 more

Towards Query Efficient Black-box Attacks: An Input-free Perspective

1 code implementation • 9 Sep 2018 • Yali Du, Meng Fang, Jin-Feng Yi, Jun Cheng, DaCheng Tao

First, we initialize an adversarial example with a gray color image on which every pixel has roughly the same importance for the target model.
