Search Results for author: Ying Wen

Found 52 papers, 29 papers with code

TRAD: Enhancing LLM Agents with Step-Wise Thought Retrieval and Aligned Decision

1 code implementation 10 Mar 2024 Ruiwen Zhou, Yingxuan Yang, Muning Wen, Ying Wen, Wenhao Wang, Chunling Xi, Guoqiang Xu, Yong Yu, Weinan Zhang

Among these works, many of them utilize in-context examples to achieve generalization without the need for fine-tuning, while few of them have considered the problem of how to select and effectively utilize these examples.

Language Modelling Large Language Model +1

Offline Fictitious Self-Play for Competitive Games

no code implementations 29 Feb 2024 Jingxiao Chen, Weiji Xie, Weinan Zhang, Yong Yu, Ying Wen

Firstly, unaware of the game structure, it is impossible to interact with the opponents and conduct a major learning paradigm, self-play, for competitive games.

Offline RL Reinforcement Learning (RL)

DS-Agent: Automated Data Science by Empowering Large Language Models with Case-Based Reasoning

1 code implementation 27 Feb 2024 Siyuan Guo, Cheng Deng, Ying Wen, Hechang Chen, Yi Chang, Jun Wang

In the development stage, DS-Agent follows the CBR framework to structure an automatic iteration pipeline, which can flexibly capitalize on the expert knowledge from Kaggle, and facilitate consistent performance improvement through the feedback mechanism.

Code Generation

Aligning Individual and Collective Objectives in Multi-Agent Cooperation

no code implementations 19 Feb 2024 Yang Li, WenHao Zhang, Jianhong Wang, Shao Zhang, Yali Du, Ying Wen, Wei Pan

The visualization of learning dynamics effectively demonstrates that AgA successfully achieves alignment between individual and collective objectives.


Natural Language Reinforcement Learning

no code implementations 11 Feb 2024 Xidong Feng, Ziyu Wan, Mengyue Yang, Ziyan Wang, Girish A. Koushik, Yali Du, Ying Wen, Jun Wang

Reinforcement Learning (RL) has shown remarkable abilities in learning policies for decision-making tasks.

Decision Making reinforcement-learning +1

Entropy-Regularized Token-Level Policy Optimization for Large Language Models

1 code implementation 9 Feb 2024 Muning Wen, Cheng Deng, Jun Wang, Weinan Zhang, Ying Wen

At the heart of ETPO is our novel per-token soft Bellman update, designed to harmonize the RL process with the principles of language modeling.

Code Generation Decision Making +3

Adaptive Control Strategy for Quadruped Robots in Actuator Degradation Scenarios

1 code implementation 29 Dec 2023 Xinyuan Wu, Wentao Dong, Hang Lai, Yong Yu, Ying Wen

Quadruped robots have strong adaptability to extreme environments but may also experience faults.

Critic-Guided Decision Transformer for Offline Reinforcement Learning

no code implementations 21 Dec 2023 Yuanfu Wang, Chao Yang, Ying Wen, Yu Liu, Yu Qiao

Recent advancements in offline reinforcement learning (RL) have underscored the capabilities of Return-Conditioned Supervised Learning (RCSL), a paradigm that learns the action distribution based on target returns for each state in a supervised manner.

D4RL Offline RL +3

Controlling Large Language Model-based Agents for Large-Scale Decision-Making: An Actor-Critic Approach

no code implementations 23 Nov 2023 Bin Zhang, Hangyu Mao, Jingqing Ruan, Ying Wen, Yang Li, Shao Zhang, Zhiwei Xu, Dapeng Li, Ziyue Li, Rui Zhao, Lijuan Li, Guoliang Fan

The remarkable progress in Large Language Models (LLMs) opens up new avenues for addressing planning and decision-making problems in Multi-Agent Systems (MAS).

Decision Making Hallucination +3

GEAR: A GPU-Centric Experience Replay System for Large Reinforcement Learning Models

1 code implementation 8 Oct 2023 Hanjing Wang, Man-Kit Sit, Congjie He, Ying Wen, Weinan Zhang, Jun Wang, Yaodong Yang, Luo Mai

This paper introduces a distributed, GPU-centric experience replay system, GEAR, designed to perform scalable reinforcement learning (RL) with large sequence models (such as transformers).

Reinforcement Learning (RL)

Quantifying Zero-shot Coordination Capability with Behavior Preferring Partners

no code implementations 8 Oct 2023 Xihuai Wang, Shao Zhang, WenHao Zhang, Wentao Dong, Jingxiao Chen, Ying Wen, Weinan Zhang

Current evaluation methods for ZSC capability still need to improve in constructing diverse evaluation partners and comprehensively measuring the ZSC capability.

Alphazero-like Tree-Search can Guide Large Language Model Decoding and Training

1 code implementation 29 Sep 2023 Xidong Feng, Ziyu Wan, Muning Wen, Stephen Marcus McAleer, Ying Wen, Weinan Zhang, Jun Wang

Empirical results across reasoning, planning, alignment, and decision-making tasks show that TS-LLM outperforms existing approaches and can handle trees with a depth of 64.

Decision Making Language Modelling +1

Cross-Utterance Conditioned VAE for Speech Generation

no code implementations 8 Sep 2023 Yang Li, Cheng Yu, Guangzhi Sun, Weiqin Zu, Zheng Tian, Ying Wen, Wei Pan, Chao Zhang, Jun Wang, Yang Yang, Fanglei Sun

Experimental results on the LibriTTS datasets demonstrate that our proposed models significantly enhance speech synthesis and editing, producing more natural and expressive speech.

Speech Synthesis

Large Sequence Models for Sequential Decision-Making: A Survey

no code implementations 24 Jun 2023 Muning Wen, Runji Lin, Hanjing Wang, Yaodong Yang, Ying Wen, Luo Mai, Jun Wang, Haifeng Zhang, Weinan Zhang

Transformer architectures have facilitated the development of large-scale and general-purpose sequence models for prediction tasks in natural language processing and computer vision, e.g., GPT-3 and Swin Transformer.

Decision Making

Tackling Cooperative Incompatibility for Zero-Shot Human-AI Coordination

1 code implementation 5 Jun 2023 Yang Li, Shao Zhang, Jichen Sun, WenHao Zhang, Yali Du, Ying Wen, Xinbing Wang, Wei Pan

In order to solve cooperative incompatibility in learning and effectively address the problem in the context of ZSC, we introduce the Cooperative Open-ended LEarning (COLE) framework, which formulates open-ended objectives in cooperative games with two players using perspectives of graph theory to evaluate and pinpoint the cooperative capacity of each strategy.

Order Matters: Agent-by-agent Policy Optimization

no code implementations 13 Feb 2023 Xihuai Wang, Zheng Tian, Ziyu Wan, Ying Wen, Jun Wang, Weinan Zhang

In this paper, we propose the Agent-by-agent Policy Optimization (A2PO) algorithm to improve the sample efficiency and retain the guarantees of monotonic improvement for each agent during training.

Cooperative Open-ended Learning Framework for Zero-shot Coordination

1 code implementation 9 Feb 2023 Yang Li, Shao Zhang, Jichen Sun, Yali Du, Ying Wen, Xinbing Wang, Wei Pan

However, these approaches can result in a loss of learning and an inability to cooperate with certain strategies within the population, known as cooperative incompatibility.

SCConv: Spatial and Channel Reconstruction Convolution for Feature Redundancy

1 code implementation CVPR 2023 Jiafeng Li, Ying Wen, Lianghua He

The proposed SCConv consists of two units: spatial reconstruction unit (SRU) and channel reconstruction unit (CRU).

On Realization of Intelligent Decision-Making in the Real World: A Foundation Decision Model Perspective

1 code implementation 24 Dec 2022 Ying Wen, Ziyu Wan, Ming Zhou, Shufang Hou, Zhe Cao, Chenyang Le, Jingxiao Chen, Zheng Tian, Weinan Zhang, Jun Wang

The pervasive uncertainty and dynamic nature of real-world environments present significant challenges for the widespread implementation of machine-driven Intelligent Decision-Making (IDM) systems.

Decision Making Image Captioning +2

KnowledgeShovel: An AI-in-the-Loop Document Annotation System for Scientific Knowledge Base Construction

1 code implementation 6 Oct 2022 Shao Zhang, Yuting Jia, Hui Xu, Dakuo Wang, Toby Jia-Jun Li, Ying Wen, Xinbing Wang, Chenghu Zhou

Constructing a comprehensive, accurate, and useful scientific knowledge base is crucial for human researchers synthesizing scientific knowledge and for enabling AI-driven scientific discovery.

Multi-Agent Reinforcement Learning is a Sequence Modeling Problem

1 code implementation 30 May 2022 Muning Wen, Jakub Grudzien Kuba, Runji Lin, Weinan Zhang, Ying Wen, Jun Wang, Yaodong Yang

In this paper, we introduce a novel architecture named Multi-Agent Transformer (MAT) that effectively casts cooperative multi-agent reinforcement learning (MARL) into SM problems wherein the task is to map agents' observation sequence to agents' optimal action sequence.

Decision Making Multi-agent Reinforcement Learning +2

Multi-Agent Feedback Enabled Neural Networks for Intelligent Communications

1 code implementation 22 May 2022 Fanglei Sun, Yang Li, Ying Wen, Jingchen Hu, Jun Wang, Yang Yang, Kai Li

The design of the MAFENN framework and algorithm is dedicated to enhancing the learning capability of feedforward DL networks or their variations with simple data feedback.

Denoising Intelligent Communication

Cross-Utterance Conditioned VAE for Non-Autoregressive Text-to-Speech

1 code implementation ACL 2022 Yang Li, Cheng Yu, Guangzhi Sun, Hua Jiang, Fanglei Sun, Weiqin Zu, Ying Wen, Yang Yang, Jun Wang

Modelling prosody variation is critical for synthesizing natural and expressive speech in end-to-end text-to-speech (TTS) systems.

DeepShovel: An Online Collaborative Platform for Data Extraction in Geoscience Literature with AI Assistance

no code implementations 21 Feb 2022 Shao Zhang, Yuting Jia, Hui Xu, Ying Wen, Dakuo Wang, Xinbing Wang

Geoscientists, as well as researchers in many fields, need to read a huge amount of literature to locate, extract, and aggregate relevant results and data to enable future research or to build a scientific database, but there is no existing system to support this use case well.

Efficient Policy Space Response Oracles

no code implementations 28 Jan 2022 Ming Zhou, Jingxiao Chen, Ying Wen, Weinan Zhang, Yaodong Yang, Yong Yu, Jun Wang

Policy Space Response Oracle methods (PSRO) provide a general solution to learn Nash equilibrium in two-player zero-sum games but suffer from two drawbacks: (1) the computation inefficiency due to the need for consistent meta-game evaluation via simulations, and (2) the exploration inefficiency due to finding the best response against a fixed meta-strategy at every epoch.

Efficient Exploration

Offline Pre-trained Multi-Agent Decision Transformer: One Big Sequence Model Tackles All SMAC Tasks

1 code implementation 6 Dec 2021 Linghui Meng, Muning Wen, Yaodong Yang, Chenyang Le, Xiyun Li, Weinan Zhang, Ying Wen, Haifeng Zhang, Jun Wang, Bo Xu

In this paper, we facilitate the research by providing large-scale datasets, and use them to examine the usage of the Decision Transformer in the context of MARL.

Offline RL reinforcement-learning +4

Neural Auto-Curricula in Two-Player Zero-Sum Games

1 code implementation NeurIPS 2021 Xidong Feng, Oliver Slumbers, Ziyu Wan, Bo Liu, Stephen Mcaleer, Ying Wen, Jun Wang, Yaodong Yang

When solving two-player zero-sum games, multi-agent reinforcement learning (MARL) algorithms often create populations of agents where, at each iteration, a new agent is discovered as the best response to a mixture over the opponent population.

Multi-agent Reinforcement Learning Vocal Bursts Valence Prediction

Towards Unifying Behavioral and Response Diversity for Open-ended Learning in Zero-sum Games

1 code implementation NeurIPS 2021 Xiangyu Liu, Hangtian Jia, Ying Wen, Yaodong Yang, Yujing Hu, Yingfeng Chen, Changjie Fan, Zhipeng Hu

With this unified diversity measure, we design the corresponding diversity-promoting objective and population effectivity when seeking the best responses in open-ended learning.

A Game-Theoretic Approach to Multi-Agent Trust Region Optimization

1 code implementation 12 Jun 2021 Ying Wen, Hui Chen, Yaodong Yang, Zheng Tian, Minne Li, Xu Chen, Jun Wang

Trust region methods are widely applied in single-agent reinforcement learning problems due to their monotonic performance-improvement guarantee at every iteration.

Atari Games Multi-agent Reinforcement Learning +2

Unifying Behavioral and Response Diversity for Open-ended Learning in Zero-sum Games

no code implementations 9 Jun 2021 Xiangyu Liu, Hangtian Jia, Ying Wen, Yaodong Yang, Yujing Hu, Yingfeng Chen, Changjie Fan, Zhipeng Hu

With this unified diversity measure, we design the corresponding diversity-promoting objective and population effectivity when seeking the best responses in open-ended learning.

MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning

1 code implementation 5 Jun 2021 Ming Zhou, Ziyu Wan, Hanjing Wang, Muning Wen, Runzhe Wu, Ying Wen, Yaodong Yang, Weinan Zhang, Jun Wang

Our framework is comprised of three key components: (1) a centralized task dispatching model, which supports the self-generated tasks and scalable training with heterogeneous policy combinations; (2) a programming architecture named Actor-Evaluator-Learner, which achieves high parallelism for both training and sampling, and meets the evaluation requirement of auto-curriculum learning; (3) a higher-level abstraction of MARL training paradigms, which enables efficient code reuse and flexible deployments on different distributed computing paradigms.

Atari Games Distributed Computing +3

Neural Auto-Curricula

1 code implementation 4 Jun 2021 Xidong Feng, Oliver Slumbers, Ziyu Wan, Bo Liu, Stephen Mcaleer, Ying Wen, Jun Wang, Yaodong Yang

When solving two-player zero-sum games, multi-agent reinforcement learning (MARL) algorithms often create populations of agents where, at each iteration, a new agent is discovered as the best response to a mixture over the opponent population.

Multi-agent Reinforcement Learning

Modelling Behavioural Diversity for Learning in Open-Ended Games

3 code implementations 14 Mar 2021 Nicolas Perez Nieves, Yaodong Yang, Oliver Slumbers, David Henry Mguni, Ying Wen, Jun Wang

Promoting behavioural diversity is critical for solving games with non-transitive dynamics where strategic cycles exist, and there is no consistent winner (e.g., Rock-Paper-Scissors).

Point Processes

Multi-Agent Trust Region Learning

1 code implementation 1 Jan 2021 Ying Wen, Hui Chen, Yaodong Yang, Zheng Tian, Minne Li, Xu Chen, Jun Wang

We derive the lower bound of agents' payoff improvements for MATRL methods, and also prove the convergence of our method on the meta-game fixed points.

Atari Games Multi-agent Reinforcement Learning +3

Convolutional Neural Network optimization via Channel Reassessment Attention module

no code implementations 12 Oct 2020 YuTao Shen, Ying Wen

The performance of convolutional neural networks (CNNs) can be improved by adjusting the interrelationship between channels with attention mechanism.

Multi-Agent Determinantal Q-Learning

1 code implementation ICML 2020 Yaodong Yang, Ying Wen, Li-Heng Chen, Jun Wang, Kun Shao, David Mguni, Wei-Nan Zhang

Though practical, current methods rely on restrictive assumptions to decompose the centralized value function across agents for execution.


Segmenting Medical MRI via Recurrent Decoding Cell

1 code implementation 21 Nov 2019 Ying Wen, Kai Xie, Lianghua He

The encoder-decoder networks are commonly used in medical image segmentation due to their remarkable performance in hierarchical feature fusion.

Image Segmentation Medical Image Segmentation +1

A Regularized Opponent Model with Maximum Entropy Objective

1 code implementation 17 May 2019 Zheng Tian, Ying Wen, Zhichen Gong, Faiz Punakkath, Shihao Zou, Jun Wang

In a single-agent setting, reinforcement learning (RL) tasks can be cast into an inference problem by introducing a binary random variable o, which stands for the "optimality".

Multi-agent Reinforcement Learning reinforcement-learning +1

Joint Perception and Control as Inference with an Object-based Implementation

no code implementations 4 Mar 2019 Minne Li, Zheng Tian, Pranav Nashikkar, Ian Davies, Ying Wen, Jun Wang

Existing model-based reinforcement learning methods often study perception modeling and decision making separately.

Bayesian Inference Decision Making +2

Modelling Bounded Rationality in Multi-Agent Interactions by Generalized Recursive Reasoning

no code implementations 26 Jan 2019 Ying Wen, Yaodong Yang, Rui Luo, Jun Wang

Though limited in real-world decision making, most multi-agent reinforcement learning (MARL) models assume perfectly rational agents -- a property hardly met due to individual's cognitive limitation and/or the tractability of the decision problem.

Decision Making Multi-agent Reinforcement Learning

Probabilistic Recursive Reasoning for Multi-Agent Reinforcement Learning

no code implementations ICLR 2019 Ying Wen, Yaodong Yang, Rui Luo, Jun Wang, Wei Pan

Our methods are tested on both the matrix game and the differential game, which have a non-trivial equilibrium where common gradient-based methods fail to converge.

Multi-agent Reinforcement Learning reinforcement-learning +1

Factorized Q-Learning for Large-Scale Multi-Agent Systems

no code implementations 11 Sep 2018 Yong Chen, Ming Zhou, Ying Wen, Yaodong Yang, Yufeng Su, Wei-Nan Zhang, Dell Zhang, Jun Wang, Han Liu

Deep Q-learning has achieved a significant success in single-agent decision making tasks.

Multiagent Systems

A Study of AI Population Dynamics with Million-agent Reinforcement Learning

no code implementations 13 Sep 2017 Yaodong Yang, Lantao Yu, Yiwei Bai, Jun Wang, Wei-Nan Zhang, Ying Wen, Yong Yu

We conduct an empirical study on discovering the ordered collective dynamics obtained by a population of intelligence agents, driven by million-agent reinforcement learning.

reinforcement-learning Reinforcement Learning (RL)

Learning to Design Games: Strategic Environments in Reinforcement Learning

no code implementations 5 Jul 2017 Haifeng Zhang, Jun Wang, Zhiming Zhou, Wei-Nan Zhang, Ying Wen, Yong Yu, Wenxin Li

In typical reinforcement learning (RL), the environment is assumed given and the goal of the learning is to identify an optimal policy for the agent taking actions through its interactions with the environment.

reinforcement-learning Reinforcement Learning (RL)

Product-based Neural Networks for User Response Prediction

11 code implementations 1 Nov 2016 Yanru Qu, Han Cai, Kan Ren, Wei-Nan Zhang, Yong Yu, Ying Wen, Jun Wang

Predicting user responses, such as clicks and conversions, is of great importance and has found its usage in many Web applications including recommender systems, web search and online advertising.

Click-Through Rate Prediction Recommendation Systems

Learning text representation using recurrent convolutional neural network with highway layers

no code implementations 22 Jun 2016 Ying Wen, Wei-Nan Zhang, Rui Luo, Jun Wang

Recently, the rapid development of word embedding and neural networks has brought new inspiration to various NLP and IR tasks.

Sentiment Analysis
