Search Results for author: Ying Wen

Found 74 papers, 40 papers with code

A Survey of AI Agent Protocols

no code implementations 23 Apr 2025 Yingxuan Yang, Huacan Chai, Yuanyi Song, Siyuan Qi, Muning Wen, Ning li, Junwei Liao, Haoyi Hu, Jianghao Lin, Gaowei Chang, Weiwen Liu, Ying Wen, Yong Yu, Weinan Zhang

We expect this work to serve as a practical reference for both researchers and engineers seeking to design, evaluate, or integrate robust communication infrastructures for intelligent agents.

AI Agent Survey

AFiRe: Anatomy-Driven Self-Supervised Learning for Fine-Grained Representation in Radiographic Images

1 code implementation 15 Apr 2025 Yihang Liu, Lianghua He, Ying Wen, Longzhen Yang, Hongzhou Chen

Current self-supervised methods, such as contrastive learning, predominantly focus on global discrimination, neglecting the critical fine-grained anatomical details required for accurate radiographic analysis.

Anatomy Anomaly Detection +4

ReMA: Learning to Meta-think for LLMs with Multi-Agent Reinforcement Learning

1 code implementation 12 Mar 2025 Ziyu Wan, Yunxiang Li, Xiaoyu Wen, Yan Song, Hanjing Wang, Linyi Yang, Mark Schmidt, Jun Wang, Weinan Zhang, Shuyue Hu, Ying Wen

To address this challenge, we introduce Reinforced Meta-thinking Agents (ReMA), a novel framework that leverages Multi-Agent Reinforcement Learning (MARL) to elicit meta-thinking behaviors, encouraging LLMs to think about thinking.

Multi-agent Reinforcement Learning reinforcement-learning +1

PMAT: Optimizing Action Generation Order in Multi-Agent Reinforcement Learning

1 code implementation 23 Feb 2025 Kun Hu, Muning Wen, Xihuai Wang, Shao Zhang, Yiwei Shi, Minne Li, Minglong Li, Ying Wen

In this paper, we introduce Action Generation with Plackett-Luce Sampling (AGPS), a novel mechanism for agent decision order optimization.

Action Generation Decision Making +8

ThinkBench: Dynamic Out-of-Distribution Evaluation for Robust LLM Reasoning

no code implementations 22 Feb 2025 Shulin Huang, Linyi Yang, Yan Song, Shuang Chen, Leyang Cui, Ziyu Wan, Qingcheng Zeng, Ying Wen, Kun Shao, Weinan Zhang, Jun Wang, Yue Zhang

Evaluating large language models (LLMs) poses significant challenges, particularly due to issues of data contamination and the leakage of correct answers.

Retrieval-Augmented Process Reward Model for Generalizable Mathematical Reasoning

no code implementations 20 Feb 2025 Jiachen Zhu, Congmin Zheng, Jianghao Lin, Kounianhua Du, Ying Wen, Yong Yu, Jun Wang, Weinan Zhang

By utilizing a two-stage retrieval-enhanced mechanism, RetrievalPRM retrieves semantically similar questions and steps as a warmup, enhancing PRM's ability to evaluate target steps and improving generalization and reasoning consistency across different models and problem types.

Mathematical Reasoning Retrieval

Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration

1 code implementation 17 Feb 2025 Shao Zhang, Xihuai Wang, WenHao Zhang, Chaoran Li, Junru Song, Tingyu Li, Lin Qiu, Xuezhi Cao, Xunliang Cai, Wen Yao, Weinan Zhang, Xinbing Wang, Ying Wen

We propose DPT-Agent, a novel language agent framework that integrates System 1 and System 2 for efficient real-time simultaneous human-AI collaboration.

AT-Drone: Benchmarking Adaptive Teaming in Multi-Drone Pursuit

no code implementations 13 Feb 2025 Yang Li, Junfan Chen, Feng Xue, Jiabin Qiu, Wenbin Li, Qingrui Zhang, Ying Wen, Wei Pan

To address this gap, we introduce AT-Drone, the first dedicated benchmark explicitly designed to facilitate comprehensive training and evaluation of adaptive teaming strategies in multi-drone pursuit scenarios.

Benchmarking Edge-computing

Language Games as the Pathway to Artificial Superhuman Intelligence

no code implementations 31 Jan 2025 Ying Wen, Ziyu Wan, Shao Zhang

The evolution of large language models (LLMs) toward artificial superhuman intelligence (ASI) hinges on data reproduction, a cyclical process in which models generate, curate and retrain on novel data to refine capabilities.

Diversity

RAT: Adversarial Attacks on Deep Reinforcement Agents for Targeted Behaviors

no code implementations 14 Dec 2024 Fengshuo Bai, Runze Liu, Yali Du, Ying Wen, Yaodong Yang

RAT trains an intention policy that is explicitly aligned with human preferences, serving as a precise behavioral target for the adversary.

Adversarial Attack Deep Reinforcement Learning +1

KaLM: Knowledge-aligned Autoregressive Language Modeling via Dual-view Knowledge Graph Contrastive Learning

no code implementations 6 Dec 2024 Peng Yu, Cheng Deng, Beiya Dai, Xinbing Wang, Ying Wen

This paper proposes KaLM, a Knowledge-aligned Language Modeling approach, which fine-tunes autoregressive LLMs to align with KG knowledge via the joint objective of explicit knowledge alignment and implicit knowledge alignment.

Contrastive Learning Graph Question Answering +3

LLM-based Multi-Agent Systems: Techniques and Business Perspectives

no code implementations 21 Nov 2024 Yingxuan Yang, Qiuying Peng, Jun Wang, Ying Wen, Weinan Zhang

In the era of (multi-modal) large language models, most operational processes can be reformulated and reproduced using LLM agents.

Natural Language Reinforcement Learning

1 code implementation 21 Nov 2024 Xidong Feng, Bo Liu, Ziyu Wan, Haotian Fu, Girish A. Koushik, Zhiyuan Hu, Mengyue Yang, Ying Wen, Jun Wang

Reinforcement Learning (RL) mathematically formulates decision-making with Markov Decision Process (MDP).

Decision Making reinforcement-learning +2

OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models

1 code implementation 12 Oct 2024 Jun Wang, Meng Fang, Ziyu Wan, Muning Wen, Jiachen Zhu, Anjie Liu, Ziqin Gong, Yan Song, Lei Chen, Lionel M. Ni, Linyi Yang, Ying Wen, Weinan Zhang

Inspired by the success of OpenAI's o1 model, which demonstrated improved reasoning abilities through step-by-step reasoning and reinforcement learning, OpenR integrates test-time compute, reinforcement learning, and process supervision to improve reasoning in LLMs.

Math reinforcement-learning +1

Computing Ex Ante Equilibrium in Heterogeneous Zero-Sum Team Games

no code implementations 2 Oct 2024 Naming Liu, Mingzhi Wang, Xihuai Wang, Weinan Zhang, Yaodong Yang, Youzhi Zhang, Bo An, Ying Wen

Such insufficient policy expressiveness causes Team PSRO to become trapped in a sub-optimal ex ante equilibrium with significantly higher exploitability, so that it never converges to the global ex ante equilibrium.

HOLA-Drone: Hypergraphic Open-ended Learning for Zero-Shot Multi-Drone Cooperative Pursuit

no code implementations 13 Sep 2024 Yang Li, Dengyu Zhang, Junfan Chen, Ying Wen, Qingrui Zhang, Shaoshuai Mou, Wei Pan

In this paper, we extend the scope of ZSC research to the multi-drone cooperative pursuit scenario, exploring how to construct a drone agent capable of coordinating with multiple unseen partners to capture multiple evaders.

Fusion-PSRO: Nash Policy Fusion for Policy Space Response Oracles

no code implementations 31 May 2024 Jiesong Lian, Yucong Huang, Chengdong Ma, Mingzhi Wang, Ying Wen, Long Hu, Yixue Hao

For solving zero-sum games involving non-transitivity, a useful approach is to maintain a policy population to approximate the Nash Equilibrium (NE).

Multi-agent Reinforcement Learning

Efficient Model-agnostic Alignment via Bayesian Persuasion

no code implementations 29 May 2024 Fengshuo Bai, Mingzhi Wang, Zhaowei Zhang, Boyuan Chen, Yinda Xu, Ying Wen, Yaodong Yang

This paper explores an efficient method for aligning black-box large models using smaller models, introducing a model-agnostic and lightweight Bayesian Persuasion Alignment framework.

Code Generation Mathematical Reasoning +1

Efficient Preference-based Reinforcement Learning via Aligned Experience Estimation

no code implementations 29 May 2024 Fengshuo Bai, Rui Zhao, Hongming Zhang, Sijia Cui, Ying Wen, Yaodong Yang, Bo Xu, Lei Han

To boost the learning loop, we propose SEER, an efficient PbRL method that integrates label smoothing and policy regularization techniques.

reinforcement-learning Reinforcement Learning

TRAD: Enhancing LLM Agents with Step-Wise Thought Retrieval and Aligned Decision

1 code implementation 10 Mar 2024 Ruiwen Zhou, Yingxuan Yang, Muning Wen, Ying Wen, Wenhao Wang, Chunling Xi, Guoqiang Xu, Yong Yu, Weinan Zhang

Among these works, many of them utilize in-context examples to achieve generalization without the need for fine-tuning, while few of them have considered the problem of how to select and effectively utilize these examples.

Language Modelling Large Language Model +2

Offline Fictitious Self-Play for Competitive Games

no code implementations 29 Feb 2024 Jingxiao Chen, Weiji Xie, Weinan Zhang, Yong Yu, Ying Wen

Firstly, unaware of the game structure, it is impossible to interact with the opponents and conduct a major learning paradigm, self-play, for competitive games.

Offline RL Reinforcement Learning (RL)

DS-Agent: Automated Data Science by Empowering Large Language Models with Case-Based Reasoning

1 code implementation 27 Feb 2024 Siyuan Guo, Cheng Deng, Ying Wen, Hechang Chen, Yi Chang, Jun Wang

In this work, we investigate the potential of large language models (LLMs) based agents to automate data science tasks, with the goal of comprehending task requirements, then building and training the best-fit machine learning models.

Code Generation

Aligning Individual and Collective Objectives in Multi-Agent Cooperation

no code implementations 19 Feb 2024 Yang Li, WenHao Zhang, Jianhong Wang, Shao Zhang, Yali Du, Ying Wen, Wei Pan

Among the research topics in multi-agent learning, mixed-motive cooperation is one of the most prominent challenges, primarily due to the mismatch between individual and collective goals.

SMAC+ Starcraft +1

Natural Language Reinforcement Learning

no code implementations 11 Feb 2024 Xidong Feng, Ziyu Wan, Mengyue Yang, Ziyan Wang, Girish A. Koushik, Yali Du, Ying Wen, Jun Wang

Reinforcement Learning (RL) has shown remarkable abilities in learning policies for decision-making tasks.

Decision Making reinforcement-learning +2

Entropy-Regularized Token-Level Policy Optimization for Language Agent Reinforcement

1 code implementation 9 Feb 2024 Muning Wen, Junwei Liao, Cheng Deng, Jun Wang, Weinan Zhang, Ying Wen

We assess the effectiveness of ETPO within a simulated environment that models data science code generation as a series of multi-step interactive tasks; results underline ETPO's potential as a robust method for refining the interactive decision-making capabilities of language agents.

Code Generation Decision Making +4

IIRP-Net: Iterative Inference Residual Pyramid Network for Enhanced Image Registration

1 code implementation CVPR 2024 Tai Ma, Suwei Zhang, Jiafeng Li, Ying Wen

We conduct extensive experiments on the FLARE and Mindboggle datasets, and the results verify the effectiveness of the proposed method, which outperforms state-of-the-art deformable image registration methods.

Image Registration

Adaptive Control Strategy for Quadruped Robots in Actuator Degradation Scenarios

1 code implementation 29 Dec 2023 Xinyuan Wu, Wentao Dong, Hang Lai, Yong Yu, Ying Wen

Quadruped robots have strong adaptability to extreme environments but may also experience faults.

Critic-Guided Decision Transformer for Offline Reinforcement Learning

1 code implementation 21 Dec 2023 Yuanfu Wang, Chao Yang, Ying Wen, Yu Liu, Yu Qiao

Recent advancements in offline reinforcement learning (RL) have underscored the capabilities of Return-Conditioned Supervised Learning (RCSL), a paradigm that learns the action distribution based on target returns for each state in a supervised manner.

D4RL Offline RL +4

Controlling Large Language Model-based Agents for Large-Scale Decision-Making: An Actor-Critic Approach

no code implementations 23 Nov 2023 Bin Zhang, Hangyu Mao, Jingqing Ruan, Ying Wen, Yang Li, Shao Zhang, Zhiwei Xu, Dapeng Li, Ziyue Li, Rui Zhao, Lijuan Li, Guoliang Fan

The remarkable progress in Large Language Models (LLMs) opens up new avenues for addressing planning and decision-making problems in Multi-Agent Systems (MAS).

Decision Making Hallucination +4

ZSC-Eval: An Evaluation Toolkit and Benchmark for Multi-agent Zero-shot Coordination

2 code implementations 8 Oct 2023 Xihuai Wang, Shao Zhang, WenHao Zhang, Wentao Dong, Jingxiao Chen, Ying Wen, Weinan Zhang

ZSC-Eval consists of: 1) Generation of evaluation partner candidates through behavior-preferring rewards to approximate deployment-time partners' distribution; 2) Selection of evaluation partners by Best-Response Diversity (BR-Div); 3) Measurement of generalization performance with various evaluation partners via the Best-Response Proximity (BR-Prox) metric.

Diversity Multi-agent Reinforcement Learning

GEAR: A GPU-Centric Experience Replay System for Large Reinforcement Learning Models

1 code implementation 8 Oct 2023 Hanjing Wang, Man-Kit Sit, Congjie He, Ying Wen, Weinan Zhang, Jun Wang, Yaodong Yang, Luo Mai

This paper introduces a distributed, GPU-centric experience replay system, GEAR, designed to perform scalable reinforcement learning (RL) with large sequence models (such as transformers).

Reinforcement Learning (RL)

Alphazero-like Tree-Search can Guide Large Language Model Decoding and Training

1 code implementation 29 Sep 2023 Xidong Feng, Ziyu Wan, Muning Wen, Stephen Marcus McAleer, Ying Wen, Weinan Zhang, Jun Wang

Empirical results across reasoning, planning, alignment, and decision-making tasks show that TS-LLM outperforms existing approaches and can handle trees with a depth of 64.

Decision Making Language Modeling +2

Cross-Utterance Conditioned VAE for Speech Generation

no code implementations 8 Sep 2023 Yang Li, Cheng Yu, Guangzhi Sun, Weiqin Zu, Zheng Tian, Ying Wen, Wei Pan, Chao Zhang, Jun Wang, Yang Yang, Fanglei Sun

Experimental results on the LibriTTS datasets demonstrate that our proposed models significantly enhance speech synthesis and editing, producing more natural and expressive speech.

Speech Synthesis text-to-speech +1

Large Sequence Models for Sequential Decision-Making: A Survey

no code implementations 24 Jun 2023 Muning Wen, Runji Lin, Hanjing Wang, Yaodong Yang, Ying Wen, Luo Mai, Jun Wang, Haifeng Zhang, Weinan Zhang

Transformer architectures have facilitated the development of large-scale and general-purpose sequence models for prediction tasks in natural language processing and computer vision, e.g., GPT-3 and Swin Transformer.

Decision Making Sequential Decision Making +1

Tackling Cooperative Incompatibility for Zero-Shot Human-AI Coordination

1 code implementation 5 Jun 2023 Yang Li, Shao Zhang, Jichen Sun, WenHao Zhang, Yali Du, Ying Wen, Xinbing Wang, Wei Pan

In order to solve cooperative incompatibility in learning and effectively address the problem in the context of ZSC, we introduce the Cooperative Open-ended LEarning (COLE) framework, which formulates open-ended objectives in cooperative games with two players using perspectives of graph theory to evaluate and pinpoint the cooperative capacity of each strategy.

AI Agent

Order Matters: Agent-by-agent Policy Optimization

1 code implementation 13 Feb 2023 Xihuai Wang, Zheng Tian, Ziyu Wan, Ying Wen, Jun Wang, Weinan Zhang

In this paper, we propose the Agent-by-agent Policy Optimization (A2PO) algorithm to improve the sample efficiency and retain the guarantees of monotonic improvement for each agent during training.

MuJoCo

Cooperative Open-ended Learning Framework for Zero-shot Coordination

2 code implementations 9 Feb 2023 Yang Li, Shao Zhang, Jichen Sun, Yali Du, Ying Wen, Xinbing Wang, Wei Pan

However, these approaches can result in a loss of learning and an inability to cooperate with certain strategies within the population, known as cooperative incompatibility.

Diversity

SCConv: Spatial and Channel Reconstruction Convolution for Feature Redundancy

1 code implementation CVPR 2023 Jiafeng Li, Ying Wen, Lianghua He

The proposed SCConv consists of two units: spatial reconstruction unit (SRU) and channel reconstruction unit (CRU).

On Realization of Intelligent Decision-Making in the Real World: A Foundation Decision Model Perspective

1 code implementation 24 Dec 2022 Ying Wen, Ziyu Wan, Ming Zhou, Shufang Hou, Zhe Cao, Chenyang Le, Jingxiao Chen, Zheng Tian, Weinan Zhang, Jun Wang

The pervasive uncertainty and dynamic nature of real-world environments present significant challenges for the widespread implementation of machine-driven Intelligent Decision-Making (IDM) systems.

Decision Making Image Captioning +2

KnowledgeShovel: An AI-in-the-Loop Document Annotation System for Scientific Knowledge Base Construction

1 code implementation 6 Oct 2022 Shao Zhang, Yuting Jia, Hui Xu, Dakuo Wang, Toby Jia-Jun Li, Ying Wen, Xinbing Wang, Chenghu Zhou

Constructing a comprehensive, accurate, and useful scientific knowledge base is crucial for human researchers synthesizing scientific knowledge and for enabling AI-driven scientific discovery.

Knowledge Base Construction scientific discovery

Multi-Agent Reinforcement Learning is a Sequence Modeling Problem

1 code implementation 30 May 2022 Muning Wen, Jakub Grudzien Kuba, Runji Lin, Weinan Zhang, Ying Wen, Jun Wang, Yaodong Yang

In this paper, we introduce a novel architecture named Multi-Agent Transformer (MAT) that effectively casts cooperative multi-agent reinforcement learning (MARL) into SM problems wherein the task is to map agents' observation sequence to agents' optimal action sequence.

Decision Making MuJoCo +5

Multi-Agent Feedback Enabled Neural Networks for Intelligent Communications

1 code implementation 22 May 2022 Fanglei Sun, Yang Li, Ying Wen, Jingchen Hu, Jun Wang, Yang Yang, Kai Li

The MAFENN framework and algorithm are designed to enhance the learning capability of feedforward DL networks, or their variations, through simple data feedback.

Denoising Intelligent Communication

Cross-Utterance Conditioned VAE for Non-Autoregressive Text-to-Speech

1 code implementation ACL 2022 Yang Li, Cheng Yu, Guangzhi Sun, Hua Jiang, Fanglei Sun, Weiqin Zu, Ying Wen, Yang Yang, Jun Wang

Modelling prosody variation is critical for synthesizing natural and expressive speech in end-to-end text-to-speech (TTS) systems.

Diversity text-to-speech +1

DeepShovel: An Online Collaborative Platform for Data Extraction in Geoscience Literature with AI Assistance

no code implementations 21 Feb 2022 Shao Zhang, Yuting Jia, Hui Xu, Ying Wen, Dakuo Wang, Xinbing Wang

Geoscientists, as well as researchers in many fields, need to read a huge amount of literature to locate, extract, and aggregate relevant results and data to enable future research or to build a scientific database, but there is no existing system to support this use case well.

Efficient Policy Space Response Oracles

no code implementations 28 Jan 2022 Ming Zhou, Jingxiao Chen, Ying Wen, Weinan Zhang, Yaodong Yang, Yong Yu, Jun Wang

Policy Space Response Oracle methods (PSRO) provide a general solution to learn Nash equilibrium in two-player zero-sum games but suffer from two drawbacks: (1) the computation inefficiency due to the need for consistent meta-game evaluation via simulations, and (2) the exploration inefficiency due to finding the best response against a fixed meta-strategy at every epoch.

Efficient Exploration

Offline Pre-trained Multi-Agent Decision Transformer: One Big Sequence Model Tackles All SMAC Tasks

1 code implementation 6 Dec 2021 Linghui Meng, Muning Wen, Yaodong Yang, Chenyang Le, Xiyun Li, Weinan Zhang, Ying Wen, Haifeng Zhang, Jun Wang, Bo Xu

In this paper, we facilitate the research by providing large-scale datasets, and use them to examine the usage of the Decision Transformer in the context of MARL.

All Offline RL +5

Neural Auto-Curricula in Two-Player Zero-Sum Games

1 code implementation NeurIPS 2021 Xidong Feng, Oliver Slumbers, Ziyu Wan, Bo Liu, Stephen Mcaleer, Ying Wen, Jun Wang, Yaodong Yang

When solving two-player zero-sum games, multi-agent reinforcement learning (MARL) algorithms often create populations of agents where, at each iteration, a new agent is discovered as the best response to a mixture over the opponent population.

Multi-agent Reinforcement Learning Vocal Bursts Valence Prediction

Towards Unifying Behavioral and Response Diversity for Open-ended Learning in Zero-sum Games

1 code implementation NeurIPS 2021 Xiangyu Liu, Hangtian Jia, Ying Wen, Yaodong Yang, Yujing Hu, Yingfeng Chen, Changjie Fan, Zhipeng Hu

With this unified diversity measure, we design the corresponding diversity-promoting objective and population effectivity when seeking the best responses in open-ended learning.

Diversity

A Game-Theoretic Approach to Multi-Agent Trust Region Optimization

1 code implementation 12 Jun 2021 Ying Wen, Hui Chen, Yaodong Yang, Zheng Tian, Minne Li, Xu Chen, Jun Wang

Trust region methods are widely applied in single-agent reinforcement learning problems due to their monotonic performance-improvement guarantee at every iteration.

Atari Games MuJoCo +4

Unifying Behavioral and Response Diversity for Open-ended Learning in Zero-sum Games

no code implementations 9 Jun 2021 Xiangyu Liu, Hangtian Jia, Ying Wen, Yaodong Yang, Yujing Hu, Yingfeng Chen, Changjie Fan, Zhipeng Hu

With this unified diversity measure, we design the corresponding diversity-promoting objective and population effectivity when seeking the best responses in open-ended learning.

Diversity

MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning

1 code implementation 5 Jun 2021 Ming Zhou, Ziyu Wan, Hanjing Wang, Muning Wen, Runzhe Wu, Ying Wen, Yaodong Yang, Weinan Zhang, Jun Wang

Our framework is comprised of three key components: (1) a centralized task dispatching model, which supports the self-generated tasks and scalable training with heterogeneous policy combinations; (2) a programming architecture named Actor-Evaluator-Learner, which achieves high parallelism for both training and sampling, and meets the evaluation requirement of auto-curriculum learning; (3) a higher-level abstraction of MARL training paradigms, which enables efficient code reuse and flexible deployments on different distributed computing paradigms.

Atari Games Distributed Computing +4

Neural Auto-Curricula

1 code implementation 4 Jun 2021 Xidong Feng, Oliver Slumbers, Ziyu Wan, Bo Liu, Stephen Mcaleer, Ying Wen, Jun Wang, Yaodong Yang

When solving two-player zero-sum games, multi-agent reinforcement learning (MARL) algorithms often create populations of agents where, at each iteration, a new agent is discovered as the best response to a mixture over the opponent population.

Multi-agent Reinforcement Learning

Modelling Behavioural Diversity for Learning in Open-Ended Games

3 code implementations 14 Mar 2021 Nicolas Perez Nieves, Yaodong Yang, Oliver Slumbers, David Henry Mguni, Ying Wen, Jun Wang

Promoting behavioural diversity is critical for solving games with non-transitive dynamics where strategic cycles exist, and there is no consistent winner (e.g., Rock-Paper-Scissors).

Diversity Point Processes

Multi-Agent Trust Region Learning

1 code implementation 1 Jan 2021 Ying Wen, Hui Chen, Yaodong Yang, Zheng Tian, Minne Li, Xu Chen, Jun Wang

We derive the lower bound of agents' payoff improvements for MATRL methods, and also prove the convergence of our method on the meta-game fixed points.

Atari Games MuJoCo +4

Convolutional Neural Network optimization via Channel Reassessment Attention module

no code implementations 12 Oct 2020 YuTao Shen, Ying Wen

The performance of convolutional neural networks (CNNs) can be improved by adjusting the interrelationship between channels with attention mechanism.

Multi-Agent Determinantal Q-Learning

1 code implementation ICML 2020 Yaodong Yang, Ying Wen, Li-Heng Chen, Jun Wang, Kun Shao, David Mguni, Wei-Nan Zhang

Though practical, current methods rely on restrictive assumptions to decompose the centralized value function across agents for execution.

Q-Learning

Segmenting Medical MRI via Recurrent Decoding Cell

1 code implementation 21 Nov 2019 Ying Wen, Kai Xie, Lianghua He

The encoder-decoder networks are commonly used in medical image segmentation due to their remarkable performance in hierarchical feature fusion.

Decoder Image Segmentation +2

A Regularized Opponent Model with Maximum Entropy Objective

1 code implementation 17 May 2019 Zheng Tian, Ying Wen, Zhichen Gong, Faiz Punakkath, Shihao Zou, Jun Wang

In a single-agent setting, reinforcement learning (RL) tasks can be cast into an inference problem by introducing a binary random variable o, which stands for the "optimality".

model Multi-agent Reinforcement Learning +3

Joint Perception and Control as Inference with an Object-based Implementation

no code implementations 4 Mar 2019 Minne Li, Zheng Tian, Pranav Nashikkar, Ian Davies, Ying Wen, Jun Wang

Existing model-based reinforcement learning methods often study perception modeling and decision making separately.

Bayesian Inference Decision Making +2

Probabilistic Recursive Reasoning for Multi-Agent Reinforcement Learning

no code implementations ICLR 2019 Ying Wen, Yaodong Yang, Rui Luo, Jun Wang, Wei Pan

Our methods are tested on both the matrix game and the differential game, which have a non-trivial equilibrium where common gradient-based methods fail to converge.

Multi-agent Reinforcement Learning reinforcement-learning +2

Modelling Bounded Rationality in Multi-Agent Interactions by Generalized Recursive Reasoning

1 code implementation 26 Jan 2019 Ying Wen, Yaodong Yang, Rui Luo, Jun Wang

Though limited in real-world decision making, most multi-agent reinforcement learning (MARL) models assume perfectly rational agents -- a property hardly met due to individual's cognitive limitation and/or the tractability of the decision problem.

Decision Making Multi-agent Reinforcement Learning +1

Factorized Q-Learning for Large-Scale Multi-Agent Systems

no code implementations 11 Sep 2018 Yong Chen, Ming Zhou, Ying Wen, Yaodong Yang, Yufeng Su, Wei-Nan Zhang, Dell Zhang, Jun Wang, Han Liu

Deep Q-learning has achieved significant success in single-agent decision-making tasks.

Multiagent Systems

A Study of AI Population Dynamics with Million-agent Reinforcement Learning

no code implementations 13 Sep 2017 Yaodong Yang, Lantao Yu, Yiwei Bai, Jun Wang, Wei-Nan Zhang, Ying Wen, Yong Yu

We conduct an empirical study on discovering the ordered collective dynamics obtained by a population of intelligent agents, driven by million-agent reinforcement learning.

Deep Reinforcement Learning reinforcement-learning +1

Learning to Design Games: Strategic Environments in Reinforcement Learning

no code implementations 5 Jul 2017 Haifeng Zhang, Jun Wang, Zhiming Zhou, Wei-Nan Zhang, Ying Wen, Yong Yu, Wenxin Li

In typical reinforcement learning (RL), the environment is assumed given and the goal of the learning is to identify an optimal policy for the agent taking actions through its interactions with the environment.

Game Design reinforcement-learning +2

Product-based Neural Networks for User Response Prediction

11 code implementations 1 Nov 2016 Yanru Qu, Han Cai, Kan Ren, Wei-Nan Zhang, Yong Yu, Ying Wen, Jun Wang

Predicting user responses, such as clicks and conversions, is of great importance and has found its usage in many Web applications including recommender systems, web search and online advertising.

Click-Through Rate Prediction Prediction +1

Learning text representation using recurrent convolutional neural network with highway layers

no code implementations 22 Jun 2016 Ying Wen, Wei-Nan Zhang, Rui Luo, Jun Wang

Recently, the rapid development of word embedding and neural networks has brought new inspiration to various NLP and IR tasks.

Sentiment Analysis
