Search Results for author: Haifeng Zhang

Found 22 papers, 9 papers with code

Token-level Direct Preference Optimization

1 code implementation • 18 Apr 2024 • Yongcheng Zeng, Guoqing Liu, Weiyu Ma, Ning Yang, Haifeng Zhang, Jun Wang

Fine-tuning pre-trained Large Language Models (LLMs) is essential to align them with human values and intentions.

Learning Macroeconomic Policies based on Microfoundations: A Stackelberg Mean Field Game Approach

no code implementations • 14 Mar 2024 • Qirui Mi, Zhiyu Zhao, Siyu Xia, Yan Song, Jun Wang, Haifeng Zhang

Effective macroeconomic policies play a crucial role in promoting economic growth and social stability.

Large Language Models Play StarCraft II: Benchmarks and A Chain of Summarization Approach

1 code implementation • 19 Dec 2023 • Weiyu Ma, Qirui Mi, Xue Yan, Yuqiao Wu, Runji Lin, Haifeng Zhang, Jun Wang

StarCraft II is a challenging benchmark for AI agents due to the necessity of both precise micro-level operations and strategic macro awareness.

Language Modelling • Large Language Model +2

AI-Based Energy Transportation Safety: Pipeline Radial Threat Estimation Using Intelligent Sensing System

no code implementations • 18 Dec 2023 • Chengyuan Zhu, Yiyuan Yang, Kaixiang Yang, Haifeng Zhang, Qinmin Yang, C. L. Philip Chen

This refinement is crucial in effectively identifying genuine threats to pipelines, thus enhancing the safety of energy transportation.

Transfer Learning

Ask more, know better: Reinforce-Learned Prompt Questions for Decision Making with Large Language Models

no code implementations • 27 Oct 2023 • Xue Yan, Yan Song, Xinyu Cui, Filippos Christianos, Haifeng Zhang, David Henry Mguni, Jun Wang

To that end, we propose a new leader-follower bilevel framework that learns to ask relevant questions (prompts) and subsequently performs reasoning to guide the learning of actions.

Decision Making

Large Sequence Models for Sequential Decision-Making: A Survey

no code implementations • 24 Jun 2023 • Muning Wen, Runji Lin, Hanjing Wang, Yaodong Yang, Ying Wen, Luo Mai, Jun Wang, Haifeng Zhang, Weinan Zhang

Transformer architectures have facilitated the development of large-scale and general-purpose sequence models for prediction tasks in natural language processing and computer vision, e.g., GPT-3 and Swin Transformer.

Decision Making

An Empirical Study on Google Research Football Multi-agent Scenarios

1 code implementation • 16 May 2023 • Yan Song, He Jiang, Zheng Tian, Haifeng Zhang, Yingping Zhang, Jiangcheng Zhu, Zonghong Dai, Weinan Zhang, Jun Wang

Little multi-agent reinforcement learning (MARL) research on Google Research Football (GRF) focuses on the 11v11 multi-agent full-game scenario, and to the best of our knowledge, no open benchmark on this scenario has been released to the public.

Benchmarking • Multi-agent Reinforcement Learning +1

Contextual Transformer for Offline Meta Reinforcement Learning

no code implementations • 15 Nov 2022 • Runji Lin, Ye Li, Xidong Feng, Zhaowei Zhang, Xian Hong Wu Fung, Haifeng Zhang, Jun Wang, Yali Du, Yaodong Yang

Firstly, we propose prompt tuning for offline RL, where a context vector sequence is concatenated with the input to guide the conditional policy generation.

D4RL • Meta Reinforcement Learning +4
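The first contribution above — concatenating a learnable context vector sequence with the model input — can be sketched minimally. Everything here (dimensions, variable names, the use of NumPy) is an illustrative assumption, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, prompt_len, seq_len = 8, 3, 5

# Learnable context vectors (the "prompt"); under prompt tuning these
# would typically be the only parameters updated during adaptation.
prompt = rng.normal(size=(prompt_len, d_model))

def policy_input(trajectory_embedding):
    """Prepend the context sequence to the embedded trajectory before
    it is fed to the conditional policy's sequence model."""
    return np.concatenate([prompt, trajectory_embedding], axis=0)

x = rng.normal(size=(seq_len, d_model))
z = policy_input(x)  # shape: (prompt_len + seq_len, d_model)
```

The frozen sequence model then conditions on the prompt tokens exactly as it conditions on the trajectory itself.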

Learning to Identify Top Elo Ratings: A Dueling Bandits Approach

1 code implementation • 12 Jan 2022 • Xue Yan, Yali Du, Binxin Ru, Jun Wang, Haifeng Zhang, Xu Chen

The Elo rating system is widely adopted to evaluate the skills of players in games (e.g., chess) and sports.
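As background for this entry, the logistic Elo update that the rating system rests on can be sketched in a few lines. The K-factor of 32 and the ratings below are common illustrative conventions, not values from the paper:

```python
def elo_update(r_a, r_b, score_a, k=32.0):
    """Return updated Elo ratings after one game.

    score_a is 1.0 if player A wins, 0.5 for a draw, 0.0 if A loses.
    """
    # Expected score of A under the logistic Elo model (400-point scale)
    expected_a = 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))
    delta = k * (score_a - expected_a)
    return r_a + delta, r_b - delta

# A 1600-rated player beats a 1500-rated player: A gains what B loses
new_a, new_b = elo_update(1600.0, 1500.0, 1.0)
```

Note the update is zero-sum, so the total rating pool is conserved; identifying the top-rated player from noisy pairwise outcomes is what motivates the dueling-bandits view.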

A Theoretical Understanding of Gradient Bias in Meta-Reinforcement Learning

1 code implementation • 31 Dec 2021 • Xidong Feng, Bo Liu, Jie Ren, Luo Mai, Rui Zhu, Haifeng Zhang, Jun Wang, Yaodong Yang

Gradient-based Meta-RL (GMRL) refers to methods that maintain two-level optimisation procedures wherein the outer-loop meta-learner guides the inner-loop gradient-based reinforcement learner to achieve fast adaptations.

Atari Games • Meta Reinforcement Learning +3
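The two-level optimisation described above can be illustrated on a toy problem: an outer-loop meta-learner adapts the inner learning rate so that a single inner-loop gradient step minimises the post-update loss. The quadratic objective, step counts, and learning rates are illustrative assumptions only:

```python
def inner_grad(theta):
    # Gradient of the inner loss f(theta) = (theta - 3)^2
    return 2.0 * (theta - 3.0)

def meta_train(steps=50, theta0=0.0, eta=0.01, meta_lr=0.005):
    """Outer loop: adapt eta by descending the analytic meta-gradient
    of the post-update loss L(theta') with theta' = theta0 - eta * grad."""
    for _ in range(steps):
        theta_new = theta0 - eta * inner_grad(theta0)   # inner-loop step
        d_outer = 2.0 * (theta_new - 3.0)               # dL/d theta'
        d_eta = -inner_grad(theta0)                     # d theta' / d eta
        eta -= meta_lr * d_outer * d_eta                # outer-loop step
    return eta, theta0 - eta * inner_grad(theta0)

eta, theta_after = meta_train()
# eta converges toward 0.5, the step size that lands exactly on the optimum
```

In real GMRL the inner learner is a reinforcement-learning update and the meta-gradient must be estimated from samples rather than computed analytically, which is where the bias analysed in the paper arises.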

Offline Pre-trained Multi-Agent Decision Transformer: One Big Sequence Model Tackles All SMAC Tasks

1 code implementation • 6 Dec 2021 • Linghui Meng, Muning Wen, Yaodong Yang, Chenyang Le, Xiyun Li, Weinan Zhang, Ying Wen, Haifeng Zhang, Jun Wang, Bo Xu

In this paper, we facilitate the research by providing large-scale datasets, and use them to examine the usage of the Decision Transformer in the context of MARL.

Offline RL • reinforcement-learning +4

A Game-Theoretic Approach for Improving Generalization Ability of TSP Solvers

no code implementations • 28 Oct 2021 • Chenguang Wang, Yaodong Yang, Oliver Slumbers, Congying Han, Tiande Guo, Haifeng Zhang, Jun Wang

In this paper, we introduce a two-player zero-sum framework between a trainable \emph{Solver} and a \emph{Data Generator} to improve the generalization ability of deep learning-based solvers for the Traveling Salesman Problem (TSP).

Traveling Salesman Problem

Settling the Variance of Multi-Agent Policy Gradients

1 code implementation • NeurIPS 2021 • Jakub Grudzien Kuba, Muning Wen, Yaodong Yang, Linghui Meng, Shangding Gu, Haifeng Zhang, David Henry Mguni, Jun Wang

In multi-agent RL (MARL), although the PG theorem can be naturally extended, the effectiveness of multi-agent PG (MAPG) methods degrades as the variance of gradient estimates increases rapidly with the number of agents.

Reinforcement Learning (RL) • Starcraft

Learning Predictive Communication by Imagination in Networked System Control

no code implementations • 1 Jan 2021 • Yali Du, Yifan Zhao, Meng Fang, Jun Wang, Gangyan Xu, Haifeng Zhang

Dealing with multi-agent control in networked systems is one of the biggest challenges in Reinforcement Learning (RL), and only limited success has been achieved compared to recent deep reinforcement learning in the single-agent domain.

reinforcement-learning • Reinforcement Learning (RL)

Improving Knowledge Tracing via Pre-training Question Embeddings

1 code implementation • 9 Dec 2020 • Yunfei Liu, Yang Yang, Xianyu Chen, Jian Shen, Haifeng Zhang, Yong Yu

Knowledge tracing (KT) defines the task of predicting whether students can correctly answer questions based on their historical responses.

Knowledge Tracing

Signal Instructed Coordination in Cooperative Multi-agent Reinforcement Learning

no code implementations • 10 Sep 2019 • Liheng Chen, Hongyi Guo, Yali Du, Fei Fang, Haifeng Zhang, Yaoming Zhu, Ming Zhou, Wei-Nan Zhang, Qing Wang, Yong Yu

Although existing works formulate this problem within a centralized-learning, decentralized-execution framework, which avoids the non-stationarity problem in training, their decentralized execution paradigm limits the agents' capability to coordinate.

Multi-agent Reinforcement Learning • reinforcement-learning +1

Bi-level Actor-Critic for Multi-agent Coordination

1 code implementation • 8 Sep 2019 • Haifeng Zhang, Weizhe Chen, Zeren Huang, Minne Li, Yaodong Yang, Wei-Nan Zhang, Jun Wang

Coordination is one of the essential problems in multi-agent systems.

Multiagent Systems

Layout Design for Intelligent Warehouse by Evolution with Fitness Approximation

no code implementations • 14 Nov 2018 • Haifeng Zhang, Zilong Guo, Han Cai, Chris Wang, Wei-Nan Zhang, Yong Yu, Wenxin Li, Jun Wang

With the rapid growth of the express industry, intelligent warehouses that employ autonomous robots for carrying parcels have been widely used to handle the vast express volume.

Layout Design

Learning to Design Games: Strategic Environments in Reinforcement Learning

no code implementations • 5 Jul 2017 • Haifeng Zhang, Jun Wang, Zhiming Zhou, Wei-Nan Zhang, Ying Wen, Yong Yu, Wenxin Li

In typical reinforcement learning (RL), the environment is assumed given, and the goal of learning is to identify an optimal policy for the agent taking actions through its interactions with the environment.

reinforcement-learning • Reinforcement Learning (RL)

Empirically Grounded Agent-Based Models of Innovation Diffusion: A Critical Review

no code implementations • 30 Aug 2016 • Haifeng Zhang, Yevgeniy Vorobeychik

Innovation diffusion has been studied extensively in a variety of disciplines, including sociology, economics, marketing, ecology, and computer science.

Marketing • Sociology
