Search Results for author: Jiangcheng Zhu

Found 11 papers, 4 papers with code

Yi: Open Foundation Models by 01.AI

1 code implementation • 7 Mar 2024 • 01.AI, Alex Young, Bei Chen, Chao Li, Chengen Huang, Ge Zhang, Guanwei Zhang, Heng Li, Jiangcheng Zhu, Jianqun Chen, Jing Chang, Kaidong Yu, Peng Liu, Qiang Liu, Shawn Yue, Senbin Yang, Shiming Yang, Tao Yu, Wen Xie, Wenhao Huang, Xiaohui Hu, Xiaoyi Ren, Xinyao Niu, Pengcheng Nie, Yuchi Xu, Yudong Liu, Yue Wang, Yuxuan Cai, Zhenyu Gu, Zhiyuan Liu, Zonghong Dai

The Yi model family is based on 6B and 34B pretrained language models, which we then extend to chat models, 200K long-context models, depth-upscaled models, and vision-language models.

Attribute Chatbot +2

JiangJun: Mastering Xiangqi by Tackling Non-Transitivity in Two-Player Zero-Sum Games

no code implementations • 9 Aug 2023 • Yang Li, Kun Xiong, Yingping Zhang, Jiangcheng Zhu, Stephen McAleer, Wei Pan, Jun Wang, Zonghong Dai, Yaodong Yang

This paper presents an empirical exploration of non-transitivity in perfect-information games, specifically focusing on Xiangqi, a traditional Chinese board game comparable in game-tree complexity to chess and shogi.

An Empirical Study on Google Research Football Multi-agent Scenarios

1 code implementation • 16 May 2023 • Yan Song, He Jiang, Zheng Tian, Haifeng Zhang, Yingping Zhang, Jiangcheng Zhu, Zonghong Dai, Weinan Zhang, Jun Wang

Few multi-agent reinforcement learning (MARL) studies on Google Research Football (GRF) focus on the 11v11 multi-agent full-game scenario, and to the best of our knowledge, no open benchmark on this scenario has been released to the public.

Benchmarking Multi-agent Reinforcement Learning +1

Learnable Behavior Control: Breaking Atari Human World Records via Sample-Efficient Behavior Selection

no code implementations • 9 May 2023 • Jiajun Fan, Yuzheng Zhuang, Yuecheng Liu, Jianye Hao, Bin Wang, Jiangcheng Zhu, Hao Wang, Shu-Tao Xia

The exploration problem is one of the main challenges in deep reinforcement learning (RL).

Ranked #1 on Atari Games on Atari-57

Atari Games Reinforcement Learning (RL)

LDSA: Learning Dynamic Subtask Assignment in Cooperative Multi-Agent Reinforcement Learning

no code implementations • 5 May 2022 • Mingyu Yang, Jian Zhao, Xunhan Hu, Wengang Zhou, Jiangcheng Zhu, Houqiang Li

In this way, agents dealing with the same subtask share their learning of specific abilities and different subtasks correspond to different specific abilities.

Multi-agent Reinforcement Learning reinforcement-learning +3

CTDS: Centralized Teacher with Decentralized Student for Multi-Agent Reinforcement Learning

1 code implementation • 16 Mar 2022 • Jian Zhao, Xunhan Hu, Mingyu Yang, Wengang Zhou, Jiangcheng Zhu, Houqiang Li

In this way, CTDS balances the full utilization of global observation during training and the feasibility of decentralized execution for online inference.

Multi-agent Reinforcement Learning reinforcement-learning +3

MCMARL: Parameterizing Value Function via Mixture of Categorical Distributions for Multi-Agent Reinforcement Learning

1 code implementation • 21 Feb 2022 • Jian Zhao, Mingyu Yang, Youpeng Zhao, Xunhan Hu, Wengang Zhou, Jiangcheng Zhu, Houqiang Li

Specifically, we model both the individual Q-values and the global Q-value with categorical distributions.

Multi-agent Reinforcement Learning Starcraft +1

Revisiting QMIX: Discriminative Credit Assignment by Gradient Entropy Regularization

no code implementations • 9 Feb 2022 • Jian Zhao, Yue Zhang, Xunhan Hu, Weixun Wang, Wengang Zhou, Jianye Hao, Jiangcheng Zhu, Houqiang Li

In cooperative multi-agent systems, agents jointly take actions and receive a team reward instead of individual rewards.

Look Before You Leap: Safe Model-Based Reinforcement Learning with Human Intervention

no code implementations • 10 Nov 2021 • Yunkun Xu, Zhenyu Liu, Guifang Duan, Jiangcheng Zhu, Xiaolong Bai, Jianrong Tan

Safety has become one of the main challenges of applying deep reinforcement learning to real-world systems.

Blocking Decision Making +3

An Optimal Resource Allocator of Elastic Training for Deep Learning Jobs on Cloud

no code implementations • 8 Sep 2021 • Liang Hu, Jiangcheng Zhu, Zirui Zhou, Ruiqing Cheng, Xiaolong Bai, Yong Zhang

Cloud training platforms, such as Amazon Web Services and Huawei Cloud, provide users with computational resources to train their deep learning jobs.

Decision Making

Learning to Shape Rewards using a Game of Two Partners

no code implementations • 16 Mar 2021 • David Mguni, Taher Jafferjee, Jianhong Wang, Nicolas Perez-Nieves, Tianpei Yang, Matthew Taylor, Wenbin Song, Feifei Tong, Hui Chen, Jiangcheng Zhu, Jun Wang, Yaodong Yang

Reward shaping (RS) is a powerful method in reinforcement learning (RL) for overcoming the problem of sparse or uninformative rewards.

reinforcement-learning Reinforcement Learning (RL) +1
