1 code implementation • 22 Nov 2023 • Kai Yang, Jian Tao, Jiafei Lyu, Chunjiang Ge, Jiaxin Chen, Qimai Li, Weihan Shen, Xiaolong Zhu, Xiu Li
The direct preference optimization (DPO) method, effective in fine-tuning large language models, eliminates the necessity for a reward model.
1 code implementation • 7 Nov 2023 • Enhong Liu, Joseph Suarez, Chenhui You, Bo Wu, BingCheng Chen, Jun Hu, Jiaxin Chen, Xiaolong Zhu, Clare Zhu, Julian Togelius, Sharada Mohanty, Weijun Hong, Rui Du, Yibing Zhang, Qinwen Wang, Xinhang Li, Zheng Yuan, Xiang Li, Yuejia Huang, Kun Zhang, Hanhui Yang, Shiqi Tang, Phillip Isola
In this paper, we present the results of the NeurIPS-2022 Neural MMO Challenge, which attracted 500 participants and received over 1, 600 submissions.
no code implementations • 30 Aug 2023 • Yangkun Chen, Joseph Suarez, Junjie Zhang, Chenghui Yu, Bo Wu, HanMo Chen, Hengman Zhu, Rui Du, Shanliang Qian, Shuai Liu, Weijun Hong, Jinke He, Yibing Zhang, Liang Zhao, Clare Zhu, Julian Togelius, Sharada Mohanty, Jiaxin Chen, Xiu Li, Xiaolong Zhu, Phillip Isola
We present the results of the second Neural MMO challenge, hosted at IJCAI 2022, which received 1600+ submissions.
1 code implementation • 4 Jan 2023 • HanMo Chen, Stone Tao, Jiaxin Chen, Weihan Shen, Xihui Li, Chenghui Yu, Sikai Cheng, Xiaolong Zhu, Xiu Li
Since these learned group strategies arise from individual decisions without an explicit coordination mechanism, we claim that artificial collective intelligence emerges from massive-agent cooperation and competition.
1 code implementation • 24 Oct 2022 • Yuhao Jiang, Kunjie Zhang, Qimai Li, Jiaxin Chen, Xiaolong Zhu
In recent years, Multi-Agent Path Finding (MAPF) has attracted attention from the fields of both Operations Research (OR) and Reinforcement Learning (RL).
no code implementations • 19 Jul 2021 • Xiaolong Zhu, Fernando Vanegas, Felipe Gonzalez, Conrad Sanderson
Performance of the system with an increasing number of UAVs in several indoor scenarios with obstacles is tested.
no code implementations • CVPR 2017 • Hao-Zhi Huang, Hao Wang, Wenhan Luo, Lin Ma, Wenhao Jiang, Xiaolong Zhu, Zhifeng Li, Wei Liu
More specifically, a hybrid loss is proposed to capitalize on the content information of input frames, the style information of a given style image, and the temporal information of consecutive frames.