no code implementations • 14 Aug 2024 • Xiaoyang Yu, Youfang Lin, Shuo Wang, Kai Lv, Sheng Han
To further improve the training of extra UAS parameters, we introduce a Cross-Group Inverse (CGI) loss to predict other groups' agent policies with the trajectory information.
no code implementations • 21 Jul 2024 • Xiaoran Liu, Ruixiao Li, Qipeng Guo, Zhigeng Liu, Yuerong Song, Kai Lv, Hang Yan, Linlin Li, Qun Liu, Xipeng Qiu
The long-context capability of the Large Language Models (LLM) has made significant breakthroughs, but the maximum supported context length remains a critical bottleneck limiting their practical applications.
3 code implementations • 26 Mar 2024 • Zheng Cai, Maosong Cao, Haojiong Chen, Kai Chen, Keyu Chen, Xin Chen, Xun Chen, Zehui Chen, Zhi Chen, Pei Chu, Xiaoyi Dong, Haodong Duan, Qi Fan, Zhaoye Fei, Yang Gao, Jiaye Ge, Chenya Gu, Yuzhe Gu, Tao Gui, Aijia Guo, Qipeng Guo, Conghui He, Yingfan Hu, Ting Huang, Tao Jiang, Penglong Jiao, Zhenjiang Jin, Zhikai Lei, Jiaxing Li, Jingwen Li, Linyang Li, Shuaibin Li, Wei Li, Yining Li, Hongwei Liu, Jiangning Liu, Jiawei Hong, Kaiwen Liu, Kuikun Liu, Xiaoran Liu, Chengqi Lv, Haijun Lv, Kai Lv, Li Ma, Runyuan Ma, Zerun Ma, Wenchang Ning, Linke Ouyang, Jiantao Qiu, Yuan Qu, FuKai Shang, Yunfan Shao, Demin Song, Zifan Song, Zhihao Sui, Peng Sun, Yu Sun, Huanze Tang, Bin Wang, Guoteng Wang, Jiaqi Wang, Jiayu Wang, Rui Wang, Yudong Wang, Ziyi Wang, Xingjian Wei, Qizhen Weng, Fan Wu, Yingtong Xiong, Chao Xu, Ruiliang Xu, Hang Yan, Yirong Yan, Xiaogui Yang, Haochen Ye, Huaiyuan Ying, JIA YU, Jing Yu, Yuhang Zang, Chuyu Zhang, Li Zhang, Pan Zhang, Peng Zhang, Ruijie Zhang, Shuo Zhang, Songyang Zhang, Wenjian Zhang, Wenwei Zhang, Xingcheng Zhang, Xinyue Zhang, Hui Zhao, Qian Zhao, Xiaomeng Zhao, Fengzhe Zhou, Zaida Zhou, Jingming Zhuo, Yicheng Zou, Xipeng Qiu, Yu Qiao, Dahua Lin
The evolution of Large Language Models (LLMs) like ChatGPT and GPT-4 has sparked discussions on the advent of Artificial General Intelligence (AGI).
Ranked #5 on Long-Context Understanding on Ada-LEval (BestAnswer)
1 code implementation • 21 Feb 2024 • Kai Lv, Xiaoran Liu, Qipeng Guo, Hang Yan, Conghui He, Xipeng Qiu, Dahua Lin
The quality of training data are crucial for enhancing the long-text capabilities of foundation models.
no code implementations • 6 Dec 2023 • Xiaobo Hu, Youfang Lin, Hehe Fan, Shuo Wang, Zhihao Wu, Kai Lv
To this end, an agent needs to 1) learn a piece of certain knowledge about the relations of object categories in the world during training and 2) look for the target object based on the pre-learned object category relations and its moving trajectory in the current unseen environment.
no code implementations • 4 Dec 2023 • Xiaobo Hu, Youfang Lin, Yue Liu, Jinwen Wang, Shuo Wang, Hehe Fan, Kai Lv
Visual reinforcement learning has proven effective in solving control tasks with high-dimensional observations.
1 code implementation • 1 Dec 2023 • Kai Lv, Shuo Zhang, Tianle Gu, Shuhao Xing, Jiawei Hong, Keyu Chen, Xiaoran Liu, Yuqing Yang, Honglin Guo, Tengxiao Liu, Yu Sun, Qipeng Guo, Hang Yan, Xipeng Qiu
This paper introduces CoLLiE, an efficient library that facilitates collaborative training of large language models using 3D parallelism, parameter-efficient fine-tuning (PEFT) methods, and optimizers such as Lion, Adan, Sophia, LOMO and AdaLomo.
1 code implementation • 16 Oct 2023 • Kai Lv, Hang Yan, Qipeng Guo, Haijun Lv, Xipeng Qiu
Our experiments with instruction-tuning and further pre-training demonstrate that AdaLomo achieves results on par with AdamW, while significantly reducing memory requirements, thereby lowering the hardware barrier to training large language models.
1 code implementation • 16 Jun 2023 • Kai Lv, Yuqing Yang, Tengxiao Liu, Qinghui Gao, Qipeng Guo, Xipeng Qiu
Large Language Models (LLMs) have revolutionized Natural Language Processing (NLP) but demand massive GPU resources for training.
1 code implementation • 7 May 2023 • Xiaonan Li, Kai Lv, Hang Yan, Tianyang Lin, Wei Zhu, Yuan Ni, Guotong Xie, Xiaoling Wang, Xipeng Qiu
To train UDR, we cast various tasks' training signals into a unified list-wise ranking formulation by language model's feedback.
1 code implementation • 2 Mar 2023 • Xiaoyang Yu, Youfang Lin, Xiangsen Wang, Sheng Han, Kai Lv
We firstly define and describe the heterogeneous problems in SMAC.
2 code implementations • 29 May 2022 • Chenxin An, Jiangtao Feng, Kai Lv, Lingpeng Kong, Xipeng Qiu, Xuanjing Huang
We validate CoNT on five generation tasks with ten benchmarks, including machine translation, summarization, code comment generation, data-to-text generation and commonsense generation.
no code implementations • 23 Jan 2022 • Kai Lv
When estimating the relationship between two images, it is often disturbed by outliers.
no code implementations • 18 Jan 2022 • Hengrui Zhang, Youfang Lin, Sheng Han, Shuo Wang, Kai Lv
Then, CDMPO uses a conservative value function loss to reduce the number of violations of constraints during the exploration process.
Distributional Reinforcement Learning reinforcement-learning +2
no code implementations • 29 Nov 2021 • Xiaobo Hu, Youfang Lin, Shuo Wang, Zhihao Wu, Kai Lv
ACRG is a highly effective structure that consists of two relationships, i. e., the horizontal relationship among objects and the distance relationship between the agent and objects .
no code implementations • 26 Jan 2021 • Kai Lv, Zongqing Lu, Qingmin Liao
By the new descriptor, we can obtain more high confidence matching points without extremum operation.