3 code implementations • ACL 2022 • Yanzeng Li, Jiangxia Cao, Xin Cong, Zhenyu Zhang, Bowen Yu, Hongsong Zhu, Tingwen Liu
Chinese pre-trained language models usually exploit contextual character information to learn representations, while ignoring the linguistics knowledge, e. g., word and sentence information.
1 code implementation • 9 Jun 2025 • MiniCPM Team, Chaojun Xiao, YuXuan Li, Xu Han, Yuzhuo Bai, Jie Cai, Haotian Chen, Wentong Chen, Xin Cong, Ganqu Cui, Ning Ding, Shengdan Fan, Yewei Fang, Zixuan Fu, Wenyu Guan, Yitong Guan, Junshao Guo, Yufeng Han, Bingxiang He, Yuxiang Huang, Cunliang Kong, Qiuzuo Li, Siyuan Li, Wenhao Li, Yanghao Li, Yishan Li, Zhen Li, Dan Liu, Biyuan Lin, Yankai Lin, Xiang Long, Quanyu Lu, Yaxi Lu, Peiyan Luo, Hongya Lyu, Litu Ou, Yinxu Pan, Zekai Qu, Qundong Shi, Zijun Song, Jiayuan Su, Zhou Su, Ao Sun, Xianghui Sun, Peijun Tang, Fangzheng Wang, Feng Wang, Shuo Wang, Yudong Wang, Yesai Wu, Zhenyu Xiao, Jie Xie, Zihao Xie, Yukun Yan, Jiarui Yuan, Kaihuo Zhang, Lei Zhang, Linyue Zhang, Xueren Zhang, Yudi Zhang, Hengyu Zhao, Weilin Zhao, Weilun Zhao, Yuanqian Zhao, Zhi Zheng, Ge Zhou, Jie zhou, Wei Zhou, Zihan Zhou, Zixuan Zhou, Zhiyuan Liu, Guoyang Zeng, Chao Jia, Dahai Li, Maosong Sun
Specifically, in terms of model architecture, we propose InfLLM v2, a trainable sparse attention mechanism that accelerates both prefilling and decoding phases for long-context processing.
1 code implementation • 2 Jun 2025 • Zhong Zhang, Yaxi Lu, Yikun Fu, Yupeng Huo, Shenzhi Yang, Yesai Wu, Han Si, Xin Cong, Haotian Chen, Yankai Lin, Jie Xie, Wei Zhou, Wang Xu, Yuanheng Zhang, Zhou Su, Zhongwu Zhai, Xiaoming Liu, Yudong Mei, JianMing Xu, Hongyan Tian, Chongyi Wang, Chi Chen, Yuan YAO, Zhiyuan Liu, Maosong Sun
The recent progress of large language model agents has opened new possibilities for automating tasks through graphical user interfaces (GUIs), especially in mobile environments where intelligent interaction can greatly enhance usability.
1 code implementation • 13 May 2025 • Yanxi Zhang, Xin Cong, Zhong Zhang, Xiao Liu, Dongyan Zhao, Yesai Wu
Actual causality (AC), a fundamental aspect of causal reasoning (CR), is responsible for attribution and responsibility assignment in real-world scenarios.
no code implementations • 25 Feb 2025 • Yu Xia, Jingru Fan, Weize Chen, Siyu Yan, Xin Cong, Zhong Zhang, Yaxi Lu, Yankai Lin, Zhiyuan Liu, Maosong Sun
Based on this finding, we propose AgentRM, a generalizable reward model, to guide the policy model for effective test-time search.
no code implementations • 17 Feb 2025 • Haoxuan Li, Jifan Yu, Xin Cong, Yang Dang, Daniel Zhang-li, Yisi Zhan, Huiqin Liu, Zhiyuan Liu
Metacognitive education plays a crucial role in cultivating students' self-regulation and reflective thinking, providing essential support for those with learning difficulties through academic advising.
1 code implementation • 8 Nov 2024 • Shengda Fan, Xin Cong, Yuepeng Fu, Zhong Zhang, Shuyan Zhang, Yuanwei Liu, Yesai Wu, Yankai Lin, Zhiyuan Liu, Maosong Sun
Finally, we merge the synthetic samples that pass quality confirmation with the collected samples to obtain the WorkflowBench.
1 code implementation • 18 Oct 2024 • Runchu Tian, Yanghao Li, Yuepeng Fu, Siyang Deng, Qinyu Luo, Cheng Qian, Shuo Wang, Xin Cong, Zhong Zhang, Yesai Wu, Yankai Lin, Huadong Wang, Xiaojiang Liu
These experiments reveal that while most current models are robust against the "lost in the middle" issue, there exist significant biases related to the spacing of relevant information pieces.
1 code implementation • 16 Oct 2024 • Yaxi Lu, Shenzhi Yang, Cheng Qian, Guirong Chen, Qinyu Luo, Yesai Wu, Huadong Wang, Xin Cong, Zhong Zhang, Yankai Lin, Weiwen Liu, Yasheng Wang, Zhiyuan Liu, Fangming Liu, Maosong Sun
The labeled data is used to train a reward model that simulates human judgment and serves as an automatic evaluator of the proactiveness of LLM agents.
1 code implementation • 9 Oct 2024 • Guoxin Chen, Zhong Zhang, Xin Cong, Fangda Guo, Yesai Wu, Yankai Lin, Wenzheng Feng, Yasheng Wang
Tool learning enables large language models (LLMs) to interact with external tools and APIs, greatly expanding the application scope of LLMs.
1 code implementation • 26 Feb 2024 • Qinyu Luo, Yining Ye, Shihao Liang, Zhong Zhang, Yujia Qin, Yaxi Lu, Yesai Wu, Xin Cong, Yankai Lin, Yingli Zhang, Xiaoyin Che, Zhiyuan Liu, Maosong Sun
Generative models have demonstrated considerable potential in software engineering, particularly in tasks such as code generation and debugging.
1 code implementation • 18 Feb 2024 • Zhiyu Yang, Zihan Zhou, Shuo Wang, Xin Cong, Xu Han, Yukun Yan, Zhenghao Liu, Zhixing Tan, Pengyuan Liu, Dong Yu, Zhiyuan Liu, Xiaodong Shi, Maosong Sun
Scientific data visualization plays a crucial role in research by enabling the direct display of complex information and assisting researchers in identifying implicit patterns.
1 code implementation • 14 Feb 2024 • Cheng Qian, Bingxiang He, Zhong Zhuang, Jia Deng, Yujia Qin, Xin Cong, Zhong Zhang, Jie zhou, Yankai Lin, Zhiyuan Liu, Maosong Sun
Current language model-driven agents often lack mechanisms for effective user participation, which is crucial given the vagueness commonly found in user instructions.
1 code implementation • 5 Feb 2024 • Junjie Fang, Likai Tang, Hongzhe Bi, Yujia Qin, Si Sun, Zhenyu Li, Haolun Li, Yongjian Li, Xin Cong, Yankai Lin, Yukun Yan, Xiaodong Shi, Sen Song, Zhiyuan Liu, Maosong Sun
Distinguished by its four core dimensions-Memory Management, Memory Writing, Memory Reading, and Memory Injection, UniMem empowers researchers to conduct systematic exploration of long-context methods.
no code implementations • 25 Jan 2024 • Cheng Qian, Shihao Liang, Yujia Qin, Yining Ye, Xin Cong, Yankai Lin, Yesai Wu, Zhiyuan Liu, Maosong Sun
This paper introduces Investigate-Consolidate-Exploit (ICE), a novel strategy for enhancing the adaptability and flexibility of AI agents through inter-task self-evolution.
1 code implementation • 9 Jan 2024 • Runchu Tian, Yining Ye, Yujia Qin, Xin Cong, Yankai Lin, Yinxu Pan, Yesai Wu, Haotian Hui, Weichuan Liu, Zhiyuan Liu, Maosong Sun
Previous evaluations of LLMs' debugging ability are significantly limited by the risk of data leakage, the scale of the dataset, and the variety of tested bugs.
1 code implementation • 28 Dec 2023 • Chen Qian, Yufan Dang, Jiahao Li, Wei Liu, Zihao Xie, Yifei Wang, Weize Chen, Cheng Yang, Xin Cong, Xiaoyin Che, Zhiyuan Liu, Maosong Sun
Recent advancements in large language models (LLMs) have brought significant changes to various domains, especially through LLM-driven autonomous agents.
no code implementations • 28 Dec 2023 • Bohan Lyu, Xin Cong, Heyang Yu, Pan Yang, Yujia Qin, Yining Ye, Yaxi Lu, Zhong Zhang, Yukun Yan, Yankai Lin, Zhiyuan Liu, Maosong Sun
As GitHub has hosted a multitude of repositories which can be seen as a good resource for tools, a promising solution is that LLM-based agents can autonomously integrate the repositories in GitHub according to the user queries to extend their tool set.
1 code implementation • 2 Nov 2023 • Yining Ye, Xin Cong, Shizuo Tian, Jiannan Cao, Hao Wang, Yujia Qin, Yaxi Lu, Heyang Yu, Huadong Wang, Yankai Lin, Zhiyuan Liu, Maosong Sun
Empirical experiments are conducted to detail its construction and execution procedure of workflow, showcasing the feasibility of APA, unveiling the possibility of a new paradigm of automation driven by agents.
no code implementations • 24 Aug 2023 • Yining Ye, Xin Cong, Shizuo Tian, Yujia Qin, Chong Liu, Yankai Lin, Zhiyuan Liu, Maosong Sun
Central to the development of rationality is the construction of an internalized utility judgment, capable of assigning numerical utilities to each decision.
1 code implementation • 21 Aug 2023 • Weize Chen, Yusheng Su, Jingwei Zuo, Cheng Yang, Chenfei Yuan, Chi-Min Chan, Heyang Yu, Yaxi Lu, Yi-Hsin Hung, Chen Qian, Yujia Qin, Xin Cong, Ruobing Xie, Zhiyuan Liu, Maosong Sun, Jie zhou
Autonomous agents empowered by Large Language Models (LLMs) have undergone significant improvements, enabling them to generalize across a broad spectrum of tasks.
no code implementations • 4 Aug 2023 • Shiyao Cui, Xin Cong, Jiawei Sheng, Xuebin Wang, Tingwen Liu, Jinqiao Shi
In this paper, we regard public pre-trained language models as knowledge bases and automatically mine the script-related knowledge via prompt-learning.
2 code implementations • 31 Jul 2023 • Yujia Qin, Shihao Liang, Yining Ye, Kunlun Zhu, Lan Yan, Yaxi Lu, Yankai Lin, Xin Cong, Xiangru Tang, Bill Qian, Sihan Zhao, Lauren Hong, Runchu Tian, Ruobing Xie, Jie zhou, Mark Gerstein, Dahai Li, Zhiyuan Liu, Maosong Sun
Based on ToolBench, we fine-tune LLaMA to obtain an LLM ToolLLaMA, and equip it with a neural API retriever to recommend appropriate APIs for each instruction.
Ranked #3 on
Trajectory Planning
on ToolBench
1 code implementation • 28 Jul 2023 • Shihao Liang, Runchu Tian, Kunlun Zhu, Yujia Qin, Huadong Wang, Xin Cong, Zhiyuan Liu, Xiaojiang Liu, Maosong Sun
Instruction tuning has emerged as a promising approach to enhancing large language models in following human instructions.
1 code implementation • 16 Jul 2023 • Chen Qian, Wei Liu, Hongzhang Liu, Nuo Chen, Yufan Dang, Jiahao Li, Cheng Yang, Weize Chen, Yusheng Su, Xin Cong, Juyuan Xu, Dahai Li, Zhiyuan Liu, Maosong Sun
Numerous studies used deep learning to improve specific phases in a waterfall model, such as design, coding, and testing.
3 code implementations • 17 Apr 2023 • Yujia Qin, Shengding Hu, Yankai Lin, Weize Chen, Ning Ding, Ganqu Cui, Zheni Zeng, Yufei Huang, Chaojun Xiao, Chi Han, Yi Ren Fung, Yusheng Su, Huadong Wang, Cheng Qian, Runchu Tian, Kunlun Zhu, Shihao Liang, Xingyu Shen, Bokai Xu, Zhen Zhang, Yining Ye, Bowen Li, Ziwei Tang, Jing Yi, Yuzhang Zhu, Zhenning Dai, Lan Yan, Xin Cong, Yaxi Lu, Weilin Zhao, Yuxiang Huang, Junxi Yan, Xu Han, Xian Sun, Dahai Li, Jason Phang, Cheng Yang, Tongshuang Wu, Heng Ji, Zhiyuan Liu, Maosong Sun
Considering the lack of a systematic tool learning evaluation in prior works, we experiment with 18 representative tools and show the potential of current foundation models in skillfully utilizing tools.
1 code implementation • 8 Apr 2023 • Jiangxia Cao, Xin Cong, Jiawei Sheng, Tingwen Liu, Bin Wang
Cross-Domain Sequential Recommendation (CDSR) aims to predict future interactions based on user's historical sequential interactions from multiple domains.
no code implementations • 5 Apr 2023 • Shiyao Cui, Jiangxia Cao, Xin Cong, Jiawei Sheng, Quangang Li, Tingwen Liu, Jinqiao Shi
For the first issue, a refinement-regularizer probes the information-bottleneck principle to balance the predictive evidence and noisy information, yielding expressive representations for prediction.
1 code implementation • COLING 2022 • Shiyao Cui, Jiawei Sheng, Xin Cong, Quangang Li, Tingwen Liu, Jinqiao Shi
Event Causality Identification (ECI), which aims to detect whether a causality relation exists between two given textual events, is an important task for event causality understanding.
1 code implementation • SIGIR 2022 • Xin Cong, Jiawei Sheng, Shiyao Cui, Bowen Yu, Tingwen Liu, Bin Wang
To instantiate this strategy, we further propose a model, RelATE, which builds a dual-level attention to aggregate relationrelevant information to detect the relation occurrence and utilizes the annotated samples of the detected relations to extract the corresponding head/tail entities.
1 code implementation • 31 Mar 2022 • Jiangxia Cao, Jiawei Sheng, Xin Cong, Tingwen Liu, Bin Wang
As a promising way, Cross-Domain Recommendation (CDR) has attracted a surge of interest, which aims to transfer the user preferences observed in the source domain to make recommendations in the target domain.
no code implementations • 7 Feb 2022 • Shiyao Cui, Xin Cong, Bowen Yu, Tingwen Liu, Yucheng Wang, Jinqiao Shi
Meanwhile, rough reading is explored in a multi-round manner to discover undetected events, thus the multi-events problem is handled.
1 code implementation • EMNLP 2021 • Pan Yang, Xin Cong, Zhenyun Sun, Xingwu Liu
Recent works introduce the label knowledge to enhance the text representation by formalizing the span extraction task into a question answering problem (QA Formalization), which achieves state-of-the-art performance.
1 code implementation • 8 Jul 2021 • Jiangxia Cao, Xixun Lin, Xin Cong, Shu Guo, Hengzhu Tang, Tingwen Liu, Bin Wang
A temporal interaction network consists of a series of chronological interactions between users and items.
1 code implementation • Findings (ACL) 2021 • Xin Cong, Shiyao Cui, Bowen Yu, Tingwen Liu, Yubin Wang, Bin Wang
Event detection tends to struggle when it needs to recognize novel event types with a few samples.
no code implementations • 3 Dec 2020 • Shiyao Cui, Bowen Yu, Xin Cong, Tingwen Liu, Quangang Li, Jinqiao Shi
A heterogeneous graph attention networks is then introduced to propagate relational message and enrich information interaction.
1 code implementation • 23 Jun 2020 • Xin Cong, Bowen Yu, Tingwen Liu, Shiyao Cui, Hengzhu Tang, Bin Wang
We first build a representation extractor to derive features for unlabeled data from the target domain (no test data is necessary) and then group them with a cluster miner.