no code implementations • Findings (ACL) 2022 • Jianhan Xu, Cenyuan Zhang, Xiaoqing Zheng, Linyang Li, Cho-Jui Hsieh, Kai-Wei Chang, Xuanjing Huang
Most of the existing defense methods improve the adversarial robustness by making the models adapt to the training set augmented with some adversarial examples.
1 code implementation • COLING 2022 • Xin Zhou, Ruotian Ma, Yicheng Zou, Xuanting Chen, Tao Gui, Qi Zhang, Xuanjing Huang, Rui Xie, Wei Wu
Specifically, we re-formulate both token and sentence classification tasks into a unified language modeling task, and map label spaces of different tasks into the same vocabulary space.
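The label-space unification described above can be sketched as follows. This is a minimal illustration, not the paper's actual verbalizer: the tasks, label names, and word choices below are made-up placeholders; the point is that heterogeneous label sets all map into one shared vocabulary space so every task becomes word prediction.

```python
# Hypothetical verbalizer: each task's labels become ordinary vocabulary
# words, so token- and sentence-level tasks share one prediction space.
VERBALIZER = {
    "ner":       {"PER": "person", "LOC": "location", "O": "none"},
    "sentiment": {"POS": "good", "NEG": "bad"},
}

def to_language_modeling(task: str, label: str) -> str:
    """Map a task-specific label to a word in the shared vocabulary."""
    return VERBALIZER[task][label]

# The union of all verbalized labels is the single shared label space.
shared_space = {w for words in VERBALIZER.values() for w in words.values()}

print(to_language_modeling("ner", "PER"))        # -> person
print(to_language_modeling("sentiment", "NEG"))  # -> bad
```

A model can then score each candidate word at a masked slot, and the argmax over `shared_space` decodes back to a task label.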
1 code implementation • EMNLP 2020 • Siyuan Wang, Zhongyu Wei, Zhihao Fan, Zengfeng Huang, Weijian Sun, Qi Zhang, Xuanjing Huang
Human evaluation also proves that our model is able to generate relevant and informative questions.
no code implementations • EMNLP 2020 • Qinzhuo Wu, Qi Zhang, Jinlan Fu, Xuanjing Huang
With the advancements in natural language processing tasks, math word problem solving has received increasing attention.
1 code implementation • COLING 2022 • Lei Chen, Guanying Li, Zhongyu Wei, Yang Yang, Baohua Zhou, Qi Zhang, Xuanjing Huang
Existing works on rumor resolution have shown great potential in recognizing word appearance and user participation.
no code implementations • COLING 2022 • Rui Zheng, Rong Bao, Qin Liu, Tao Gui, Qi Zhang, Xuanjing Huang, Rui Xie, Wei Wu
To reduce the potential side effects of using defense modules, we further propose a novel forgetting-restricted adversarial training, which filters out bad adversarial examples that impair performance on the original examples.
1 code implementation • COLING 2022 • Yinzi Li, Wei Chen, Zhongyu Wei, Yujun Huang, Chujun Wang, Siyuan Wang, Qi Zhang, Xuanjing Huang, Libo Wu
Existing research for argument representation learning mainly treats tokens in the sentence equally and ignores the implied structure information of argumentative context.
no code implementations • COLING 2022 • Zhichao Geng, Ming Zhong, Zhangyue Yin, Xipeng Qiu, Xuanjing Huang
In dialogue summarization, a subdomain of text summarization, utterances are concatenated into flat text before being processed.
no code implementations • ICLR 2019 • Pengfei Liu, Xuanjing Huang
In this paper, we describe a general framework to systematically analyze current neural models for multi-task learning. We find that existing models expect to disentangle features into different spaces, while features learned in practice remain entangled in the shared space, leaving potential hazards for further training or unseen tasks.
1 code implementation • ACL 2022 • Zichu Fei, Qi Zhang, Tao Gui, Di Liang, Sirui Wang, Wei Wu, Xuanjing Huang
CQG employs a simple method to generate multi-hop questions that contain key entities in multi-hop reasoning chains, which ensures the complexity and quality of the questions.
2 code implementations • ACL 2022 • Qin Liu, Rui Zheng, Bao Rong, Jingyi Liu, Zhihua Liu, Zhanzhan Cheng, Liang Qiao, Tao Gui, Qi Zhang, Xuanjing Huang
Adversarial robustness has attracted much attention recently, and the mainstream solution is adversarial training.
no code implementations • COLING 2022 • Zichu Fei, Xin Zhou, Tao Gui, Qi Zhang, Xuanjing Huang
Existing KBQG models still face two main challenges: (1) Most models often focus on the most relevant part of the answer entity, while neglecting the rest of the subgraph.
no code implementations • 26 Apr 2025 • Yi Lu, Wanxu Zhao, Xin Zhou, Chenxin An, Chenglong Wang, Shuo Li, Yuming Yang, Jun Zhao, Tao Ji, Tao Gui, Qi Zhang, Xuanjing Huang
Instead of manipulating all dimensions equally, DPE detects the effective length for every dimension and finds the key dimensions for context extension.
no code implementations • 19 Apr 2025 • Shihan Dou, Muling Wu, Jingwen Xu, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang
We observe that for complex problems, during the early stages of training, the model exhibits strong exploratory capabilities and can identify promising solution ideas.
1 code implementation • 14 Apr 2025 • Xinnong Zhang, Jiayu Lin, Xinyi Mou, Shiyue Yang, Xiawei Liu, Libo Sun, Hanjia Lyu, Yihang Yang, Weihong Qi, Yue Chen, Guanying Li, Ling Yan, Yao Hu, Siming Chen, Yu Wang, Xuanjing Huang, Jiebo Luo, Shiping Tang, Libo Wu, Baohua Zhou, Zhongyu Wei
Social simulation is transforming traditional social science research by modeling human behavior through interactions between virtual individuals and their environments.
1 code implementation • 10 Apr 2025 • Yixin Cao, Jiahao Ying, Yaoning Wang, Xipeng Qiu, Xuanjing Huang, Yugang Jiang
In this paper, we analyze the core limitations of traditional evaluation pipelines and propose a novel metric, the Model Utilization Index (MUI), which introduces mechanism interpretability techniques to complement traditional performance metrics.
1 code implementation • 9 Apr 2025 • Yuxin Wang, Yiran Guo, Yining Zheng, Zhangyue Yin, Shuo Chen, Jie Yang, Jiajun Chen, Xuanjing Huang, Xipeng Qiu
To bridge this gap, we introduce FamilyTool, a novel benchmark grounded in a family-based knowledge graph (KG) that simulates personalized, multi-hop tool use scenarios.
1 code implementation • 19 Mar 2025 • Shuo Li, Jiajun Sun, Guodong Zheng, Xiaoran Fan, Yujiong Shen, Yi Lu, Zhiheng Xi, Yuming Yang, Wenming Tan, Tao Ji, Tao Gui, Qi Zhang, Xuanjing Huang
Recently, multimodal large language models (MLLMs) have demonstrated remarkable performance in visual-language tasks.
1 code implementation • 9 Mar 2025 • Ming Zhang, Yuhui Wang, Yujiong Shen, Tingyi Yang, Changhao Jiang, Yilong Wu, Shihan Dou, Qinhao Chen, Zhiheng Xi, Zhihao Zhang, Yi Dong, Zhen Wang, Zhihui Fei, Mingyang Wan, Tao Liang, Guojun Ma, Qi Zhang, Tao Gui, Xuanjing Huang
Additionally, the 8B model can surpass GPT-4o by up to 43.88%, with an average improvement of 11.00%.
no code implementations • 8 Mar 2025 • Yixin Wu, Feiran Zhang, Tianyuan Shi, Ruicheng Yin, Zhenghua Wang, Zhenliang Gan, Xiaohua Wang, Changze Lv, Xiaoqing Zheng, Xuanjing Huang
Based on this observation, we propose a novel detection method that amplifies these differences by progressively adding noise to the original images across multiple timesteps, and train an ensemble of classifiers on these noised images.
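The progressive-noising idea above can be sketched in a few lines. This is an assumption-laden illustration, not the paper's pipeline: plain Gaussian noising stands in for the diffusion forward process, the timesteps are arbitrary, and the per-timestep classifiers of the ensemble are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def noised_views(image: np.ndarray, timesteps=(0.1, 0.3, 0.5)):
    """Return one progressively noised copy of `image` per timestep."""
    views = []
    for t in timesteps:
        noise = rng.normal(0.0, 1.0, size=image.shape)
        # Variance-preserving mix of signal and noise, as in diffusion models:
        # larger t means a heavier noise share.
        views.append(np.sqrt(1.0 - t) * image + np.sqrt(t) * noise)
    return views

image = rng.normal(size=(8, 8))
views = noised_views(image)
print(len(views))  # -> 3, one view per timestep
```

An ensemble detector would then train one classifier per timestep on these noised views and aggregate their scores, since real and generated images are expected to degrade differently under added noise.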
no code implementations • 6 Mar 2025 • Wenxiang Chen, wei he, Zhiheng Xi, Honglin Guo, Boyang Hong, Jiazheng Zhang, Rui Zheng, Nijun Li, Tao Gui, Yun Li, Qi Zhang, Xuanjing Huang
Process supervision, i.e., evaluating each step, is critical for complex large language model (LLM) reasoning and test-time searching with increased inference compute.
no code implementations • 6 Mar 2025 • Zhenghua Wang, Yiran Ding, Changze Lv, Zhibo Xu, Tianlong Li, Tianyuan Shi, Xiaoqing Zheng, Xuanjing Huang
Although large language models (LLMs) have achieved significant progress in handling long-context inputs, they still suffer from the "lost-in-the-middle" problem, where crucial information in the middle of the context is often underrepresented or lost.
1 code implementation • 3 Mar 2025 • Yuxin Wang, Botian Jiang, Yiran Guo, Quan Gan, David Wipf, Xuanjing Huang, Xipeng Qiu
Through this work, we address the challenges of efficiently handling large datasets via PFN-based models, paving the way for faster and more effective training and prediction processes for tabular data classification.
no code implementations • 24 Feb 2025 • Xiaoran Liu, Ruixiao Li, Mianqiu Huang, Zhigeng Liu, Yuerong Song, Qipeng Guo, Siyang He, Qiqi Wang, Linlin Li, Qun Liu, Yaqian Zhou, Xuanjing Huang, Xipeng Qiu
Moreover, the research on long-context LLMs has expanded from length extrapolation to a comprehensive focus on architecture, infrastructure, training, and evaluation technologies.
1 code implementation • 24 Feb 2025 • Qin Zhu, Fei Huang, Runyu Peng, Keming Lu, Bowen Yu, Qinyuan Cheng, Xipeng Qiu, Xuanjing Huang, Junyang Lin
While logical reasoning evaluation of Large Language Models (LLMs) has attracted significant attention, existing benchmarks predominantly rely on multiple-choice formats that are vulnerable to random guessing, leading to overestimated performance and substantial performance fluctuations.
1 code implementation • 24 Feb 2025 • Yuming Yang, Yang Nan, Junjie Ye, Shihan Dou, Xiao Wang, Shuo Li, Huijie Lv, Tao Gui, Qi Zhang, Xuanjing Huang
To address this, we systematically analyze 11 existing diversity measurement methods by assessing their correlation with model performance through extensive fine-tuning experiments.
no code implementations • 20 Feb 2025 • Zhuohang Long, Siyuan Wang, Shujun Liu, Yuhang Lai, Xuanjing Huang, Zhongyu Wei
Jailbreak attacks, where harmful prompts bypass generative models' built-in safety, raise serious concerns about model vulnerability.
no code implementations • 13 Feb 2025 • Xin Zhou, Yiwen Guo, Ruotian Ma, Tao Gui, Qi Zhang, Xuanjing Huang
Recent advancements in Self-Rewarding Language Models suggest that an LLM can use its internal reward models (such as LLM-as-a-Judge) to generate preference data, improving alignment performance without costly human annotation.
no code implementations • 8 Feb 2025 • Shengbin Yue, Ting Huang, Zheng Jia, Siyuan Wang, Shujun Liu, Yun Song, Xuanjing Huang, Zhongyu Wei
Large Language Models (LLMs) have significantly advanced legal intelligence, but the scarcity of scenario data impedes the progress toward interactive legal scenarios.
no code implementations • 28 Jan 2025 • Changze Lv, Yansen Wang, Dongqi Han, Yifei Shen, Xiaoqing Zheng, Xuanjing Huang, Dongsheng Li
We provide comprehensive proof of the method's effectiveness in partially capturing relative positional information for sequential tasks.
no code implementations • 26 Jan 2025 • Yuhong Sun, Zhangyue Yin, Xuanjing Huang, Xipeng Qiu, Hui Zhao
To reduce human bias and enable fine-grained analysis of error patterns, we propose a novel framework for automated dynamic error classification in mathematical reasoning.
no code implementations • 21 Jan 2025 • Ningyu Xu, Qi Zhang, Chao Du, Qiang Luo, Xipeng Qiu, Xuanjing Huang, Menghan Zhang
These findings establish that structured, human-like conceptual representations can naturally emerge from language prediction without real-world grounding.
no code implementations • 17 Jan 2025 • Changze Lv, Jingwen Xu, Yiyang Lu, Xiaohua Wang, Zhenghua Wang, Zhibo Xu, Di Yu, Xin Du, Xiaoqing Zheng, Xuanjing Huang
Backpropagation is the foundational algorithm for training neural networks and a key driver of deep learning's success.
no code implementations • 5 Jan 2025 • Junjie Ye, Zhengyin Du, Xuesong Yao, Weijian Lin, Yufei Xu, Zehui Chen, Zaiyuan Wang, Sining Zhu, Zhiheng Xi, Siyu Yuan, Tao Gui, Qi Zhang, Xuanjing Huang, Jiecao Chen
Effective evaluation of multi-hop tool use is critical for analyzing the understanding, reasoning, and function-calling capabilities of large language models (LLMs).
1 code implementation • 20 Dec 2024 • Junjie Ye, Yilong Wu, Sixian Li, Yuming Yang, Tao Gui, Qi Zhang, Xuanjing Huang, Peng Wang, Zhongchao shi, Jianping Fan, Zhengyin Du
Large language models (LLMs) achieve remarkable advancements by leveraging tools to interact with external environments, a critical step toward generalized AI.
1 code implementation • 18 Dec 2024 • Zhiyuan Zeng, Qinyuan Cheng, Zhangyue Yin, Bo wang, ShiMin Li, Yunhua Zhou, Qipeng Guo, Xuanjing Huang, Xipeng Qiu
OpenAI o1 represents a significant milestone in artificial intelligence, achieving expert-level performance on many challenging tasks that require strong reasoning ability. OpenAI has stated that the main technique behind o1 is reinforcement learning.
no code implementations • 18 Dec 2024 • Wei Tang, Yixin Cao, Yang Deng, Jiahao Ying, Bo wang, Yizhe Yang, Yuyue Zhao, Qi Zhang, Xuanjing Huang, Yugang Jiang, Yong Liao
Knowledge utilization is a critical aspect of LLMs, and understanding how they adapt to evolving knowledge is essential for their effective deployment.
1 code implementation • 17 Dec 2024 • Jianing He, Qi Zhang, Hongyun Zhang, Xuanjing Huang, Usman Naseem, Duoqian Miao
Early exiting is an effective paradigm for improving the inference efficiency of pre-trained language models (PLMs) by dynamically adjusting the number of executed layers for each sample.
no code implementations • 17 Dec 2024 • Siyuan Wang, Dianyi Wang, Chengxing Zhou, Zejun Li, Zhihao Fan, Xuanjing Huang, Zhongyu Wei
Large Vision-Language Models (LVLMs) typically learn visual capacity through visual instruction tuning, involving updates to both a projector and their LLM backbones.
1 code implementation • 4 Dec 2024 • Xinyi Mou, Xuanwen Ding, Qi He, Liang Wang, Jingcong Liang, Xinnong Zhang, Libo Sun, Jiayu Lin, Jie zhou, Xuanjing Huang, Zhongyu Wei
We categorize the simulations into three types: (1) Individual Simulation, which mimics specific individuals or demographic groups; (2) Scenario Simulation, where multiple agents collaborate to achieve goals within specific contexts; and (3) Society Simulation, which models interactions within agent societies to reflect the complexity and variety of real-world dynamics.
no code implementations • 29 Nov 2024 • Xuhao Hu, Dongrui Liu, Hao Li, Xuanjing Huang, Jing Shao
To explain such a counter-intuitive phenomenon, we discover a visual safety information leakage (VSIL) problem in existing multimodal safety benchmarks, i.e., the potentially risky and sensitive content in the image has been revealed in the textual query.
no code implementations • 28 Nov 2024 • Yuqian Fu, Runze Wang, Yanwei Fu, Danda Pani Paudel, Xuanjing Huang, Luc van Gool
In this paper, we focus on the Ego-Exo Object Correspondence task, an emerging challenge in the field of computer vision that aims to map objects across ego-centric and exo-centric views.
no code implementations • 25 Nov 2024 • Zhiheng Xi, Dingwen Yang, Jixuan Huang, Jiafu Tang, Guanyu Li, Yiwen Ding, wei he, Boyang Hong, Shihan Dou, WenYu Zhan, Xiao Wang, Rui Zheng, Tao Ji, Xiaowei Shi, Yitao Zhai, Rongxiang Weng, Jingang Wang, Xunliang Cai, Tao Gui, Zuxuan Wu, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Yu-Gang Jiang
Experiments show that the method improves the actor's exploration efficiency and solution diversity, especially on challenging queries, leading to a stronger reasoning model.
1 code implementation • 21 Nov 2024 • Zhijie Bao, Qingyun Liu, Ying Guo, Zhengqiang Ye, Jun Shen, Shirong Xie, Jiajie Peng, Xuanjing Huang, Zhongyu Wei
This system integrates an LLM-based reception nurse and a collaboration between the LLM and the hospital information system (HIS) into a real outpatient reception setting, aiming to deliver personalized, high-quality, and efficient reception services.
no code implementations • 11 Nov 2024 • Mianqiu Huang, Xiaoran Liu, Shaojun Zhou, Mozhi Zhang, Chenkun Tan, Pengyu Wang, Qipeng Guo, Zhe Xu, Linyang Li, Zhikai Lei, Linlin Li, Qun Liu, Yaqian Zhou, Xipeng Qiu, Xuanjing Huang
With the development of large language models (LLMs), the sequence length of these models continues to increase, drawing significant attention to long-context language models.
1 code implementation • 1 Nov 2024 • Yiwen Ding, Zhiheng Xi, wei he, Zhuoyuan Li, Yitao Zhai, Xiaowei Shi, Xunliang Cai, Tao Gui, Qi Zhang, Xuanjing Huang
Self-improvement methods enable large language models (LLMs) to generate solutions themselves and iteratively train on filtered, high-quality rationales.
1 code implementation • 30 Oct 2024 • Shihan Dou, Jiazheng Zhang, Jianxiang Zang, Yunbo Tao, Weikang Zhou, Haoxiang Jia, Shichun Liu, Yuming Yang, Zhiheng Xi, Shenxi Wu, Shaoqing Zhang, Muling Wu, Changze Lv, Limao Xiong, WenYu Zhan, Lin Zhang, Rongxiang Weng, Jingang Wang, Xunliang Cai, Yueming Wu, Ming Wen, Rui Zheng, Tao Ji, Yixin Cao, Tao Gui, Xipeng Qiu, Qi Zhang, Xuanjing Huang
We introduce MPLSandbox, an out-of-the-box multi-programming language sandbox designed to provide unified and comprehensive feedback from compiler and analysis tools for Large Language Models (LLMs).
no code implementations • 28 Oct 2024 • Xinnong Zhang, Jiayu Lin, Libo Sun, Weihong Qi, Yihang Yang, Yue Chen, Hanjia Lyu, Xinyi Mou, Siming Chen, Jiebo Luo, Xuanjing Huang, Shiping Tang, Zhongyu Wei
We also introduce PPE, a poll-based presidential election benchmark to assess the performance of our framework under the U.S. presidential election scenario.
1 code implementation • 27 Oct 2024 • Zhengfu He, Wentao Shu, Xuyang Ge, Lingjie Chen, Junxuan Wang, Yunhua Zhou, Frances Liu, Qipeng Guo, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang, Xipeng Qiu
We introduce a suite of 256 SAEs, trained on each layer and sublayer of the Llama-3.1-8B-Base model, with 32K and 128K features.
1 code implementation • 25 Oct 2024 • Xinyi Mou, Jingcong Liang, Jiayu Lin, Xinnong Zhang, Xiawei Liu, Shiyue Yang, Rong Ye, Lei Chen, Haoyu Kuang, Xuanjing Huang, Zhongyu Wei
To this end, we introduce AgentSense: Benchmarking Social Intelligence of Language Agents through Interactive Scenarios.
2 code implementations • 24 Oct 2024 • wei he, Zhiheng Xi, Wanxu Zhao, Xiaoran Fan, Yiwen Ding, Zifei Shan, Tao Gui, Qi Zhang, Xuanjing Huang
Specifically, we employ text-based synthesizing techniques to construct chart-plotting code and produce ReachQA, a dataset containing 3k reasoning-intensive charts and 20k Q&A pairs to enhance both recognition and reasoning abilities.
no code implementations • 20 Oct 2024 • Xin Zhou, Ping Nie, Yiwen Guo, Haojie Wei, Zhanqiu Zhang, Pasquale Minervini, Ruotian Ma, Tao Gui, Qi Zhang, Xuanjing Huang
In this paper, we aim to investigate these internal mechanisms within the popular Mixture-of-Expert (MoE)-based LLMs and demonstrate how to improve RAG by examining expert activations in these LLMs.
no code implementations • 15 Oct 2024 • Shuo Li, Tao Ji, Xiaoran Fan, Linsheng Lu, Leyi Yang, Yuming Yang, Zhiheng Xi, Rui Zheng, Yuran Wang, Xiaohui Zhao, Tao Gui, Qi Zhang, Xuanjing Huang
Our findings indicate that the ability to prevent sycophancy is predominantly observed in higher layers of the model.
1 code implementation • 13 Oct 2024 • Enyu Zhou, Guodong Zheng, Binghai Wang, Zhiheng Xi, Shihan Dou, Rong Bao, Wei Shen, Limao Xiong, Jessica Fan, Yurong Mou, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang
However, the current evaluation of RMs may not directly correspond to their alignment performance due to the limited distribution of evaluation data and evaluation methods that are not closely related to alignment objectives.
no code implementations • 10 Oct 2024 • Xiawei Liu, Shiyue Yang, Xinnong Zhang, Haoyu Kuang, Libo Sun, Yihang Yang, Siming Chen, Xuanjing Huang, Zhongyu Wei
The rise of various social platforms has transformed journalism.
no code implementations • 25 Sep 2024 • Wenhao Liu, Siyu An, Junru Lu, Muling Wu, Tianlong Li, Xiaohua Wang, Xiaoqing Zheng, Di Yin, Xing Sun, Xuanjing Huang
To investigate RPAs' performance when faced with different types of conflicting requests, we develop an evaluation benchmark that includes contextual knowledge conflicting requests, parametric knowledge conflicting requests, and non-conflicting requests to assess RPAs' ability to identify conflicts and refuse to answer appropriately without over-refusing.
no code implementations • 24 Sep 2024 • Junjie Ye, Yuming Yang, Qi Zhang, Tao Gui, Xuanjing Huang, Peng Wang, Zhongchao shi, Jianping Fan
Large language models (LLMs) encode extensive world knowledge through pre-training on massive datasets, which can then be fine-tuned for the question-answering (QA) task.
no code implementations • 4 Sep 2024 • Zhe Xu, Jiasheng Ye, Xiangyang Liu, Tianxiang Sun, Xiaoran Liu, Qipeng Guo, Linlin Li, Qun Liu, Xuanjing Huang, Xipeng Qiu
DetectiveQA focuses on evaluating the long-context reasoning ability of LLMs, which not only requires a full understanding of the context but also requires extracting important evidence from the context and reasoning over that evidence to answer the given questions.
no code implementations • 27 Aug 2024 • Han Xia, Songyang Gao, Qiming Ge, Zhiheng Xi, Qi Zhang, Xuanjing Huang
Reinforcement Learning from Human Feedback (RLHF) has proven effective in aligning large language models with human intentions, yet it often relies on complex methodologies like Proximal Policy Optimization (PPO) that require extensive hyper-parameter tuning and present challenges in sample efficiency and stability.
1 code implementation • 31 Jul 2024 • Ming Zhang, Caishuang Huang, Yilong Wu, Shichun Liu, Huiyuan Zheng, Yurui Dong, Yujiong Shen, Shihan Dou, Jun Zhao, Junjie Ye, Qi Zhang, Tao Gui, Xuanjing Huang
Task-oriented dialogue (TOD) systems aim to efficiently handle task-oriented conversations, including information collection.
no code implementations • 28 Jul 2024 • Libo Sun, Siyuan Wang, Xuanjing Huang, Zhongyu Wei
Utilizing large language models (LLMs) to achieve role-playing has gained great attention recently.
no code implementations • 20 Jul 2024 • Jiayu Lin, Guanrong Chen, Bojun Jin, Chenyang Li, Shutong Jia, Wancong Lin, Yang Sun, Yuhang He, Caihua Yang, Jianzhu Bao, Jipeng Wu, Wen Su, Jinglu Chen, Xinyi Li, Tianyu Chen, Mingjie Han, Shuaiwen Du, Zijian Wang, Jiyin Li, Fuzhong Suo, Hao Wang, Nuanchen Lin, Xuanjing Huang, Changjian Jiang, Ruifeng Xu, Long Zhang, Jiuxin Cao, Ting Jin, Zhongyu Wei
In this paper, we present the results of the AI-Debater 2023 Challenge held by the Chinese Conference on Affect Computing (CCAC 2023), and introduce the related datasets.
1 code implementation • 17 Jul 2024 • Yunfan Shao, Linyang Li, Yichuan Ma, Peiji Li, Demin Song, Qinyuan Cheng, ShiMin Li, Xiaonan Li, Pengyu Wang, Qipeng Guo, Hang Yan, Xipeng Qiu, Xuanjing Huang, Dahua Lin
In this paper, we hope to focus on evaluating and teaching LLMs to conduct inductive reasoning, that is, LLMs are supposed to infer underlying rules by observing examples or sequential transformations.
1 code implementation • 13 Jul 2024 • Shengbin Yue, Siyuan Wang, Wei Chen, Xuanjing Huang, Zhongyu Wei
Recent advancements in Large Language Models (LLMs) have led to significant breakthroughs in various natural language processing tasks.
no code implementations • 8 Jul 2024 • Shihan Dou, Haoxiang Jia, Shenxi Wu, Huiyuan Zheng, Weikang Zhou, Muling Wu, Mingxu Chai, Jessica Fan, Caishuang Huang, Yunbo Tao, Yan Liu, Enyu Zhou, Ming Zhang, Yuhao Zhou, Yueming Wu, Rui Zheng, Ming Wen, Rongxiang Weng, Jingang Wang, Xunliang Cai, Tao Gui, Xipeng Qiu, Qi Zhang, Xuanjing Huang
The increasing development of large language models (LLMs) in code generation has drawn significant attention among researchers.
no code implementations • 4 Jul 2024 • Shujun Liu, Xiaoyu Shen, Yuhang Lai, Siyuan Wang, Shengbin Yue, Zengfeng Huang, Xuanjing Huang, Zhongyu Wei
By decoupling the reward modeling procedure and incorporating hybrid supervision, our HaF-RM framework offers a principled and effective approach to enhancing the performance and alignment of reward models, a critical component in the responsible development of powerful language models.
1 code implementation • 1 Jul 2024 • Xiaohua Wang, Zhenghua Wang, Xuan Gao, Feiran Zhang, Yixin Wu, Zhibo Xu, Tianyuan Shi, Zhengyuan Wang, Shizheng Li, Qi Qian, Ruicheng Yin, Changze Lv, Xiaoqing Zheng, Xuanjing Huang
Retrieval-augmented generation (RAG) techniques have proven to be effective in integrating up-to-date information, mitigating hallucinations, and enhancing response quality, particularly in specialized domains.
1 code implementation • 1 Jul 2024 • Zisu Huang, Xiaohua Wang, Feiran Zhang, Zhibo Xu, Cenyuan Zhang, Xiaoqing Zheng, Xuanjing Huang
This strategy improves the quality of the queries, empowering LLMs to generate more truthful, benign and useful responses.
no code implementations • 29 Jun 2024 • Jiahao Ying, Mingbao Lin, Yixin Cao, Wei Tang, Bo wang, Qianru Sun, Xuanjing Huang, Shuicheng Yan
Inspired by the theory of "Learning from Errors", this framework employs an instructor LLM to meticulously analyze the specific errors within a target model, facilitating targeted and efficient training cycles.
1 code implementation • 26 Jun 2024 • Caishuang Huang, Wanxu Zhao, Rui Zheng, Huijie Lv, WenYu Zhan, Shihan Dou, Sixian Li, Xiao Wang, Enyu Zhou, Junjie Ye, Yuming Yang, Tao Gui, Qi Zhang, Xuanjing Huang
As the development of large language models (LLMs) rapidly advances, securing these models effectively without compromising their utility has become a pivotal area of research.
no code implementations • 24 Jun 2024 • Mianxin Liu, Jinru Ding, Jie Xu, Weiguo Hu, Xiaoyang Li, Lifeng Zhu, Zhian Bai, Xiaoming Shi, Benyou Wang, Haitao Song, PengFei Liu, Xiaofan Zhang, Shanshan Wang, Kang Li, Haofen Wang, Tong Ruan, Xuanjing Huang, Xin Sun, Shaoting Zhang
In this work, we introduce "MedBench", a comprehensive, standardized, and reliable benchmarking system for Chinese medical LLM.
1 code implementation • 22 Jun 2024 • Xingyu Lu, Xiaonan Li, Qinyuan Cheng, Kai Ding, Xuanjing Huang, Xipeng Qiu
We find that LLMs' fact knowledge capacity has a linear and negative exponential law relationship with model size and training epochs, respectively.
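The stated relationship can be written as a toy formula. This is an illustrative sketch under assumptions: the functional form (linear in size, saturating via a negative exponential in epochs) follows the sentence above, but the coefficients `a` and `b` are made-up placeholders, not fitted values from the paper.

```python
import math

def knowledge_capacity(model_size: float, epochs: float,
                       a: float = 1.0, b: float = 0.5) -> float:
    """Toy capacity law: linear in model size, saturating in epochs.

    capacity = a * size * (1 - exp(-b * epochs)); `a` and `b` are
    illustrative constants, not the paper's fitted coefficients.
    """
    return a * model_size * (1.0 - math.exp(-b * epochs))

# More epochs give diminishing returns; more parameters give
# proportional gains under this form.
print(knowledge_capacity(1e9, 1) < knowledge_capacity(1e9, 4))   # -> True
print(knowledge_capacity(2e9, 2) / knowledge_capacity(1e9, 2))   # -> 2.0
```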
1 code implementation • 21 Jun 2024 • Siyin Wang, Xingsong Ye, Qinyuan Cheng, Junwen Duan, ShiMin Li, Jinlan Fu, Xipeng Qiu, Xuanjing Huang
As Artificial General Intelligence (AGI) becomes increasingly integrated into various facets of human life, ensuring the safety and ethical alignment of such systems is paramount.
1 code implementation • 20 Jun 2024 • Qin Zhu, Qingyuan Cheng, Runyu Peng, Xiaonan Li, Tengxiao Liu, Ru Peng, Xipeng Qiu, Xuanjing Huang
On MMLU, using Inference-time Decontamination can lead to a decrease in the results of Phi3 and Mistral by 6.7% and 3.6%, respectively.
no code implementations • 20 Jun 2024 • Jingcong Liang, Junlong Wang, Xinyu Zhai, Yungui Zhuang, Yiyang Zheng, Xin Xu, Xiandong Ran, Xiaozheng Dong, Honghui Rong, Yanlun Liu, Hao Chen, Yuhan Wei, Donghai Li, Jiajie Peng, Xuanjing Huang, Chongde Shi, Yansong Feng, Yun Song, Zhongyu Wei
We give a detailed overview of the CAIL 2023 Argument Mining Track, one of the Chinese AI and Law Challenge (CAIL) 2023 tracks.
1 code implementation • 17 Jun 2024 • Yongting Zhang, Lu Chen, Guodong Zheng, Yifeng Gao, Rui Zheng, Jinlan Fu, Zhenfei Yin, Senjie Jin, Yu Qiao, Xuanjing Huang, Feng Zhao, Tao Gui, Jing Shao
To address these limitations, we propose a Safety Preference Alignment dataset for Vision Language Models named SPA-VL.
1 code implementation • 17 Jun 2024 • Yuming Yang, Wantong Zhao, Caishuang Huang, Junjie Ye, Xiao Wang, Huiyuan Zheng, Yang Nan, Yuran Wang, Xueying Xu, Kaixin Huang, Yunke Zhang, Tao Gui, Qi Zhang, Xuanjing Huang
First, we detect inconsistent entity definitions across datasets and clarify them by distinguishable label names to construct a universal taxonomy of 400+ entity types.
no code implementations • 16 Jun 2024 • Jianhao Zhu, Changze Lv, Xiaohua Wang, Muling Wu, Wenhao Liu, Tianlong Li, Zixuan Ling, Cenyuan Zhang, Xiaoqing Zheng, Xuanjing Huang
Conventional federated learning primarily aims to secure the privacy of data distributed across multiple edge devices, with the global model dispatched to edge devices for parameter updates during the learning process.
1 code implementation • 16 Jun 2024 • Rui Zheng, Hongyi Guo, Zhihan Liu, Xiaoying Zhang, Yuanshun Yao, Xiaojun Xu, Zhaoran Wang, Zhiheng Xi, Tao Gui, Qi Zhang, Xuanjing Huang, Hang Li, Yang Liu
We theoretically demonstrate that this iterative reinforcement learning optimization converges to a Nash Equilibrium for the game induced by the agents.
1 code implementation • 9 Jun 2024 • Mengfei Du, Binhao Wu, Zejun Li, Xuanjing Huang, Zhongyu Wei
The recent rapid development of Large Vision-Language Models (LVLMs) has indicated their potential for embodied tasks. However, the critical skill of spatial understanding in embodied environments has not been thoroughly evaluated, leaving the gap between current LVLMs and qualified embodied intelligence unknown.
1 code implementation • 6 Jun 2024 • Zhiheng Xi, Yiwen Ding, Wenxiang Chen, Boyang Hong, Honglin Guo, Junzhe Wang, Dingwen Yang, Chenyang Liao, Xin Guo, wei he, Songyang Gao, Lu Chen, Rui Zheng, Yicheng Zou, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang
Building generalist agents that can handle diverse tasks and evolve themselves across different environments is a long-term goal in the AI community.
1 code implementation • 23 May 2024 • Changze Lv, Dongqi Han, Yansen Wang, Xiaoqing Zheng, Xuanjing Huang, Dongsheng Li
Spiking neural networks (SNNs) represent a promising approach to developing artificial neural networks that are both energy-efficient and biologically plausible.
1 code implementation • 21 May 2024 • Zhangyue Yin, Qiushi Sun, Qipeng Guo, Zhiyuan Zeng, Xiaonan Li, Tianxiang Sun, Cheng Chang, Qinyuan Cheng, Ding Wang, Xiaofeng Mou, Xipeng Qiu, Xuanjing Huang
Recent advancements in Chain-of-Thought prompting have facilitated significant breakthroughs for Large Language Models (LLMs) in complex reasoning tasks.
no code implementations • 15 May 2024 • Qi Jia, Baoyu Fan, Cong Xu, Lu Liu, Liang Jin, Guoguang Du, Zhenhua Guo, YaQian Zhao, Xuanjing Huang, RenGang Li
In light of this, we introduce a novel research task, Multi-modal Sentiment Analysis for Comment Response of Video Induced (MSA-CRVI), which aims to infer opinions and emotions from the comment responses to micro videos.
1 code implementation • 5 May 2024 • Jun Zhao, Jingqi Tong, Yurong Mou, Ming Zhang, Qi Zhang, Xuanjing Huang
In this work, we investigate the compositionality of large language models (LLMs) in mathematical reasoning.
no code implementations • 1 May 2024 • Shihan Dou, Yan Liu, Enyu Zhou, Tianlong Li, Haoxiang Jia, Limao Xiong, Xin Zhao, Junjie Ye, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang
These two issues can be united as a challenge posed by the shifted distribution of the environment.
1 code implementation • 18 Apr 2024 • Jie Wang, Tao Ji, Yuanbin Wu, Hang Yan, Tao Gui, Qi Zhang, Xuanjing Huang, Xiaoling Wang
Generalizing to longer sentences is important for recent Transformer-based language models.
1 code implementation • 2 Apr 2024 • Mengfei Du, Binhao Wu, Jiwen Zhang, Zhihao Fan, Zejun Li, Ruipu Luo, Xuanjing Huang, Zhongyu Wei
For task completion, the agent needs to align and integrate various navigation modalities, including instruction, observation and navigation history.
1 code implementation • 1 Apr 2024 • wei he, Shichun Liu, Jun Zhao, Yiwen Ding, Yi Lu, Zhiheng Xi, Tao Gui, Qi Zhang, Xuanjing Huang
The generated demos strategically interpolate between existing demos and the given query, transforming the query from OOD to ID.
no code implementations • 24 Mar 2024 • Rui Zheng, Yuhao Zhou, Zhiheng Xi, Tao Gui, Qi Zhang, Xuanjing Huang
We first empirically show that the features of either clean signals or adversarial perturbations are redundant and span in low-dimensional linear subspaces respectively with minimal overlap, and the classical low-dimensional subspace projection can suppress perturbation features out of the subspace of clean signals.
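The subspace-projection defense described above can be sketched with an SVD. This is a minimal sketch under assumptions: the feature dimension, the rank `k`, and the synthetic data below are illustrative, and the clean features are constructed to lie exactly in a low-dimensional subspace so the effect is easy to see.

```python
import numpy as np

rng = np.random.default_rng(0)
k = 4                                         # assumed rank of clean subspace
basis_true = rng.normal(size=(k, 32))         # hidden basis of clean features
clean = rng.normal(size=(100, k)) @ basis_true  # clean features, rank k

# Estimate the clean subspace from clean features and build a projector.
_, _, vt = np.linalg.svd(clean, full_matrices=False)
P = vt[:k].T @ vt[:k]                         # projector onto top-k subspace

x_clean = rng.normal(size=k) @ basis_true     # a clean feature vector
perturb = rng.normal(size=32)                 # mostly outside the subspace
x_adv = x_clean + 0.5 * perturb               # adversarially perturbed feature

x_proj = x_adv @ P                            # suppress out-of-subspace parts
# Projection pulls the perturbed feature back toward the clean one.
print(np.linalg.norm(x_proj - x_clean) < np.linalg.norm(x_adv - x_clean))  # -> True
```

Because a random perturbation has most of its energy outside a low-dimensional subspace, projecting onto the clean subspace removes most of it while leaving the clean signal intact.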
1 code implementation • 18 Mar 2024 • Weikang Zhou, Xiao Wang, Limao Xiong, Han Xia, Yingshuang Gu, Mingxu Chai, Fukang Zhu, Caishuang Huang, Shihan Dou, Zhiheng Xi, Rui Zheng, Songyang Gao, Yicheng Zou, Hang Yan, Yifan Le, Ruohui Wang, Lijun Li, Jing Shao, Tao Gui, Qi Zhang, Xuanjing Huang
This paper introduces EasyJailbreak, a unified framework simplifying the construction and evaluation of jailbreak attacks against LLMs.
no code implementations • 17 Mar 2024 • Cenyuan Zhang, Xiaoqing Zheng, Ruicheng Yin, Shujie Geng, Jianhan Xu, Xuan Gao, Changze Lv, Zixuan Ling, Xuanjing Huang, Miao Cao, Jianfeng Feng
Deciphering natural language from brain activity through non-invasive devices remains a formidable challenge.
2 code implementations • 12 Mar 2024 • Jingcong Liang, Rong Ye, Meng Han, Ruofei Lai, Xinyu Zhang, Xuanjing Huang, Zhongyu Wei
How can we construct an automated debate judge to evaluate an extensive, vibrant, multi-turn debate?
1 code implementation • 11 Mar 2024 • Yuhang Lai, Siyuan Wang, Shujun Liu, Xuanjing Huang, Zhongyu Wei
We introduce ALaRM, the first framework modeling hierarchical rewards in reinforcement learning from human feedback (RLHF), which is designed to enhance the alignment of large language models (LLMs) with human preferences.
1 code implementation • 26 Feb 2024 • Xinyi Mou, Zhongyu Wei, Xuanjing Huang
Simulating the response of the public and forecasting the potential impact has become increasingly important.
1 code implementation • 26 Feb 2024 • Huijie Lv, Xiao Wang, Yuansen Zhang, Caishuang Huang, Shihan Dou, Junjie Ye, Tao Gui, Qi Zhang, Xuanjing Huang
Adversarial misuse, particularly through "jailbreaking" that circumvents a model's safety and ethical protocols, poses a significant challenge for Large Language Models (LLMs).
no code implementations • 26 Feb 2024 • Yuansen Zhang, Xiao Wang, Zhiheng Xi, Han Xia, Tao Gui, Qi Zhang, Xuanjing Huang
In this paper, drawing inspiration from recent findings that LLMs are sensitive to the design of instructions, we utilize instructions in code style, which are more structured and less ambiguous, to replace typical natural-language instructions.
2 code implementations • 23 Feb 2024 • Muling Wu, Wenhao Liu, Xiaohua Wang, Tianlong Li, Changze Lv, Zixuan Ling, Jianhao Zhu, Cenyuan Zhang, Xiaoqing Zheng, Xuanjing Huang
Parameter Efficient Fine-Tuning (PEFT) techniques have drawn significant attention due to their ability to yield competitive results while updating only a small portion of the adjustable parameters.
no code implementations • 22 Feb 2024 • Junjie Ye, Nuo Xu, Yikun Wang, Jie zhou, Qi Zhang, Tao Gui, Xuanjing Huang
To overcome the limitations of existing data augmentation methods that compromise semantic integrity and address the uncertainty inherent in LLM-generated text, we leverage the distinctive characteristics of the NER task by augmenting the original data at both the contextual and entity levels.
no code implementations • 22 Feb 2024 • Siyin Wang, Jie zhou, Qin Chen, Qi Zhang, Tao Gui, Xuanjing Huang
Domain adaptation has been widely adopted for cross-domain sentiment analysis to transfer knowledge from the source domain to the target domain.
1 code implementation • 22 Feb 2024 • Zhihao Zhang, Jun Zhao, Qi Zhang, Tao Gui, Xuanjing Huang
Furthermore, this core region exhibits significant dimensional dependence, with perturbations to even a single parameter on specific dimensions leading to a loss of linguistic competence.
1 code implementation • 22 Feb 2024 • Ningyu Xu, Qi Zhang, Menghan Zhang, Peng Qian, Xuanjing Huang
Here we re-purpose the reverse dictionary task as a case study to probe LLMs' capacity for conceptual inference.
no code implementations • 20 Feb 2024 • Xinnong Zhang, Haoyu Kuang, Xinyi Mou, Hanjia Lyu, Kun Wu, Siming Chen, Jiebo Luo, Xuanjing Huang, Zhongyu Wei
The powerful Large Vision Language Models make it possible to handle a variety of tasks simultaneously, but even with carefully designed prompting methods, the general domain models often fall short in aligning with the unique speaking style and context of social media tasks.
no code implementations • 19 Feb 2024 • Jiahao Ying, Yixin Cao, Yushi Bai, Qianru Sun, Bo wang, Wei Tang, Zhaojun Ding, Yizhe Yang, Xuanjing Huang, Shuicheng Yan
There are two updating strategies: 1) mimicking strategy to generate similar samples based on original data, preserving stylistic and contextual essence, and 2) extending strategy that further expands existing samples at varying cognitive levels by adapting Bloom's taxonomy of educational objectives.
1 code implementation • 18 Feb 2024 • Jun Zhao, Can Zu, Hao Xu, Yi Lu, wei he, Yiwen Ding, Tao Gui, Qi Zhang, Xuanjing Huang
Large language models (LLMs) have demonstrated impressive performance in understanding language and executing complex reasoning tasks.
no code implementations • 18 Feb 2024 • Nuo Xu, Jun Zhao, Can Zu, Sixian Li, Lu Chen, Zhihao Zhang, Rui Zheng, Shihan Dou, Wenjuan Qin, Tao Gui, Qi Zhang, Xuanjing Huang
To address this issue, we propose a cost-effective preference learning strategy, optimizing reward models by distinguishing between human and machine translations.
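The preference signal described here — reward models trained to distinguish human from machine translations — can be sketched as a standard pairwise (Bradley-Terry style) loss. The exact loss form is an assumption for illustration, not the paper's implementation.

```python
import math

def preference_loss(score_human, score_machine):
    """Pairwise logistic loss: minimized when the reward model assigns
    the human translation a higher score than the machine translation.
    (Illustrative; the paper's actual objective may differ.)"""
    return math.log1p(math.exp(-(score_human - score_machine)))
```

The loss shrinks as the margin between human and machine scores grows, so optimizing it pushes the reward model to prefer human translations.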
1 code implementation • 18 Feb 2024 • Siyuan Wang, Zhuohan Long, Zhihao Fan, Zhongyu Wei, Xuanjing Huang
Towards a more scalable, robust and fine-grained evaluation, we implement six reframing operations to construct evolving instances testing LLMs against diverse queries, data noise and probing their problem-solving sub-abilities.
no code implementations • 17 Feb 2024 • Siyin Wang, ShiMin Li, Tianxiang Sun, Jinlan Fu, Qinyuan Cheng, Jiasheng Ye, Junjie Ye, Xipeng Qiu, Xuanjing Huang
HAG extends the current paradigm of the text generation process, highlighting the feasibility of endowing LLMs with self-regulated decoding strategies.
1 code implementation • 16 Feb 2024 • Junjie Ye, Sixian Li, Guanyu Li, Caishuang Huang, Songyang Gao, Yilong Wu, Qi Zhang, Tao Gui, Xuanjing Huang
Tool learning is widely acknowledged as a foundational approach for deploying large language models (LLMs) in real-world scenarios.
1 code implementation • 16 Feb 2024 • Yi Lu, Xin Zhou, wei he, Jun Zhao, Tao Ji, Tao Gui, Qi Zhang, Xuanjing Huang
Instead of allowing each head to attend to the full sentence, which struggles with generalizing to longer sequences due to out-of-distribution (OOD) issues, we allow each head to process in-distribution length by selecting and attending to important context chunks.
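The chunk-selection idea in this snippet can be sketched as scoring fixed-size chunks of the context against the query and keeping only the top-scoring ones, so the attended length stays within the training distribution. This is a hedged sketch; the scoring rule and names are illustrative assumptions, not the paper's method.

```python
import numpy as np

def select_chunks(query, keys, chunk_size, n_chunks):
    """Score each fixed-size chunk of the key sequence by its mean
    similarity to the query, then keep only the top-scoring chunks
    (in their original order) for attention."""
    seq_len, dim = keys.shape
    n_total = seq_len // chunk_size
    chunks = keys[:n_total * chunk_size].reshape(n_total, chunk_size, dim)
    scores = (chunks @ query).mean(axis=1)         # one score per chunk
    top = np.sort(np.argsort(scores)[-n_chunks:])  # preserve order
    return chunks[top].reshape(-1, dim)
```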
1 code implementation • 8 Feb 2024 • Zhiheng Xi, Wenxiang Chen, Boyang Hong, Senjie Jin, Rui Zheng, wei he, Yiwen Ding, Shichun Liu, Xin Guo, Junzhe Wang, Honglin Guo, Wei Shen, Xiaoran Fan, Yuhao Zhou, Shihan Dou, Xiao Wang, Xinbo Zhang, Peng Sun, Tao Gui, Qi Zhang, Xuanjing Huang
In this paper, we propose R$^3$: Learning Reasoning through Reverse Curriculum Reinforcement Learning (RL), a novel method that employs only outcome supervision to achieve the benefits of process supervision for large language models.
1 code implementation • 3 Feb 2024 • Ruotian Ma, Xiaolei Wang, Xin Zhou, Jian Li, Nan Du, Tao Gui, Qi Zhang, Xuanjing Huang
Despite the success, the underlying mechanism of this approach remains unexplored, and the true effectiveness of LLMs as Prompt Optimizers requires further validation.
1 code implementation • 2 Feb 2024 • Shihan Dou, Yan Liu, Haoxiang Jia, Limao Xiong, Enyu Zhou, Wei Shen, Junjie Shan, Caishuang Huang, Xiao Wang, Xiaoran Fan, Zhiheng Xi, Yuhao Zhou, Tao Ji, Rui Zheng, Qi Zhang, Xuanjing Huang, Tao Gui
The advancement of large language models (LLMs) has significantly propelled the field of code generation.
1 code implementation • 2 Feb 2024 • Changze Lv, Yansen Wang, Dongqi Han, Xiaoqing Zheng, Xuanjing Huang, Dongsheng Li
In this paper, we propose a framework for SNNs in time-series forecasting tasks, leveraging the efficiency of spiking neurons in processing temporal information.
no code implementations • 31 Jan 2024 • Wei Chen, Hengxu Lin, Qun Zhang, Xiaojin Zhang, Xiang Bai, Xuanjing Huang, Zhongyu Wei
Emotional Support Conversation aims at reducing the seeker's emotional distress through supportive responses.
no code implementations • 31 Jan 2024 • Chenyu Shi, Xiao Wang, Qiming Ge, Songyang Gao, Xianjun Yang, Tao Gui, Qi Zhang, Xuanjing Huang, Xun Zhao, Dahua Lin
Large language models are meticulously aligned to be both helpful and harmless.
1 code implementation • 30 Jan 2024 • Xiaoran Fan, Tao Ji, Changhao Jiang, Shuo Li, Senjie Jin, Sirui Song, Junke Wang, Boyang Hong, Lu Chen, Guodong Zheng, Ming Zhang, Caishuang Huang, Rui Zheng, Zhiheng Xi, Yuhao Zhou, Shihan Dou, Junjie Ye, Hang Yan, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang
This technique introduces a fusion network to unify the processing of outputs from different visual experts, while bridging the gap between image encoders and pre-trained LLMs.
Ranked #124 on Visual Question Answering on MM-Vet
2 code implementations • 26 Jan 2024 • Yu Sun, Keyu Chen, Shujie Wang, Peiji Li, Qipeng Guo, Hang Yan, Xipeng Qiu, Xuanjing Huang, Dahua Lin
However, these evaluation benchmarks are limited to assessing the instruction-following capabilities, overlooking the fundamental abilities that emerge during the pre-training stage.
1 code implementation • 16 Jan 2024 • Junjie Ye, Yilong Wu, Songyang Gao, Caishuang Huang, Sixian Li, Guanyu Li, Xiaoran Fan, Qi Zhang, Tao Gui, Xuanjing Huang
To bridge this gap, we introduce RoTBench, a multi-level benchmark for evaluating the robustness of LLMs in tool learning.
no code implementations • 12 Jan 2024 • Tianlong Li, Zhenghua Wang, Wenhao Liu, Muling Wu, Shihan Dou, Changze Lv, Xiaohua Wang, Xiaoqing Zheng, Xuanjing Huang
The recent surge in jailbreaking attacks has revealed significant vulnerabilities in Large Language Models (LLMs) when exposed to malicious inputs.
1 code implementation • 11 Jan 2024 • Binghai Wang, Rui Zheng, Lu Chen, Yan Liu, Shihan Dou, Caishuang Huang, Wei Shen, Senjie Jin, Enyu Zhou, Chenyu Shi, Songyang Gao, Nuo Xu, Yuhao Zhou, Xiaoran Fan, Zhiheng Xi, Jun Zhao, Xiao Wang, Tao Ji, Hang Yan, Lixing Shen, Zhan Chen, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang
We introduce a series of novel methods to mitigate the influence of incorrect and ambiguous preferences in the dataset and fully leverage high-quality preference data.
no code implementations • 2 Jan 2024 • Jun Zhao, Zhihao Zhang, Luhui Gao, Qi Zhang, Tao Gui, Xuanjing Huang
In recent times, substantial advancements have been witnessed in large language models (LLMs), exemplified by ChatGPT, showcasing remarkable proficiency across a range of complex tasks.
1 code implementation • 1 Jan 2024 • Junjie Ye, Guanyu Li, Songyang Gao, Caishuang Huang, Yilong Wu, Sixian Li, Xiaoran Fan, Shihan Dou, Tao Ji, Qi Zhang, Tao Gui, Xuanjing Huang
Existing evaluations of tool learning primarily focus on validating the alignment of selected tools for large language models (LLMs) with expected outcomes.
1 code implementation • 26 Dec 2023 • Wenhao Liu, Xiaohua Wang, Muling Wu, Tianlong Li, Changze Lv, Zixuan Ling, Jianhao Zhu, Cenyuan Zhang, Xiaoqing Zheng, Xuanjing Huang
Aligning large language models (LLMs) with human preferences is crucial for enhancing their utility in terms of helpfulness, truthfulness, safety, harmlessness, and interestingness.
1 code implementation • 21 Dec 2023 • Jiayu Lin, Rong Ye, Meng Han, Qi Zhang, Ruofei Lai, Xinyu Zhang, Zhao Cao, Xuanjing Huang, Zhongyu Wei
The results show the competitiveness of our proposed framework and evaluator in counter-argument generation tasks.
no code implementations • 16 Dec 2023 • Jingyi Zhou, Jie zhou, Jiabao Zhao, Siyin Wang, Haijun Shan, Tao Gui, Qi Zhang, Xuanjing Huang
Few-shot text classification has attracted great interest in both academia and industry due to the lack of labeled data in many fields.
no code implementations • 16 Dec 2023 • Wei Chen, Gang Zhao, Xiaojin Zhang, Xiang Bai, Xuanjing Huang, Zhongyu Wei
Automatic psychological counseling requires extensive professional knowledge that can be found in online counseling forums.
1 code implementation • 15 Dec 2023 • Shihan Dou, Enyu Zhou, Yan Liu, Songyang Gao, Jun Zhao, Wei Shen, Yuhao Zhou, Zhiheng Xi, Xiao Wang, Xiaoran Fan, ShiLiang Pu, Jiang Zhu, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang
Supervised fine-tuning (SFT) is a crucial step for large language models (LLMs), enabling them to align with human instructions and enhance their capabilities in downstream tasks.
no code implementations • 12 Dec 2023 • Yue Zhang, Ming Zhang, Haipeng Yuan, Shichun Liu, Yongyao Shi, Tao Gui, Qi Zhang, Xuanjing Huang
The three crucial questions for LLM evaluation are "what, where, and how to evaluate".
1 code implementation • 4 Dec 2023 • Zhangyue Yin, Qiushi Sun, Cheng Chang, Qipeng Guo, Junqi Dai, Xuanjing Huang, Xipeng Qiu
Large Language Models (LLMs) have recently made significant strides in complex reasoning tasks through the Chain-of-Thought technique.
1 code implementation • 1 Dec 2023 • Jingcong Liang, Rong Ye, Meng Han, Qi Zhang, Ruofei Lai, Xinyu Zhang, Zhao Cao, Xuanjing Huang, Zhongyu Wei
In this paper, we propose the Hierarchical Argumentation Graph (Hi-ArG), a new structure to organize arguments.
no code implementations • 2 Nov 2023 • Xin Zhou, Yi Lu, Ruotian Ma, Tao Gui, Qi Zhang, Xuanjing Huang
Specifically, we introduce "security vectors", a few new parameters that can be separated from the LLM, to ensure the LLM's responses are consistent with the harmful behavior.
no code implementations • 25 Oct 2023 • Tianlong Li, Shihan Dou, Changze Lv, Wenhao Liu, Jianhan Xu, Muling Wu, Zixuan Ling, Xiaoqing Zheng, Xuanjing Huang
Users can utilize UBPL to adjust the probability vectors of predicted words in the decoding phase of LLMs, thus influencing the personality expression of LLMs.
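Adjusting the probability vectors of predicted words during decoding, as described here, amounts to adding a per-token bias to the logits before the softmax. This is a minimal sketch of that mechanism; the bias values and function name are illustrative assumptions, not UBPL's actual formulation.

```python
import numpy as np

def biased_sampling_probs(logits, bias):
    """Add a per-token bias to the logits before softmax, shifting the
    sampling distribution toward (or away from) the chosen words."""
    shifted = logits + bias
    shifted = shifted - shifted.max()  # numerical stability
    probs = np.exp(shifted)
    return probs / probs.sum()
```

A positive bias on a token raises its sampling probability relative to the unbiased distribution, which is the lever a user would pull to steer expressed personality.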
1 code implementation • 23 Oct 2023 • Wei Chen, Qiushi Wang, Zefei Long, Xianyin Zhang, Zhongtian Lu, Bingxuan Li, Siyuan Wang, Jiarong Xu, Xiang Bai, Xuanjing Huang, Zhongyu Wei
We propose Multiple Experts Fine-tuning Framework to build a financial large language model (LLM), DISC-FinLLM.
no code implementations • 23 Oct 2023 • Jun Zhao, Zhihao Zhang, Yide Ma, Qi Zhang, Tao Gui, Luhui Gao, Xuanjing Huang
We have discovered a core region in LLMs that corresponds to linguistic competence, accounting for approximately 1% of the total model parameters.
1 code implementation • 22 Oct 2023 • Xiao Wang, Tianze Chen, Qiming Ge, Han Xia, Rong Bao, Rui Zheng, Qi Zhang, Tao Gui, Xuanjing Huang
In this paper, we propose orthogonal low-rank adaptation (O-LoRA), a simple and efficient approach for continual learning in language models, effectively mitigating catastrophic forgetting while learning new tasks.
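The core constraint behind orthogonal low-rank adaptation can be sketched as a penalty on the overlap between the low-rank subspace learned for a new task and the frozen subspaces of previous tasks. The penalty form and variable names below are assumptions for illustration, not the O-LoRA implementation.

```python
import numpy as np

def orthogonality_penalty(A_new, A_prev_list):
    """Sum of squared inner products between the new task's LoRA basis
    columns and each previous task's basis; zero exactly when the new
    subspace is orthogonal to all previous ones."""
    return sum(float(np.sum((A_new.T @ A_prev) ** 2)) for A_prev in A_prev_list)
```

Adding such a term to the training loss discourages new-task updates from interfering with directions used by earlier tasks, which is the intuition for mitigating catastrophic forgetting.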
1 code implementation • 19 Oct 2023 • Ningyu Xu, Qi Zhang, Jingting Ye, Menghan Zhang, Xuanjing Huang
We then propose a meta-learning-based method to learn to align conceptual spaces of different languages, which facilitates zero-shot and few-shot generalization in concept classification and also offers insights into the cross-lingual in-context learning phenomenon.
no code implementations • 18 Oct 2023 • Rui Zheng, Wei Shen, Yuan Hua, Wenbin Lai, Shihan Dou, Yuhao Zhou, Zhiheng Xi, Xiao Wang, Haoran Huang, Tao Gui, Qi Zhang, Xuanjing Huang
In this work, we propose a novel approach that can learn a consistent policy via RL across various data groups or domains.
no code implementations • 17 Oct 2023 • Enyu Zhou, Rui Zheng, Zhiheng Xi, Songyang Gao, Xiaoran Fan, Zichu Fei, Jingting Ye, Tao Gui, Qi Zhang, Xuanjing Huang
Reports of human-like behaviors in foundation models are growing, with psychological theories providing enduring tools to investigate these behaviors.
1 code implementation • 14 Oct 2023 • Yuxin Wang, Xiannian Hu, Quan Gan, Xuanjing Huang, Xipeng Qiu, David Wipf
In contrast, edge-wise methods rely on the formation of edge-specific subgraph embeddings to enrich the representation of pair-wise relationships, disambiguating isomorphic nodes to improve accuracy, but with increased model complexity.
1 code implementation • 14 Oct 2023 • Junjie Ye, Jie zhou, Junfeng Tian, Rui Wang, Qi Zhang, Tao Gui, Xuanjing Huang
Recently, Target-oriented Multimodal Sentiment Classification (TMSC) has gained significant attention among scholars.
1 code implementation • 10 Oct 2023 • Xiao Wang, Yuansen Zhang, Tianze Chen, Songyang Gao, Senjie Jin, Xianjun Yang, Zhiheng Xi, Rui Zheng, Yicheng Zou, Tao Gui, Qi Zhang, Xuanjing Huang
In this paper, we introduce TRACE, a novel benchmark designed to evaluate continual learning in LLMs.
no code implementations • 10 Oct 2023 • Tianlong Li, Wenhao Liu, Changze Lv, Yufei Gu, Jianhan Xu, Cenyuan Zhang, Muling Wu, Xiaoqing Zheng, Xuanjing Huang
Spiking Neural Networks (SNNs) have emerged as a promising alternative to conventional Artificial Neural Networks (ANNs), demonstrating comparable performance in both visual and linguistic tasks while offering the advantage of improved energy efficiency.
no code implementations • 8 Oct 2023 • Wei Shen, Rui Zheng, WenYu Zhan, Jun Zhao, Shihan Dou, Tao Gui, Qi Zhang, Xuanjing Huang
Reinforcement learning from human feedback serves as a crucial bridge, aligning large language models with human and societal values.
1 code implementation • 4 Oct 2023 • Zejun Li, Ye Wang, Mengfei Du, Qingwen Liu, Binhao Wu, Jiwen Zhang, Chengxing Zhou, Zhihao Fan, Jie Fu, Jingjing Chen, Xuanjing Huang, Zhongyu Wei
Recent years have witnessed remarkable progress in the development of large vision-language models (LVLMs).
2 code implementations • 20 Sep 2023 • Shengbin Yue, Wei Chen, Siyuan Wang, Bingxuan Li, Chenchen Shen, Shujun Liu, Yuxuan Zhou, Yao Xiao, Song Yun, Xuanjing Huang, Zhongyu Wei
We propose DISC-LawLLM, an intelligent legal system utilizing large language models (LLMs) to provide a wide range of legal services.
1 code implementation • 14 Sep 2023 • Zhiheng Xi, Wenxiang Chen, Xin Guo, wei he, Yiwen Ding, Boyang Hong, Ming Zhang, Junzhe Wang, Senjie Jin, Enyu Zhou, Rui Zheng, Xiaoran Fan, Xiao Wang, Limao Xiong, Yuhao Zhou, Weiran Wang, Changhao Jiang, Yicheng Zou, Xiangyang Liu, Zhangyue Yin, Shihan Dou, Rongxiang Weng, Wensen Cheng, Qi Zhang, Wenjuan Qin, Yongyan Zheng, Xipeng Qiu, Xuanjing Huang, Tao Gui
Many efforts have been made to develop intelligent agents, but they mainly focus on advancement in algorithms or training strategies to enhance specific capabilities or performance on particular tasks.
1 code implementation • 29 Aug 2023 • Changze Lv, Tianlong Li, Jianhan Xu, Chenxi Gu, Zixuan Ling, Cenyuan Zhang, Xiaoqing Zheng, Xuanjing Huang
Spiking neural networks (SNNs) offer a promising avenue to implement deep neural networks in a more energy-efficient way.
1 code implementation • 28 Aug 2023 • Zhijie Bao, Wei Chen, Shengze Xiao, Kuang Ren, Jiaao Wu, Cheng Zhong, Jiajie Peng, Xuanjing Huang, Zhongyu Wei
We propose DISC-MedLLM, a comprehensive solution that leverages Large Language Models (LLMs) to provide accurate and truthful medical response in end-to-end conversational healthcare services.
1 code implementation • 11 Jul 2023 • Rui Zheng, Shihan Dou, Songyang Gao, Yuan Hua, Wei Shen, Binghai Wang, Yan Liu, Senjie Jin, Qin Liu, Yuhao Zhou, Limao Xiong, Lu Chen, Zhiheng Xi, Nuo Xu, Wenbin Lai, Minghao Zhu, Cheng Chang, Zhangyue Yin, Rongxiang Weng, Wensen Cheng, Haoran Huang, Tianxiang Sun, Hang Yan, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang
Therefore, we explore PPO-max, an advanced version of the PPO algorithm, to efficiently improve the training stability of the policy model.
1 code implementation • 27 Jun 2023 • Songyang Gao, Shihan Dou, Qi Zhang, Xuanjing Huang, Jin Ma, Ying Shan
Detecting adversarial samples that are carefully crafted to fool the model is a critical step to socially-secure applications.
1 code implementation • 16 Jun 2023 • Yuxin Wang, Quan Gan, Xipeng Qiu, Xuanjing Huang, David Wipf
Hypergraphs are a powerful abstraction for representing higher-order interactions between entities of interest.
1 code implementation • 8 Jun 2023 • Jun Zhao, Xin Zhao, WenYu Zhan, Qi Zhang, Tao Gui, Zhongyu Wei, Yunwen Chen, Xiang Gao, Xuanjing Huang
Inspired by text adversarial attacks, we adaptively apply small but critical perturbations to original training instances, thus synthesizing negative instances that are more likely to be mistaken by the model for known relations.
1 code implementation • 29 May 2023 • Zhangyue Yin, Qiushi Sun, Qipeng Guo, Jiawen Wu, Xipeng Qiu, Xuanjing Huang
Large language models (LLMs) have a wealth of knowledge that allows them to excel in various Natural Language Processing (NLP) tasks.
no code implementations • 26 May 2023 • Wei Chen, Shiqi Wei, Zhongyu Wei, Xuanjing Huang
Symptom diagnosis in medical conversations aims to correctly extract both symptom entities and their status from the doctor-patient dialogue.
1 code implementation • 23 May 2023 • Zhiheng Xi, Senjie Jin, Yuhao Zhou, Rui Zheng, Songyang Gao, Tao Gui, Qi Zhang, Xuanjing Huang
To enhance the multi-step reasoning capabilities of large language models, researchers have extensively explored prompting methods, notably the Chain-of-Thought (CoT) method which explicitly elicits human-like rationales.
1 code implementation • 23 May 2023 • Siyuan Wang, Zhongyu Wei, Meng Han, Zhihao Fan, Haijun Shan, Qi Zhang, Xuanjing Huang
The results demonstrate the effectiveness of our method on logical reasoning over KGs in both inductive and transductive settings.
1 code implementation • 21 May 2023 • Limao Xiong, Jie zhou, Qunxi Zhu, Xiao Wang, Yuanbin Wu, Qi Zhang, Tao Gui, Xuanjing Huang, Jin Ma, Ying Shan
Particularly, we propose a Confidence-based Partial Label Learning (CPLL) method to integrate the prior confidence (given by annotators) and posterior confidences (learned by models) for crowd-annotated NER.
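The combination of prior (annotator-given) and posterior (model-learned) confidences described here can be sketched as a simple interpolation over the candidate label set. The mixing rule below is an assumption for illustration, not the CPLL method itself.

```python
def combined_confidence(prior, posterior, alpha=0.5):
    """Interpolate annotator (prior) and model (posterior) confidences
    for each candidate label, then renormalize over the candidate set.
    `alpha` weights how much to trust the annotators."""
    mixed = {lbl: alpha * prior[lbl] + (1 - alpha) * posterior[lbl]
             for lbl in prior}
    total = sum(mixed.values())
    return {lbl: v / total for lbl, v in mixed.items()}
```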
1 code implementation • 20 May 2023 • Ting Wu, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang
Models trained with empirical risk minimization (ERM) are revealed to easily rely on spurious correlations, resulting in poor generalization.
no code implementations • 11 May 2023 • Ting Wu, Jingyi Liu, Rui Zheng, Qi Zhang, Tao Gui, Xuanjing Huang
The principle of continual relation extraction (CRE) involves adapting to emerging novel relations while preserving old knowledge.
1 code implementation • 9 May 2023 • Peng Li, Tianxiang Sun, Qiong Tang, Hang Yan, Yuanbin Wu, Xuanjing Huang, Xipeng Qiu
A common practice is to recast the task into a text-to-text format such that generative LLMs of natural language (NL-LLMs) like GPT-3 can be prompted to solve it.
no code implementations • 4 May 2023 • Songyang Gao, Shihan Dou, Junjie Shan, Qi Zhang, Xuanjing Huang
Dataset bias, i.e., the over-reliance on dataset-specific literal heuristics, is getting increasing attention for its detrimental effect on the generalization ability of NLU models.
no code implementations • 18 Mar 2023 • Junjie Ye, Xuanting Chen, Nuo Xu, Can Zu, Zekai Shao, Shichun Liu, Yuhan Cui, Zeyang Zhou, Chao Gong, Yang shen, Jie zhou, Siming Chen, Tao Gui, Qi Zhang, Xuanjing Huang
GPT series models, such as GPT-3, CodeX, InstructGPT, ChatGPT, and so on, have gained considerable attention due to their exceptional natural language processing capabilities.
no code implementations • 1 Mar 2023 • Xuanting Chen, Junjie Ye, Can Zu, Nuo Xu, Rui Zheng, Minlong Peng, Jie zhou, Tao Gui, Qi Zhang, Xuanjing Huang
The GPT-3.5 models have demonstrated impressive performance in various Natural Language Processing (NLP) tasks, showcasing their strong understanding and reasoning capabilities.
Natural Language Inference • Natural Language Understanding • +1
1 code implementation • 21 Dec 2022 • Ningyu Xu, Tao Gui, Ruotian Ma, Qi Zhang, Jingting Ye, Menghan Zhang, Xuanjing Huang
We demonstrate that the distance between the distributions of different languages is highly consistent with the syntactic difference in terms of linguistic formalisms.
2 code implementations • 19 Dec 2022 • Zhangyue Yin, Yuxin Wang, Xiannian Hu, Yiguang Wu, Hang Yan, Xinyu Zhang, Zhao Cao, Xuanjing Huang, Xipeng Qiu
Multi-Hop Question Answering (MHQA) is a significant area in question answering, requiring multiple reasoning components, including document retrieval, supporting sentence prediction, and answer span extraction.
1 code implementation • 28 Nov 2022 • Zhengfu He, Tianxiang Sun, Kuanning Wang, Xuanjing Huang, Xipeng Qiu
We present DiffusionBERT, a new generative masked language model based on discrete diffusion models.
1 code implementation • 14 Nov 2022 • Zhiheng Xi, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang
Adversarial training is one of the most powerful methods to improve the robustness of pre-trained language models (PLMs).
2 code implementations • ACL 2022 • Rui Zheng, Rong Bao, Yuhao Zhou, Di Liang, Sirui Wang, Wei Wu, Tao Gui, Qi Zhang, Xuanjing Huang
Recent works on the Lottery Ticket Hypothesis have shown that pre-trained language models (PLMs) contain smaller matching subnetworks (winning tickets) which are capable of reaching accuracy comparable to the original models.
1 code implementation • 20 Oct 2022 • Xiangyang Liu, Tianxiang Sun, Xuanjing Huang, Xipeng Qiu
Through extensive experimental results across various tasks and PTMs, we show that LPT can achieve competitive performance to full model tuning and other PETuning methods under both full-data and few-shot scenarios while possessing faster training speed and lower memory cost.
2 code implementations • 14 Oct 2022 • Songyang Gao, Shihan Dou, Qi Zhang, Xuanjing Huang
Dataset bias has attracted increasing attention recently for its detrimental effect on the generalization ability of fine-tuned models.
1 code implementation • 14 Oct 2022 • Tianxiang Sun, Junliang He, Xipeng Qiu, Xuanjing Huang
Automatic evaluation metrics are crucial to the development of generative systems.
1 code implementation • 14 Oct 2022 • Tianxiang Sun, Zhengfu He, Qin Zhu, Xipeng Qiu, Xuanjing Huang
MP2 is a set of combinable prompts pre-trained on 38 Chinese tasks.
1 code implementation • COLING 2022 • Chenxin An, Ming Zhong, Zhiyong Wu, Qin Zhu, Xuanjing Huang, Xipeng Qiu
Traditional training paradigms for extractive and abstractive summarization systems always only use token-level or sentence-level training objectives.
no code implementations • COLING 2022 • Jie zhou, Qi Zhang, Qin Chen, Liang He, Xuanjing Huang
Event argument extraction (EAE) aims to extract arguments with given roles from texts, a task that has been widely studied in natural language processing.
1 code implementation • COLING 2022 • Siyuan Wang, Zhongyu Wei, Zhihao Fan, Qi Zhang, Xuanjing Huang
In this paper, we propose an interpretable stepwise reasoning framework to incorporate both single-hop supporting sentence identification and single-hop question generation at each intermediate step, and utilize the inference of the current hop for the next until reasoning out the final result.
no code implementations • COLING 2022 • Siyin Wang, Jie zhou, Changzhi Sun, Junjie Ye, Tao Gui, Qi Zhang, Xuanjing Huang
In this work, we propose a causal intervention model for Implicit Sentiment Analysis using Instrumental Variable (ISAIV).
no code implementations • 11 Jun 2022 • Zhihao Fan, Zhongyu Wei, Jingjing Chen, Siyuan Wang, Zejun Li, Jiarong Xu, Xuanjing Huang
These two steps are iteratively performed in our framework for continuous learning.
2 code implementations • 29 May 2022 • Chenxin An, Jiangtao Feng, Kai Lv, Lingpeng Kong, Xipeng Qiu, Xuanjing Huang
We validate CoNT on five generation tasks with ten benchmarks, including machine translation, summarization, code comment generation, data-to-text generation and commonsense generation.
1 code implementation • 27 May 2022 • Yuxin Wang, Chu-Tak Lee, Qipeng Guo, Zhangyue Yin, Yunhua Zhou, Xuanjing Huang, Xipeng Qiu
Transformers have made progress in miscellaneous tasks, but suffer from quadratic computational and memory complexities.
1 code implementation • 23 May 2022 • Tianxiang Sun, Zhengfu He, Hong Qian, Yunhua Zhou, Xuanjing Huang, Xipeng Qiu
By contrast, gradient-free methods only require the forward computation of the PTM to tune the prompt, retaining the benefits of efficient tuning and deployment.
1 code implementation • 19 Apr 2022 • Wei Chen, Zhiwei Li, Hongyi Fang, Qianyuan Yao, Cheng Zhong, Jianye Hao, Qi Zhang, Xuanjing Huang, Jiajie Peng, Zhongyu Wei
In recent years, interest has arisen in using machine learning to improve the efficiency of automatic medical consultation and enhance patient experience.
2 code implementations • ACL 2022 • Xiao Wang, Shihan Dou, Limao Xiong, Yicheng Zou, Qi Zhang, Tao Gui, Liang Qiao, Zhanzhan Cheng, Xuanjing Huang
NER model has achieved promising performance on standard NER benchmarks.
Ranked #10 on Named Entity Recognition (NER) on WNUT 2017
1 code implementation • Findings (ACL) 2022 • Tianxiang Sun, Xiangyang Liu, Wei Zhu, Zhichao Geng, Lingling Wu, Yilong He, Yuan Ni, Guotong Xie, Xuanjing Huang, Xipeng Qiu
Previous works usually adopt heuristic metrics such as the entropy of internal outputs to measure instance difficulty, which suffers from generalization and threshold-tuning.
2 code implementations • COLING 2022 • Shihan Dou, Rui Zheng, Ting Wu, Songyang Gao, Junjie Shan, Qi Zhang, Yueming Wu, Xuanjing Huang
Most existing debiasing methods identify and weaken samples with biased features (i.e., superficial surface features that cause such spurious correlations).
1 code implementation • 29 Jan 2022 • Zejun Li, Zhihao Fan, Huaixiao Tou, Jingjing Chen, Zhongyu Wei, Xuanjing Huang
In MVPTR, we follow the nested structure of both modalities to introduce concepts as high-level semantics.
2 code implementations • 10 Jan 2022 • Tianxiang Sun, Yunfan Shao, Hong Qian, Xuanjing Huang, Xipeng Qiu
In such a scenario, which we call Language-Model-as-a-Service (LMaaS), the gradients of PTMs are usually unavailable.
no code implementations • 14 Oct 2021 • Xin Zhou, Ruotian Ma, Tao Gui, Yiding Tan, Qi Zhang, Xuanjing Huang
Specifically, for each task, a label word set is first constructed by selecting a high-frequency word for each class respectively, and then, task-specific vectors are inserted into the inputs and optimized to manipulate the model predictions towards the corresponding label words.
1 code implementation • NAACL 2022 • Xiangyang Liu, Tianxiang Sun, Junliang He, Jiawen Wu, Lingling Wu, Xinyu Zhang, Hao Jiang, Zhao Cao, Xuanjing Huang, Xipeng Qiu
ELUE is dedicated to depict the Pareto Frontier for various language understanding tasks, such that it can tell whether and how much a method achieves Pareto improvement.
1 code implementation • 6 Oct 2021 • Linyang Li, Demin Song, Ruotian Ma, Xipeng Qiu, Xuanjing Huang
Pre-trained models are widely used in fine-tuning downstream tasks with linear classifiers optimized by the cross-entropy loss, which might face robustness and stability problems.
1 code implementation • NAACL 2022 • Ruotian Ma, Xin Zhou, Tao Gui, Yiding Tan, Linyang Li, Qi Zhang, Xuanjing Huang
Prompt-based methods have been successfully applied in sentence-level few-shot learning tasks, mostly owing to the sophisticated design of templates and label words.
1 code implementation • 26 Sep 2021 • Tianxiang Sun, Xiangyang Liu, Xipeng Qiu, Xuanjing Huang
In this paper, we review such phenomenon of paradigm shifts in recent years, highlighting several paradigms that have the potential to solve different NLP tasks.
no code implementations • 12 Sep 2021 • Zhihao Fan, Zhongyu Wei, Zejun Li, Siyuan Wang, Haijun Shan, Xuanjing Huang, Jianqing Fan
Existing research for image text retrieval mainly relies on sentence-level supervision to distinguish matched and mismatched sentences for a query image.
no code implementations • 10 Sep 2021 • Yitao Liu, Tianxiang Sun, Xipeng Qiu, Xuanjing Huang
This one-way interaction leads to the teacher's inability to perceive the characteristics of the student and its training progress.
no code implementations • 19 Aug 2021 • Tong Liu, Siyuan Wang, Jingchao Fu, Lei Chen, Zhongyu Wei, Yaqi Liu, Heng Ye, Liaosa Xu, Weiqiang Wan, Xuanjing Huang
Existing systems dealing with online complaints provide a final decision without explanations.
1 code implementation • ACL 2021 • Yi Zhou, Xiaoqing Zheng, Cho-Jui Hsieh, Kai-Wei Chang, Xuanjing Huang
Although deep neural networks have achieved prominent performance on many NLP tasks, they are vulnerable to adversarial examples.
no code implementations • ACL 2021 • Xinyi Mou, Zhongyu Wei, Lei Chen, Shangyi Ning, Yancheng He, Changjian Jiang, Xuanjing Huang
In addition, we propose a novel task, namely hashtag usage prediction to model the ideology of legislators on Twitter.
1 code implementation • ACL 2021 • Qinzhuo Wu, Qi Zhang, Zhongyu Wei, Xuanjing Huang
In recent years, math word problem solving has received considerable attention and achieved promising results, but previous methods rarely take numerical values into consideration.
1 code implementation • ACL 2021 • Ruotian Ma, Tao Gui, Linyang Li, Qi Zhang, Yaqian Zhou, Xuanjing Huang
In this work, we propose the use of negative training (NT), in which a model is trained using complementary labels indicating that "the instance does not belong to these complementary labels".