no code implementations • dialdoc (ACL) 2022 • Tianda Li, Jia-Chen Gu, Zhen-Hua Ling, Quan Liu
When multiple conversations occur simultaneously, a listener must decide which conversation each utterance is part of in order to interpret and respond to it appropriately.
1 code implementation • 29 Jul 2024 • Canyu Chen, Baixiang Huang, Zekun Li, Zhaorun Chen, Shiyang Lai, Xiongxiao Xu, Jia-Chen Gu, Jindong Gu, Huaxiu Yao, Chaowei Xiao, Xifeng Yan, William Yang Wang, Philip Torr, Dawn Song, Kai Shu
Then, we find that editing attacks can inject both types of misinformation into LLMs, and the effectiveness is particularly high for commonsense misinformation injection.
no code implementations • 22 Jul 2024 • Mengru Wang, Yunzhi Yao, Ziwen Xu, Shuofei Qiao, Shumin Deng, Peng Wang, Xiang Chen, Jia-Chen Gu, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen, Ningyu Zhang
Understanding knowledge mechanisms in Large Language Models (LLMs) is crucial for advancing towards trustworthy AGI.
no code implementations • 19 Jun 2024 • Di Wu, Jia-Chen Gu, Fan Yin, Nanyun Peng, Kai-Wei Chang
Retrieval-augmented language models (RALMs) have shown strong performance and wide applicability in knowledge-intensive tasks.
no code implementations • 27 May 2024 • Jun-Yu Ma, Hong Wang, Hao-Xiang Xu, Zhen-Hua Ling, Jia-Chen Gu
In this paper, we first theoretically analyze that the factor affecting the general abilities in sequential model editing lies in the condition number of the edited matrix.
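The role of the condition number can be illustrated numerically: a matrix whose condition number grows becomes increasingly sensitive to perturbation, which is why repeated edits can erode general abilities. Below is a minimal sketch (not the paper's analysis; the 2x2 matrix and the shrink factor are arbitrary illustrative choices) computing the condition number of a matrix before and after a crude "edit":

```python
import math

def cond_2x2(a, b, c, d):
    """Condition number of the 2x2 matrix [[a, b], [c, d]] via its singular values."""
    s = a*a + b*b + c*c + d*d            # trace of A^T A = smax^2 + smin^2
    det = abs(a*d - b*c)                 # |det A| = smax * smin
    disc = math.sqrt(max(s*s - 4*det*det, 0.0))
    smax = math.sqrt((s + disc) / 2)
    smin = math.sqrt(max((s - disc) / 2, 1e-300))
    return smax / smin

identity_cond = cond_2x2(1, 0, 0, 1)     # perfectly conditioned: kappa = 1
edited_cond = cond_2x2(1, 0, 0, 0.01)    # an "edit" squashing one direction: kappa = 100
```

A condition number of 100 means small input perturbations can be amplified a hundredfold, a toy analogue of the instability the paper attributes to sequentially edited weight matrices.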
no code implementations • 25 May 2024 • Yun Zhu, Jia-Chen Gu, Caitlin Sikora, Ho Ko, Yinxiao Liu, Chu-Cheng Lin, Lei Shu, Liangchen Luo, Lei Meng, Bang Liu, Jindong Chen
However, the input length grows linearly in the number of retrieved documents, causing a dramatic increase in latency.
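The linear growth is visible by construction: concatenating k retrieved documents ahead of the query makes the prompt length scale with k. A toy illustration (synthetic strings, no real retriever or tokenizer):

```python
# Toy illustration: a naive RAG prompt grows linearly with the number of retrieved docs.
doc = "retrieved passage text. " * 20        # stand-in for one fixed-length document
query = "Q: what does the paper propose?"

def prompt_length(k):
    # Concatenate k documents ahead of the query, as a naive RAG prompt would.
    return len(doc * k + query)

lengths = [prompt_length(k) for k in (1, 2, 4, 8)]
```

Each doubling of retrieved documents adds a proportional number of characters (and hence tokens), which is the latency source the paper targets.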
no code implementations • 15 Mar 2024 • Qian Wang, Jia-Chen Gu, Zhen-Hua Ling
Audio-text retrieval (ATR), which retrieves a relevant caption given an audio clip (A2T) and vice versa (T2A), has recently attracted much research attention.
1 code implementation • 31 Jan 2024 • Jun-Yu Ma, Zhen-Hua Ling, Ningyu Zhang, Jia-Chen Gu
A metric of additivity is introduced, and a benchmark dubbed Perturbation Evaluation of Appending Knowledge (PEAK) is constructed to evaluate the degree of perturbation to neighboring knowledge when appending new knowledge.
2 code implementations • 29 Jan 2024 • Shi-Qi Yan, Jia-Chen Gu, Yun Zhu, Zhen-Hua Ling
Experiments on four datasets covering short- and long-form generation tasks show that CRAG can significantly improve the performance of RAG-based approaches.
1 code implementation • 13 Jan 2024 • Zhen Li, Xiaohan Xu, Tao Shen, Can Xu, Jia-Chen Gu, Yuxuan Lai, Chongyang Tao, Shuai Ma
In the rapidly evolving domain of Natural Language Generation (NLG) evaluation, the introduction of Large Language Models (LLMs) has opened new avenues for assessing the quality of generated content, e.g., coherence, creativity, and context relevance.
1 code implementation • 9 Jan 2024 • Jia-Chen Gu, Hao-Xiang Xu, Jun-Yu Ma, Pan Lu, Zhen-Hua Ling, Kai-Wei Chang, Nanyun Peng
While current model editing methods can effectively modify a model's behavior within a specific area of interest, they often overlook the potential unintended side effects on the general abilities of LLMs such as reasoning, natural language inference, and question answering.
2 code implementations • 2 Jan 2024 • Ningyu Zhang, Yunzhi Yao, Bozhong Tian, Peng Wang, Shumin Deng, Mengru Wang, Zekun Xi, Shengyu Mao, Jintian Zhang, Yuansheng Ni, Siyuan Cheng, Ziwen Xu, Xin Xu, Jia-Chen Gu, Yong Jiang, Pengjun Xie, Fei Huang, Lei Liang, Zhiqiang Zhang, Xiaowei Zhu, Jun Zhou, Huajun Chen
In this paper, we first define the knowledge editing problem and then provide a comprehensive review of cutting-edge approaches.
Ranked #1 on knowledge editing on zsRE (using extra training data)
1 code implementation • 25 Oct 2023 • Chao-Hong Tan, Jia-Chen Gu, Zhen-Hua Ling
Large Language Models (LLMs) have emerged as influential instruments within the realm of natural language processing; nevertheless, their capacity to handle multi-party conversations (MPCs) -- a scenario marked by the presence of multiple interlocutors involved in intricate information exchanges -- remains uncharted.
1 code implementation • 16 Oct 2023 • Jun-Yu Ma, Jia-Chen Gu, Zhen-Hua Ling, Quan Liu, Cong Liu
A new evaluation metric of reversibility is introduced, and a benchmark dubbed Bidirectional Assessment for Knowledge Editing (BAKE) is constructed to evaluate the reversibility of edited models in recalling knowledge in the reverse direction of editing.
1 code implementation • 22 May 2023 • Jia-Chen Gu, Chao-Hong Tan, Caiyuan Chu, Zhen-Hua Ling, Chongyang Tao, Quan Liu, Cong Liu
Given an MPC with a few addressee labels missing, existing methods fail to build a consecutively connected conversation graph, producing only a few separate conversation fragments instead.
no code implementations • 21 May 2023 • Jun-Yu Ma, Jia-Chen Gu, Zhen-Hua Ling, Quan Liu, Cong Liu, Guoping Hu
The proposed encoder interactively captures complementary information between features and contextual information, deriving language-agnostic representations for various IE tasks.
no code implementations • 19 May 2023 • Chao-Hong Tan, Jia-Chen Gu, Zhen-Hua Ling
In fact, the encoder-decoder architecture is naturally more flexible thanks to its detachable encoder and decoder modules, making it extensible to multilingual and multimodal generation tasks over both conditions and target texts.
1 code implementation • 16 May 2023 • Jia-Chen Gu, Zhen-Hua Ling, Quan Liu, Cong Liu, Guoping Hu
Addressing the issues of who says what to whom in multi-party conversations (MPCs) has recently attracted a lot of research attention.
no code implementations • 4 May 2023 • Jun-Yu Ma, Jia-Chen Gu, Jiajun Qi, Zhen-Hua Ling, Quan Liu, Xiaoyi Zhao
A method named Statistical Construction and Dual Adaptation of Gazetteer (SCDAG) is proposed for Multilingual Complex NER.
no code implementations • 9 Mar 2023 • Caiyuan Chu, Ya Li, Yifan Liu, Jia-Chen Gu, Quan Liu, Yongxin Ge, Guoping Hu
The key to automatic intention induction is that, for any given set of new data, the sentence representations obtained by the model can be clearly distinguished across different labels.
1 code implementation • 7 Dec 2022 • Jun-Yu Ma, Beiduo Chen, Jia-Chen Gu, Zhen-Hua Ling, Wu Guo, Quan Liu, Zhigang Chen, Cong Liu
In this study, a mixture of short-channel distillers (MSD) method is proposed to allow full interaction with the rich hierarchical information in the teacher model and to transfer knowledge to the student model sufficiently and efficiently.
1 code implementation • Findings (ACL) 2022 • Chao-Hong Tan, Jia-Chen Gu, Chongyang Tao, Zhen-Hua Ling, Can Xu, Huang Hu, Xiubo Geng, Daxin Jiang
To address the problem, we propose augmenting TExt Generation via Task-specific and Open-world Knowledge (TegTok) in a unified framework.
1 code implementation • ACL 2022 • Jia-Chen Gu, Chao-Hong Tan, Chongyang Tao, Zhen-Hua Ling, Huang Hu, Xiubo Geng, Daxin Jiang
To address these challenges, we present HeterMPC, a heterogeneous graph-based neural network for response generation in MPCs which models the semantics of utterances and interlocutors simultaneously with two types of nodes in a graph.
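The two-node-type design can be sketched with a toy graph builder (the function and edge names here are hypothetical illustrations, not HeterMPC's actual implementation): utterance nodes and interlocutor nodes are kept as distinct types, connected by typed "spoke" and "addressed" edges.

```python
from collections import defaultdict

def build_mpc_graph(utterances):
    """Build a toy heterogeneous graph from (speaker, addressee, text) tuples."""
    graph = {"utterance": [], "interlocutor": set(), "edges": defaultdict(list)}
    for i, (speaker, addressee, text) in enumerate(utterances):
        graph["utterance"].append(text)                     # one node per utterance
        graph["interlocutor"].update([speaker, addressee])  # one node per participant
        graph["edges"]["spoke"].append((speaker, i))        # interlocutor -> utterance
        graph["edges"]["addressed"].append((i, addressee))  # utterance -> interlocutor
    return graph

g = build_mpc_graph([("A", "B", "hi"), ("B", "A", "hello"), ("C", "A", "hey all")])
```

Keeping the two node types separate is what lets a heterogeneous GNN learn distinct message-passing functions for utterance semantics and speaker/addressee structure.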
1 code implementation • EMNLP 2021 • Jia-Chen Gu, Zhen-Hua Ling, Yu Wu, Quan Liu, Zhigang Chen, Xiaodan Zhu
This is a many-to-many semantic matching task because both contexts and personas in SPD are composed of multiple sentences.
1 code implementation • ACL 2021 • Jia-Chen Gu, Chongyang Tao, Zhen-Hua Ling, Can Xu, Xiubo Geng, Daxin Jiang
Recently, various neural models for multi-party conversation (MPC) have achieved impressive improvements on a variety of tasks such as addressee recognition, speaker identification and response prediction.
1 code implementation • 19 May 2021 • Jia-Chen Gu, Hui Liu, Zhen-Hua Ling, Quan Liu, Zhigang Chen, Xiaodan Zhu
Empirical studies on the Persona-Chat dataset show that the partner personas neglected in previous studies can improve the accuracy of response selection in the IMN- and BERT-based models.
1 code implementation • 22 Dec 2020 • Chao-Hong Tan, Xiaoyu Yang, Zi'ou Zheng, Tianda Li, Yufei Feng, Jia-Chen Gu, Quan Liu, Dan Liu, Zhen-Hua Ling, Xiaodan Zhu
Task-oriented conversational modeling with unstructured knowledge access, as track 1 of the 9th Dialogue System Technology Challenges (DSTC 9), requires building a system that generates responses given the dialogue history and access to unstructured knowledge.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Jia-Chen Gu, Zhen-Hua Ling, Quan Liu, Zhigang Chen, Xiaodan Zhu
The challenges of building knowledge-grounded retrieval-based chatbots lie in how to ground a conversation on its background knowledge and how to match response candidates with both context and knowledge simultaneously.
1 code implementation • 8 Apr 2020 • Tianda Li, Jia-Chen Gu, Xiaodan Zhu, Quan Liu, Zhen-Hua Ling, Zhiming Su, Si Wei
Disentanglement is a problem in which multiple conversations occur in the same channel simultaneously, and a listener must decide which utterances belong to the conversation they will respond to.
2 code implementations • 7 Apr 2020 • Jia-Chen Gu, Tianda Li, Quan Liu, Zhen-Hua Ling, Zhiming Su, Si Wei, Xiaodan Zhu
In this paper, we study the problem of employing pre-trained language models for multi-turn response selection in retrieval-based chatbots.
no code implementations • 4 Apr 2020 • Jia-Chen Gu, Tianda Li, Quan Liu, Xiaodan Zhu, Zhen-Hua Ling, Yu-Ping Ruan
The NOESIS II challenge, Track 2 of the 8th Dialogue System Technology Challenges (DSTC 8), is an extension of DSTC 7.
Ranked #1 on Conversation Disentanglement on irc-disentanglement
no code implementations • 1 Feb 2020 • Yu-Ping Ruan, Zhen-Hua Ling, Jia-Chen Gu, Quan Liu
We present our work on Track 4 in the Dialogue System Technology Challenges 8 (DSTC8).
1 code implementation • 16 Nov 2019 • Jia-Chen Gu, Zhen-Hua Ling, Quan Liu
The distances between context and response utterances are employed as a prior component when calculating the attention weights.
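The mechanism can be sketched in a few lines: subtract a distance-scaled penalty from the raw attention scores before the softmax, so nearer utterances receive a larger prior weight. This is a minimal sketch of the idea, not the paper's exact formulation (the linear decay and its coefficient are illustrative assumptions):

```python
import math

def distance_aware_attention(scores, distances, decay=0.5):
    """Softmax over scores after adding a distance-based prior (closer = larger)."""
    primed = [s - decay * d for s, d in zip(scores, distances)]
    m = max(primed)                          # shift for numerical stability
    exps = [math.exp(p - m) for p in primed]
    total = sum(exps)
    return [e / total for e in exps]

# Three utterances with equal scores; the last one is closest to the response.
weights = distance_aware_attention([1.0, 1.0, 1.0], [2, 1, 0])
```

With equal raw scores, the prior alone breaks the tie in favor of the nearest utterance, which is the intuition behind distance-aware attention.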
Ranked #10 on Conversational Response Selection on E-commerce
1 code implementation • IJCNLP 2019 • Jia-Chen Gu, Zhen-Hua Ling, Xiaodan Zhu, Quan Liu
Compared with previous persona fusion approaches which enhance the representation of a context by calculating its similarity with a given persona, the DIM model adopts a dual matching architecture, which performs interactive matching between responses and contexts and between responses and personas respectively for ranking response candidates.
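The dual matching idea can be sketched with a crude word-overlap scorer standing in for the neural matching modules (everything below is an illustrative assumption, not the DIM model itself): each response candidate is scored against the context and against the persona separately, and the two scores are combined for ranking.

```python
def overlap(a, b):
    """Jaccard word overlap as a stand-in for a learned matching score."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / max(len(wa | wb), 1)

def dual_match_score(response, context, persona):
    # Dual matching: response-context score plus response-persona score.
    return overlap(response, context) + overlap(response, persona)

def rank(candidates, context, persona):
    return sorted(candidates, key=lambda r: dual_match_score(r, context, persona),
                  reverse=True)

ranked = rank(["yes i have two dogs", "the weather is nice"],
              context="do you have pets", persona="i have two dogs")
```

The candidate that matches the persona as well as the context rises to the top, which is the ranking behavior the dual architecture is designed to produce.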
no code implementations • 27 Jan 2019 • Yu-Ping Ruan, Zhen-Hua Ling, Quan Liu, Jia-Chen Gu, Xiaodan Zhu
At this stage, two different models are proposed, i.e., a variational generative (VariGen) model and a retrieval-based (Retrieval) model.
1 code implementation • 7 Jan 2019 • Jia-Chen Gu, Zhen-Hua Ling, Quan Liu
In this paper, we propose an interactive matching network (IMN) for the multi-turn response selection task.
Ranked #9 on Conversational Response Selection on E-commerce
1 code implementation • 3 Dec 2018 • Jia-Chen Gu, Zhen-Hua Ling, Yu-Ping Ruan, Quan Liu
This paper presents an end-to-end response selection model for Track 1 of the 7th Dialogue System Technology Challenges (DSTC7).
Ranked #5 on Conversational Response Selection on DSTC7 Ubuntu