no code implementations • Findings (EMNLP) 2021 • Yuejie Lei, Fujia Zheng, Yuanmeng Yan, Keqing He, Weiran Xu
Although abstractive summarization models have achieved impressive results on document summarization tasks, their performance on dialogues is much less satisfactory, owing to crude and simplistic methods for dialogue encoding.
Abstractive Dialogue Summarization
Abstractive Text Summarization
1 code implementation • ACL 2022 • Yutao Mou, Keqing He, Yanan Wu, Zhiyuan Zeng, Hong Xu, Huixing Jiang, Wei Wu, Weiran Xu
Discovering Out-of-Domain (OOD) intents is essential for developing new skills in a task-oriented dialogue system.
no code implementations • EMNLP 2020 • Yuanmeng Yan, Keqing He, Hong Xu, Sihong Liu, Fanyu Meng, Min Hu, Weiran Xu
Open-vocabulary slots, such as file name, album name, or schedule title, significantly degrade the performance of neural-based slot filling models, since these slots can take values from a virtually unlimited set and have neither semantic restrictions nor length limits.
1 code implementation • NAACL 2022 • Yanan Wu, Keqing He, Yuanmeng Yan, QiXiang Gao, Zhiyuan Zeng, Fujia Zheng, Lulu Zhao, Huixing Jiang, Wei Wu, Weiran Xu
Detecting Out-of-Domain (OOD) or unknown intents from user queries is essential in a task-oriented dialog system.
1 code implementation • 24 Mar 2025 • Weihao Zeng, Yuzhen Huang, Qian Liu, Wei Liu, Keqing He, Zejun Ma, Junxian He
DeepSeek-R1 has shown that long chain-of-thought (CoT) reasoning can naturally emerge through a simple reinforcement learning (RL) framework with rule-based rewards, where training may start directly from the base models, a paradigm referred to as zero RL training.
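For context, a minimal sketch of the kind of rule-based reward used in zero RL training (hypothetical details: a math-style \boxed{} answer format and fixed penalty values; the paper's exact rules may differ):

```python
import re

def rule_based_reward(completion: str, reference_answer: str) -> float:
    """Scalar reward from string-matching rules alone; no learned reward model."""
    match = re.search(r"\\boxed\{(.+?)\}", completion)  # extract the final answer
    if match is None:
        return -1.0  # unparsable output: format penalty (assumed value)
    answer = match.group(1).strip()
    return 1.0 if answer == reference_answer.strip() else -0.5
```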
no code implementations • 3 Jan 2025 • Dayuan Fu, Keqing He, Yejie Wang, Wentao Hong, Zhuoma Gongque, Weihao Zeng, Wei Wang, Jingang Wang, Xunliang Cai, Weiran Xu
Our analysis shows that the poor generalization ability stems from overfitting to a handful of manually built agent environments and a lack of adaptation to new situations.
no code implementations • 8 Oct 2024 • Siqi Wang, Zhengyu Chen, Bei Li, Keqing He, Min Zhang, Jingang Wang
The scaling of large language models (LLMs) is a critical research area for the efficiency and effectiveness of model training and deployment.
1 code implementation • 5 Sep 2024 • Yejie Wang, Keqing He, Dayuan Fu, Zhuoma Gongque, Heyang Xu, Yanxu Chen, Zhexu Wang, Yujia Fu, Guanting Dong, Muxi Diao, Jingang Wang, Mengdi Zhang, Xunliang Cai, Weiran Xu
Based on our selected data, we present XCoder, a family of models fine-tuned from LLaMA3.
no code implementations • 31 Mar 2024 • Weihao Zeng, Dayuan Fu, Keqing He, Yejie Wang, Yukai Xu, Weiran Xu
Language models pre-trained on general text have achieved impressive results in diverse fields.
no code implementations • 2 Mar 2024 • Weihao Zeng, Keqing He, Yejie Wang, Dayuan Fu, Weiran Xu
Pre-trained language models have been successful in many scenarios.
no code implementations • 27 Feb 2024 • Pei Wang, Keqing He, Yejie Wang, Xiaoshuai Song, Yutao Mou, Jingang Wang, Yunsen Xian, Xunliang Cai, Weiran Xu
Out-of-domain (OOD) intent detection aims to examine whether the user's query falls outside the predefined domain of the system, which is crucial for the proper functioning of task-oriented dialogue (TOD) systems.
1 code implementation • 18 Feb 2024 • Dayuan Fu, Jianzhao Huang, Siyuan Lu, Guanting Dong, Yejie Wang, Keqing He, Weiran Xu
Addressing the disparity between forecasts and actual results can enable individuals to expand their thought processes and stimulate self-reflection, thus promoting accurate planning.
no code implementations • 17 Feb 2024 • Pei Wang, Yejie Wang, Muxi Diao, Keqing He, Guanting Dong, Weiran Xu
In this work, we focus on improving the confidence estimation of large language models.
1 code implementation • 14 Feb 2024 • Yejie Wang, Keqing He, Guanting Dong, Pei Wang, Weihao Zeng, Muxi Diao, Yutao Mou, Mengdi Zhang, Jingang Wang, Xunliang Cai, Weiran Xu
It learns diverse instruction targets and incorporates a code evaluation objective to enhance its code generation ability.
1 code implementation • 13 Feb 2024 • Xiaoshuai Song, Zhengyang Wang, Keqing He, Guanting Dong, Yutao Mou, Jinxu Zhao, Weiran Xu
Knowledge editing (KE) aims to efficiently and precisely modify the behavior of large language models (LLMs) to update specific knowledge without negatively influencing other knowledge.
1 code implementation • 25 Dec 2023 • Wei Liu, Weihao Zeng, Keqing He, Yong Jiang, Junxian He
We present deita (short for Data-Efficient Instruction Tuning for Alignment), a series of models fine-tuned from LLaMA and Mistral models using data samples automatically selected with our proposed approach.
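As a rough illustration of score-then-select data curation in this spirit (the scoring and embedding models are placeholders, and deita's actual selection procedure may differ in its details):

```python
import numpy as np

def select_instruction_data(samples, scores, embeddings, budget, min_dist=0.1):
    """Greedily keep the highest-scoring samples that are not near-duplicates."""
    order = np.argsort(scores)[::-1]               # best-scoring first
    chosen, chosen_embs = [], []
    for i in order:
        if len(chosen) == budget:
            break
        emb = embeddings[i] / np.linalg.norm(embeddings[i])
        # diversity filter: skip if too close (cosine) to an already-chosen sample
        if chosen_embs and max(float(emb @ e) for e in chosen_embs) > 1 - min_dist:
            continue
        chosen.append(samples[i])
        chosen_embs.append(emb)
    return chosen
```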
no code implementations • 20 Oct 2023 • Pei Wang, Keqing He, Yutao Mou, Xiaoshuai Song, Yanan Wu, Jingang Wang, Yunsen Xian, Xunliang Cai, Weiran Xu
Detecting out-of-domain (OOD) intents from user queries is essential for a task-oriented dialogue system.
1 code implementation • 16 Oct 2023 • Guanting Dong, Tingfeng Hui, Zhuoma Gongque, Jinxu Zhao, Daichi Guo, Gang Zhao, Keqing He, Weiran Xu
Recently, prompt-based generative frameworks have shown impressive capabilities in sequence labeling tasks.
1 code implementation • 16 Oct 2023 • Xiaoshuai Song, Yutao Mou, Keqing He, Yueyan Qiu, Pei Wang, Weiran Xu
In a practical dialogue system, users may input out-of-domain (OOD) queries.
1 code implementation • 16 Oct 2023 • Xiaoshuai Song, Keqing He, Pei Wang, Guanting Dong, Yutao Mou, Jingang Wang, Yunsen Xian, Xunliang Cai, Weiran Xu
The tasks of out-of-domain (OOD) intent discovery and generalized intent discovery (GID) aim to extend a closed intent classifier to open-world intent sets, which is crucial to task-oriented dialogue (TOD) systems.
1 code implementation • 10 Oct 2023 • Guanting Dong, Jinxu Zhao, Tingfeng Hui, Daichi Guo, Wenlong Wan, Boqi Feng, Yueyan Qiu, Zhuoma Gongque, Keqing He, Zechen Wang, Weiran Xu
To address these challenges, we propose a unified robustness evaluation framework based on the slot-filling task to systematically evaluate the dialogue understanding capability of LLMs in diverse input perturbation scenarios.
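One example of an input perturbation such a framework might apply (a hypothetical character-swap probe; the paper's actual perturbation types are not reproduced here):

```python
import random

def char_swap_perturb(tokens, slot_labels, rate=0.2, seed=0):
    """Swap adjacent characters in context tokens while keeping slot values intact."""
    rng = random.Random(seed)
    out = []
    for tok, label in zip(tokens, slot_labels):
        if label == "O" and len(tok) > 3 and rng.random() < rate:
            i = rng.randrange(len(tok) - 1)
            tok = tok[:i] + tok[i + 1] + tok[i] + tok[i + 2:]
        out.append(tok)
    return out

# e.g. char_swap_perturb(["please", "play", "shape", "of", "you"],
#                        ["O", "O", "B-song", "I-song", "I-song"])
```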
no code implementations • 5 Oct 2023 • Jiachi Liu, LiWen Wang, Guanting Dong, Xiaoshuai Song, Zechen Wang, Zhengyang Wang, Shanglin Lei, Jinzheng Zhao, Keqing He, Bo Xiao, Weiran Xu
The proposed dataset contains five types of human-annotated noise, all of which occur in real application scenarios, and we integrate extensive robust-training methods for slot filling into the proposed framework.
1 code implementation • 28 Aug 2023 • Guanting Dong, Zechen Wang, Jinxu Zhao, Gang Zhao, Daichi Guo, Dayuan Fu, Tingfeng Hui, Chen Zeng, Keqing He, Xuefeng Li, LiWen Wang, Xinyue Cui, Weiran Xu
The objective of few-shot named entity recognition is to identify named entities with limited labeled instances.
Ranked #1 on Few-shot NER on Few-NERD (INTER)
1 code implementation • 6 Jul 2023 • Xuefeng Li, LiWen Wang, Guanting Dong, Keqing He, Jinzheng Zhao, Hao Lei, Jiachi Liu, Weiran Xu
Zero-shot cross-domain slot filling aims to transfer knowledge from the labeled source domain to the unlabeled target domain.
1 code implementation • 17 Jun 2023 • Weihao Zeng, Lulu Zhao, Keqing He, Ruotong Geng, Jingang Wang, Wei Wu, Weiran Xu
In this paper, we explore the compositional generalization for multi-attribute controllable dialogue generation where a model can learn from seen attribute values and generalize to unseen combinations.
1 code implementation • 17 Jun 2023 • Weihao Zeng, Keqing He, Yejie Wang, Chen Zeng, Jingang Wang, Yunsen Xian, Weiran Xu
Pre-trained language models built on general text have achieved huge success in NLP.
1 code implementation • 11 Jun 2023 • Shicheng Tan, Weng Lam Tam, Yuanchun Wang, Wenwen Gong, Yang Yang, Hongyin Tang, Keqing He, Jiahao Liu, Jingang Wang, Shu Zhao, Peng Zhang, Jie Tang
Currently, the reduction in the parameter scale of large-scale pre-trained language models (PLMs) through knowledge distillation has greatly facilitated their widespread deployment on various devices.
1 code implementation • 28 May 2023 • Yutao Mou, Xiaoshuai Song, Keqing He, Chen Zeng, Pei Wang, Jingang Wang, Yunsen Xian, Weiran Xu
Previous methods suffer from a coupling of pseudo label disambiguation and representation learning: the reliability of pseudo labels relies on representation learning, and representation learning is in turn restricted by the pseudo labels.
no code implementations • 27 Feb 2023 • Guanting Dong, Zechen Wang, LiWen Wang, Daichi Guo, Dayuan Fu, Yuxiang Wu, Chen Zeng, Xuefeng Li, Tingfeng Hui, Keqing He, Xinyue Cui, QiXiang Gao, Weiran Xu
Specifically, we decouple class-specific prototypes and contextual semantic prototypes via two masking strategies, leading the model to focus on two different kinds of semantic information for inference.
no code implementations • 27 Feb 2023 • Daichi Guo, Guanting Dong, Dayuan Fu, Yuxiang Wu, Chen Zeng, Tingfeng Hui, LiWen Wang, Xuefeng Li, Zechen Wang, Keqing He, Xinyue Cui, Weiran Xu
In real dialogue scenarios, existing slot filling models, which tend to memorize entity patterns, generalize significantly worse when facing Out-of-Vocabulary (OOV) problems.
1 code implementation • 21 Dec 2022 • Jiakang Xu, Wolfgang Mayer, Hongyu Zhang, Keqing He, Zaiwen Feng
Therefore, an automatic approach for learning the semantics of a data source is desirable.
1 code implementation • 19 Oct 2022 • Yutao Mou, Pei Wang, Keqing He, Yanan Wu, Jingang Wang, Wei Wu, Weiran Xu
Specifically, we design a K-nearest neighbor contrastive learning (KNCL) objective for representation learning and introduce a KNN-based scoring function for OOD detection.
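A minimal sketch of such a KNN-based OOD score (assumed details: cosine similarity over L2-normalized features, with the negative similarity to the k-th nearest in-domain neighbor as the score; the paper's formulation may differ):

```python
import numpy as np

def knn_ood_score(query_feats, ind_feats, k=5):
    """Higher score => more likely out-of-domain."""
    q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    d = ind_feats / np.linalg.norm(ind_feats, axis=1, keepdims=True)
    sims = q @ d.T                              # cosine similarity to every IND sample
    kth_nearest = np.sort(sims, axis=1)[:, -k]  # similarity to the k-th nearest neighbor
    return -kth_nearest                         # far from IND data => high OOD score
```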
1 code implementation • 17 Oct 2022 • Yutao Mou, Keqing He, Pei Wang, Yanan Wu, Jingang Wang, Wei Wu, Weiran Xu
For the OOD clustering stage, we propose a KCC method to form compact clusters by mining true hard negative samples, which bridges the gap between clustering and representation learning.
1 code implementation • 17 Oct 2022 • Weihao Zeng, Keqing He, Zechen Wang, Dayuan Fu, Guanting Dong, Ruotong Geng, Pei Wang, Jingang Wang, Chaobo Sun, Wei Wu, Weiran Xu
Recent advances in neural approaches have greatly improved task-oriented dialogue (TOD) systems, which assist users in accomplishing their goals.
no code implementations • 17 Oct 2022 • Yanan Wu, Zhiyuan Zeng, Keqing He, Yutao Mou, Pei Wang, Yuanmeng Yan, Weiran Xu
In this paper, we propose a simple but strong energy-based score function for OOD detection, in which the energy scores of OOD samples are higher than those of IND samples.
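For reference, the standard energy score computed from classifier logits (following Liu et al., 2020; any extra terms the paper adds are not shown):

```python
import numpy as np

def energy_score(logits, temperature=1.0):
    """E(x) = -T * logsumexp(f(x) / T); OOD inputs tend to get higher energy."""
    z = logits / temperature
    m = z.max(axis=1, keepdims=True)                      # stabilize the logsumexp
    lse = m.squeeze(1) + np.log(np.exp(z - m).sum(axis=1))
    return -temperature * lse
```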
1 code implementation • COLING 2022 • Yanan Wu, Zhiyuan Zeng, Keqing He, Yutao Mou, Pei Wang, Weiran Xu
Out-of-Domain (OOD) detection is a key component in a task-oriented dialog system, which aims to identify whether a query falls outside the predefined supported intent set.
1 code implementation • COLING 2022 • Yutao Mou, Keqing He, Yanan Wu, Pei Wang, Jingang Wang, Wei Wu, Yi Huang, Junlan Feng, Weiran Xu
Traditional intent classification models are based on a pre-defined intent set and only recognize limited in-domain (IND) intent classes.
no code implementations • 31 Aug 2022 • Keqing He, Jingang Wang, Chaobo Sun, Wei Wu
In this paper, we propose a novel unified knowledge prompt pre-training framework, UFA (Unified Model For All Tasks), for customer service dialogues.
no code implementations • COLING 2022 • Guanting Dong, Daichi Guo, LiWen Wang, Xuefeng Li, Zechen Wang, Chen Zeng, Keqing He, Jinzheng Zhao, Hao Lei, Xinyue Cui, Yi Huang, Junlan Feng, Weiran Xu
Most existing slot filling models tend to memorize inherent patterns of entities and corresponding contexts from training data.
1 code implementation • NAACL 2022 • Lulu Zhao, Fujia Zheng, Weihao Zeng, Keqing He, Weiran Xu, Huixing Jiang, Wei Wu, Yanan Wu
The most advanced abstractive dialogue summarizers lack generalization ability on new domains, and existing research on domain adaptation in summarization generally relies on large-scale pre-training.
no code implementations • 25 Oct 2021 • Lulu Zhao, Fujia Zheng, Keqing He, Weihao Zeng, Yuejie Lei, Huixing Jiang, Wei Wu, Weiran Xu, Jun Guo, Fanyu Meng
Previous dialogue summarization datasets mainly focus on open-domain chitchat dialogues, while summarization datasets for the widely used task-oriented dialogue have not yet been explored.
1 code implementation • EMNLP 2021 • LiWen Wang, Xuefeng Li, Jiachi Liu, Keqing He, Yuanmeng Yan, Weiran Xu
Zero-shot cross-domain slot filling alleviates data dependence when data is scarce in the target domain, and has attracted extensive research.
1 code implementation • NAACL 2021 • LiWen Wang, Yuanmeng Yan, Keqing He, Yanan Wu, Weiran Xu
In this paper, we propose an adversarial disentangled debiasing model to dynamically decouple social bias attributes from the intermediate representations trained on the main task.
1 code implementation • NAACL 2021 • Zhiyuan Zeng, Keqing He, Yuanmeng Yan, Hong Xu, Weiran Xu
Detecting out-of-domain (OOD) intents is crucial for the deployed task-oriented dialogue system.
1 code implementation • ACL 2021 • Zhiyuan Zeng, Keqing He, Yuanmeng Yan, Zijun Liu, Yanan Wu, Hong Xu, Huixing Jiang, Weiran Xu
Detecting Out-of-Domain (OOD) or unknown intents from user queries is essential in a task-oriented dialog system.
1 code implementation • ACL 2021 • Yanan Wu, Zhiyuan Zeng, Keqing He, Hong Xu, Yuanmeng Yan, Huixing Jiang, Weiran Xu
Existing slot filling models can only recognize pre-defined in-domain slot types from a limited slot set.
no code implementations • COLING 2020 • Keqing He, Shuyu Lei, Yushu Yang, Huixing Jiang, Zhongyuan Wang
Slot filling and intent detection are two major tasks for spoken language understanding.
no code implementations • COLING 2020 • Hong Xu, Keqing He, Yuanmeng Yan, Sihong Liu, Zijun Liu, Weiran Xu
Detecting out-of-domain (OOD) input intents is critical in the task-oriented dialog system.
no code implementations • COLING 2020 • Keqing He, Jinchao Zhang, Yuanmeng Yan, Weiran Xu, Cheng Niu, Jie zhou
In this paper, we propose a Contrastive Zero-Shot Learning with Adversarial Attack (CZSL-Adv) method for cross-domain slot filling.
no code implementations • ACL 2020 • Keqing He, Yuanmeng Yan, Weiran Xu
Neural-based context-aware models for slot tagging have achieved state-of-the-art performance.
1 code implementation • 25 Apr 2020 • Liwei Huang, Yutao Ma, Yanbo Liu, Keqing He
In particular, DAN-SNR uses the self-attention mechanism, rather than a recurrent neural network architecture, to model sequential influence and social influence in a unified manner.
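For context, the textbook scaled dot-product self-attention that DAN-SNR substitutes for recurrence (a generic single-head sketch, not the paper's exact layer):

```python
import numpy as np

def self_attention(X):
    """X: (seq_len, d) sequence representations; every position attends to all."""
    d = X.shape[1]
    scores = X @ X.T / np.sqrt(d)                        # pairwise compatibility
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)        # row-wise softmax
    return weights @ X                                   # context-mixed outputs
```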