no code implementations • 10 Apr 2025 • Yichun Yin, Wenyong Huang, Kaikai Song, Yehui Tang, Xueyu Wu, Wei Guo, Peng Guo, Yaoyuan Wang, Xiaojun Meng, Yasheng Wang, Dong Li, Can Chen, Dandan Tu, Yin Li, Fisher Yu, Ruiming Tang, Yunhe Wang, Baojun Wang, Bin Wang, Bo wang, Boxiao Liu, Changzheng Zhang, Duyu Tang, Fei Mi, Hui Jin, Jiansheng Wei, Jiarui Qin, Jinpeng Li, Jun Zhao, Liqun Deng, Lin Li, Minghui Xu, Naifu Zhang, Nianzu Zheng, Qiang Li, Rongju Ruan, Shengjun Cheng, Tianyu Guo, wei he, Wei Li, Weiwen Liu, Wulong Liu, Xinyi Dai, Yonghan Dong, Yu Pan, Yue Li, YuFei Wang, YuJun Li, Yunsheng Ni, Zhe Liu, Zhenhe Zhang, Zhicheng Liu
Our exploration demonstrates that Ascend NPUs are capable of efficiently and effectively training dense models with more than 100 billion parameters.
no code implementations • 12 Mar 2025 • Boyang Xue, Qi Zhu, Hongru Wang, Rui Wang, Sheng Wang, Hongling Xu, Fei Mi, Yasheng Wang, Lifeng Shang, Qun Liu, Kam-Fai Wong
Present Large Language Models (LLM) self-training methods always under-sample on challenging queries, leading to inadequate learning on difficult problems which limits LLMs' ability.
1 code implementation • 16 Dec 2024 • Boyang Xue, Fei Mi, Qi Zhu, Hongru Wang, Rui Wang, Sheng Wang, Erxin Yu, Xuming Hu, Kam-Fai Wong
Despite demonstrating impressive capabilities, Large Language Models (LLMs) still often struggle to accurately express the factual knowledge they possess, especially in cases where the LLMs' knowledge boundaries are ambiguous.
1 code implementation • 25 Jun 2024 • Erxin Yu, Jing Li, Ming Liao, Siqi Wang, Zuchen Gao, Fei Mi, Lanqing Hong
To the best of our knowledge, we are the first to study LLM safety in multi-turn dialogue coreference.
no code implementations • 12 Jun 2024 • Yiwei Li, Fei Mi, Yitong Li, Yasheng Wang, Bin Sun, Shaoxiong Feng, Kan Li
In DDS, both sequence-level and token-level adaptive search can be achieved to adjust the decoding process in a unified framework.
no code implementations • 1 May 2024 • Zhili Liu, Yunhao Gou, Kai Chen, Lanqing Hong, Jiahui Gao, Fei Mi, Yu Zhang, Zhenguo Li, Xin Jiang, Qun Liu, James T. Kwok
As the capabilities of large language models (LLMs) have expanded dramatically, aligning these models with human values presents a significant challenge.
no code implementations • 5 Mar 2024 • Rui Wang, Fei Mi, Yi Chen, Boyang Xue, Hongru Wang, Qi Zhu, Kam-Fai Wong, Ruifeng Xu
2) Role Prompting assigns a central prompt to the general domain and a unique role prompt to each specific domain to minimize inter-domain confusion during training.
no code implementations • 26 Feb 2024 • Hongru Wang, Boyang Xue, Baohang Zhou, Rui Wang, Fei Mi, Weichao Wang, Yasheng Wang, Kam-Fai Wong
Conversational retrieval refers to an information retrieval system that operates in an iterative and interactive manner, requiring the retrieval of various external resources, such as persona, knowledge, and even response, to effectively engage with the user and successfully complete the dialogue.
no code implementations • 28 Jan 2024 • Jianqiao Lu, Wanjun Zhong, YuFei Wang, Zhijiang Guo, Qi Zhu, Wenyong Huang, Yanlin Wang, Fei Mi, Baojun Wang, Yasheng Wang, Lifeng Shang, Xin Jiang, Qun Liu
With the teacher's guidance, the student learns to iteratively refine its answer with feedback, and forms a robust and comprehensive understanding of the posed questions.
no code implementations • 24 Jan 2024 • Hongru Wang, WenYu Huang, Yang Deng, Rui Wang, Zezhong Wang, YuFei Wang, Fei Mi, Jeff Z. Pan, Kam-Fai Wong
To better plan and incorporate the use of multiple sources in generating personalized response, we firstly decompose it into three sub-tasks: Knowledge Source Selection, Knowledge Retrieval, and Response Generation.
1 code implementation • 4 Dec 2023 • Zige Wang, Wanjun Zhong, YuFei Wang, Qi Zhu, Fei Mi, Baojun Wang, Lifeng Shang, Xin Jiang, Qun Liu
This survey aims to provide a comprehensive overview of current research in data management within both the pretraining and supervised fine-tuning stages of LLMs, covering various aspects of data management strategy design.
1 code implementation • 15 Nov 2023 • Zhexin Zhang, Junxiao Yang, Pei Ke, Fei Mi, Hongning Wang, Minlie Huang
While significant attention has been dedicated to exploiting weaknesses in LLMs through jailbreaking attacks, there remains a paucity of effort in defending against these attacks.
1 code implementation • 31 Oct 2023 • Yuxin Jiang, YuFei Wang, Xingshan Zeng, Wanjun Zhong, Liangyou Li, Fei Mi, Lifeng Shang, Xin Jiang, Qun Liu, Wei Wang
To fill this research gap, in this paper, we propose FollowBench, a Multi-level Fine-grained Constraints Following Benchmark for LLMs.
no code implementations • 16 Oct 2023 • Kai Chen, Chunwei Wang, Kuo Yang, Jianhua Han, Lanqing Hong, Fei Mi, Hang Xu, Zhengying Liu, Wenyong Huang, Zhenguo Li, Dit-yan Yeung, Lifeng Shang, Xin Jiang, Qun Liu
The rapid development of large language models (LLMs) has not only provided numerous opportunities but also presented significant challenges.
no code implementations • 13 Oct 2023 • Hongru Wang, Minda Hu, Yang Deng, Rui Wang, Fei Mi, Weichao Wang, Yasheng Wang, Wai-Chung Kwan, Irwin King, Kam-Fai Wong
Open-domain dialogue system usually requires different sources of knowledge to generate more informative and evidential responses.
1 code implementation • 12 Oct 2023 • Boyang Xue, Weichao Wang, Hongru Wang, Fei Mi, Rui Wang, Yasheng Wang, Lifeng Shang, Xin Jiang, Qun Liu, Kam-Fai Wong
Inspired by previous work which identified that feed-forward networks (FFNs) within Transformers are responsible for factual knowledge expressions, we investigate two methods to efficiently improve the factual expression capability {of FFNs} by knowledge enhancement and alignment respectively.
no code implementations • 1 Oct 2023 • Jianqiao Lu, Wanjun Zhong, Wenyong Huang, YuFei Wang, Qi Zhu, Fei Mi, Baojun Wang, Weichao Wang, Xingshan Zeng, Lifeng Shang, Xin Jiang, Qun Liu
SELF initiates with a meta-skill learning process that equips the LLMs with capabilities for self-feedback and self-refinement.
no code implementations • 28 Sep 2023 • Hongru Wang, Huimin Wang, Lingzhi Wang, Minda Hu, Rui Wang, Boyang Xue, Hongyuan Lu, Fei Mi, Kam-Fai Wong
Large language models (LLMs) have demonstrated exceptional performance in planning the use of various functional tools, such as calculators and retrievers, particularly in question-answering tasks.
1 code implementation • 24 Jul 2023 • YuFei Wang, Wanjun Zhong, Liangyou Li, Fei Mi, Xingshan Zeng, Wenyong Huang, Lifeng Shang, Xin Jiang, Qun Liu
(2) Training methodologies: a detailed review of the prevailing training methods employed for LLM alignment.
1 code implementation • 13 Jul 2023 • Pei Ke, Fei Huang, Fei Mi, Yasheng Wang, Qun Liu, Xiaoyan Zhu, Minlie Huang
Existing evaluation metrics for natural language generation (NLG) tasks face the challenges on generalization ability and interpretability.
1 code implementation • 23 May 2023 • Haoqin Tu, Yitong Li, Fei Mi, Zhongliang Yang
To demonstrate the superiority and universality of the provided visual knowledge, we propose a simple but effective framework ReSee to add visual representation into vanilla dialogue models by modality concatenations.
1 code implementation • 23 May 2023 • Rui Wang, Hongru Wang, Fei Mi, Yi Chen, Boyang Xue, Kam-Fai Wong, Ruifeng Xu
Numerous works are proposed to align large language models (LLMs) with human intents to better fulfill instructions, ensuring they are trustful and helpful.
2 code implementations • 19 May 2023 • Hongru Wang, Rui Wang, Fei Mi, Yang Deng, Zezhong Wang, Bin Liang, Ruifeng Xu, Kam-Fai Wong
Large Language Models (LLMs), such as \texttt{ChatGPT}, greatly empower dialogue systems with strong language understanding and generation capabilities.
1 code implementation • 21 Dec 2022 • Hao Sun, Zhexin Zhang, Fei Mi, Yasheng Wang, Wei Liu, Jianwei Cui, Bin Wang, Qun Liu, Minlie Huang
In this paper, we propose a framework, MoralDial to train and evaluate moral dialogue systems.
no code implementations • 4 Dec 2022 • Qi Zhu, Fei Mi, Zheng Zhang, Yasheng Wang, Yitong Li, Xin Jiang, Qun Liu, Xiaoyan Zhu, Minlie Huang
For the former, the grounding knowledge consists of keywords extracted from the response.
1 code implementation • 4 Dec 2022 • Zhexin Zhang, Jiale Cheng, Hao Sun, Jiawen Deng, Fei Mi, Yasheng Wang, Lifeng Shang, Minlie Huang
In order to detect such toxic generations, existing methods rely on templates, real-world data extraction, crowdsourcing workers, or automatic generation to construct adversarial contexts that are likely to induce toxic generations.
no code implementations • 2 Dec 2022 • Bin Sun, Yitong Li, Fei Mi, Weichao Wang, Yiwei Li, Kan Li
Specifically, HLV constrains the global semantics of responses through discrete latent variables and enriches responses with continuous latent variables.
no code implementations • 1 Dec 2022 • Bin Sun, Shaoxiong Feng, Yiwei Li, Weichao Wang, Fei Mi, Yitong Li, Kan Li
Complex dialogue mappings (CDM), including one-to-many and many-to-one mappings, tend to make dialogue models generate incoherent or dull responses, and modeling these mappings remains a huge challenge for neural dialogue systems.
no code implementations • 1 Sep 2022 • Jiatong Li, Bin He, Fei Mi
In order to expand the information that PLMs can utilize, we encode topic and dialogue history information using certain prompts with multiple channels of Fusion-in-Decoder (FiD) and explore the influence of three different channel settings.
2 code implementations • 31 Mar 2022 • Fei Mi, Yitong Li, Yulong Zeng, Jingyan Zhou, Yasheng Wang, Chuanfei Xu, Lifeng Shang, Xin Jiang, Shiqi Zhao, Qun Liu
We investigate different aspects of responses generated by PanGu-Bot, including response quality, knowledge, and safety.
1 code implementation • ACL 2022 • Qi Zhu, Bing Li, Fei Mi, Xiaoyan Zhu, Minlie Huang
A desirable dialog system should be able to continually learn new skills without forgetting old ones, and thereby adapt to new domains or tasks in its life cycle.
no code implementations • Findings (ACL) 2022 • Xin Wang, Yasheng Wang, Yao Wan, Fei Mi, Yitong Li, Pingyi Zhou, Jin Liu, Hao Wu, Xin Jiang, Qun Liu
Automatically generating compilable programs with (or without) natural language descriptions has always been a touchstone problem for computational linguistics and automated software engineering.
no code implementations • 16 Feb 2022 • Jingyan Zhou, Jiawen Deng, Fei Mi, Yitong Li, Yasheng Wang, Minlie Huang, Xin Jiang, Qun Liu, Helen Meng
The research of open-domain dialog systems has been greatly prospered by neural models trained on large-scale corpora, however, such corpora often introduce various safety problems (e. g., offensive languages, biases, and toxic behaviors) that significantly hinder the deployment of dialog systems in practice.
no code implementations • COLING 2022 • Yihe Wang, Yitong Li, Yasheng Wang, Fei Mi, Pingyi Zhou, Xin Wang, Jin Liu, Xin Jiang, Qun Liu
Experiments over publicly available datasets demonstrate that our method can help models generate better responses, even such training data are usually impressed as low-quality data.
1 code implementation • 16 Jan 2022 • Jiawen Deng, Jingyan Zhou, Hao Sun, Chujie Zheng, Fei Mi, Helen Meng, Minlie Huang
To this end, we propose a benchmark --COLD for Chinese offensive language analysis, including a Chinese Offensive Language Dataset --COLDATASET and a baseline detector --COLDETECTOR which is trained on the dataset.
no code implementations • Findings (NAACL) 2022 • Mengjie Zhao, Fei Mi, Yasheng Wang, Minglei Li, Xin Jiang, Qun Liu, Hinrich Schütze
We propose LMTurk, a novel approach that treats few-shot learners as crowdsourcing workers.
no code implementations • dialdoc (ACL) 2022 • Xinyan Zhao, Bin He, Yasheng Wang, Yitong Li, Fei Mi, Yajiao Liu, Xin Jiang, Qun Liu, Huanhuan Chen
With the advances in deep learning, tremendous progress has been made with chit-chat dialogue systems and task-oriented dialogue systems.
no code implementations • 10 Sep 2021 • Fei Mi, Yitong Li, Yasheng Wang, Xin Jiang, Qun Liu
As labeling cost for different modules in task-oriented dialog (ToD) systems is high, a major challenge in practice is to learn different tasks with the least amount of labeled data.
no code implementations • 28 Aug 2021 • Fei Mi, Tao Lin, Boi Faltings
In this paper, we consider scenarios that require learning new classes or data distributions quickly and incrementally over time, as it often occurs in real-world dynamic environments.
1 code implementation • EMNLP 2021 • Fei Mi, Wanhao Zhou, Fengyu Cai, Lingjing Kong, Minlie Huang, Boi Faltings
In this paper, we devise a self-training approach to utilize the abundant unlabeled dialog data to further improve state-of-the-art pre-trained models in few-shot learning scenarios for ToD systems.
1 code implementation • 26 Aug 2021 • Fengyu Cai, Wanhao Zhou, Fei Mi, Boi Faltings
Utterance-level intent detection and token-level slot filling are two key tasks for natural language understanding (NLU) in task-oriented systems.
Ranked #3 on
Slot Filling
on MixSNIPS
no code implementations • 10 Aug 2021 • Xin Wang, Yasheng Wang, Fei Mi, Pingyi Zhou, Yao Wan, Xiao Liu, Li Li, Hao Wu, Jin Liu, Xin Jiang
Code representation learning, which aims to encode the semantics of source code into distributed vectors, plays an important role in recent deep-learning-based models for code intelligence.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Fei Mi, LiangWei Chen, Mengjie Zhao, Minlie Huang, Boi Faltings
Natural language generation (NLG) is an essential component of task-oriented dialog systems.
1 code implementation • 23 Jul 2020 • Fei Mi, Xiaoyu Lin, Boi Faltings
In this case, the recommender is updated continually and periodically with new data that arrives in each update cycle, and the updated model needs to provide recommendations for user activities before the next model update.
no code implementations • 28 Apr 2020 • Fei Mi, Boi Faltings
We empirically show that MAN is well-suited for the incremental SR task, and it consistently outperforms state-of-the-art neural and nonparametric methods.
no code implementations • EMNLP 2020 • Mengjie Zhao, Tao Lin, Fei Mi, Martin Jaggi, Hinrich Schütze
We present an efficient method of utilizing pretrained language models, where we learn selective binary masks for pretrained weights in lieu of modifying them through finetuning.
no code implementations • 14 May 2019 • Fei Mi, Minlie Huang, Jiyong Zhang, Boi Faltings
Natural language generation (NLG) is an essential component of task-oriented dialogue systems.
1 code implementation • 10 Jun 2018 • Fei Mi, Boi Faltings
Therefore, recommendations need to be adaptive to such frequent changes.
Information Retrieval
1 code implementation • 22 Jun 2017 • Chaitanya K. Joshi, Fei Mi, Boi Faltings
The main goal of modeling human conversation is to create agents which can interact with people in both open-ended and goal-oriented scenarios.