1 code implementation • 19 Jan 2024 • Peiwen Yuan, Xinglin Wang, Shaoxiong Feng, Boyuan Pan, Yiwei Li, HeDa Wang, Xupeng Miao, Kan Li
A memorizing-free matching mechanism from Dense Retrieval (DR) is then introduced to conduct fine-grained intra-cluster matching from clusters to relevant documents.
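As a rough illustration of the dense-retrieval matching named above, the sketch below scores documents within a candidate cluster by embedding dot products rather than memorized document identifiers. The toy `embed` function and all names are hypothetical stand-ins, not the paper's components.

```python
# Hypothetical sketch of memorizing-free intra-cluster matching via dense
# retrieval: rank documents in a cluster by embedding similarity to a query.
import hashlib
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    # Toy deterministic "encoder" (stand-in for a trained dual encoder).
    seed = int(hashlib.md5(text.encode()).hexdigest(), 16) % (2**32)
    v = np.random.default_rng(seed).standard_normal(dim)
    return v / np.linalg.norm(v)

def match_within_cluster(query: str, cluster_docs: list[str], top_k: int = 3):
    q = embed(query)
    doc_vecs = np.stack([embed(d) for d in cluster_docs])
    scores = doc_vecs @ q                       # dense dot-product relevance
    order = np.argsort(-scores)[:top_k]
    return [(cluster_docs[i], float(scores[i])) for i in order]

print(match_within_cluster("neural retrieval", ["doc a", "doc b", "doc c"]))
```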
1 code implementation • 19 Jan 2024 • Yiwei Li, Peiwen Yuan, Shaoxiong Feng, Boyuan Pan, Xinglin Wang, Bin Sun, HeDa Wang, Kan Li
Self-consistency (SC) has been a widely used decoding strategy for chain-of-thought reasoning.
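Self-consistency itself is simple to sketch: sample several chain-of-thought completions and majority-vote over their final answers. In the minimal sketch below, `sample_chain_of_thought` is a hypothetical stand-in for a sampled LLM call.

```python
# Minimal sketch of self-consistency (SC) decoding: sample multiple
# reasoning paths and return the most frequent final answer.
from collections import Counter
import random

def sample_chain_of_thought(question: str) -> str:
    # Stand-in sampler; a real system would call a model with temperature > 0.
    return random.choice(["4", "4", "5"])  # noisy answers to one question

def self_consistency(question: str, n_samples: int = 10) -> str:
    answers = [sample_chain_of_thought(question) for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]  # majority vote wins

print(self_consistency("What is 2 + 2?"))
```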
1 code implementation • 31 Dec 2023 • Peiwen Yuan, Shaoxiong Feng, Yiwei Li, Xinglin Wang, Boyuan Pan, HeDa Wang, Kan Li
Significant progress has been made in automatic text evaluation with the introduction of large language models (LLMs) as evaluators.
1 code implementation • 20 Dec 2023 • Yiwei Li, Peiwen Yuan, Shaoxiong Feng, Boyuan Pan, Bin Sun, Xinglin Wang, HeDa Wang, Kan Li
In this work, we illustrate the merit of negative data and propose a model specialization framework to distill LLMs with negative samples in addition to positive ones.
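One plausible reading of distilling with negative samples is to pair standard likelihood training on positive samples with an unlikelihood-style penalty on negative ones. The loss below is an illustrative sketch under that assumption, not the paper's exact objective.

```python
# Illustrative sketch (not the paper's objective): maximize likelihood on
# positive samples while penalizing probability mass on negative tokens.
import torch
import torch.nn.functional as F

def distill_loss(logits_pos, targets_pos, logits_neg, targets_neg, alpha=0.5):
    # logits_*: (batch, seq, vocab); targets_*: (batch, seq) token ids
    nll_pos = F.cross_entropy(logits_pos.flatten(0, 1), targets_pos.flatten())
    # Unlikelihood on negatives: push down log(1 - p(token)) for bad tokens.
    p_neg = logits_neg.softmax(-1).gather(-1, targets_neg.unsqueeze(-1))
    unlikelihood = -torch.log1p(-p_neg.clamp(max=1 - 1e-6)).mean()
    return nll_pos + alpha * unlikelihood
```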
no code implementations • 21 Mar 2023 • Yiwei Li, Shaoxiong Feng, Bin Sun, Kan Li
Collaborative learning, also known as online knowledge distillation, is an effective way to conduct one-stage group distillation in the absence of a well-trained large teacher model.
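Online knowledge distillation can be sketched as a group of peers training together, each fitting the labels while also matching the other peers' softened predictions. The mutual-learning-style loss below is a common formulation of this idea, not necessarily this paper's exact method.

```python
# Sketch of one-stage online knowledge distillation (mutual learning style):
# each peer minimizes cross-entropy plus KL to every other peer's soft output.
import torch
import torch.nn.functional as F

def peer_losses(logits_list, labels, temperature=2.0, beta=1.0):
    losses = []
    for i, logits in enumerate(logits_list):
        ce = F.cross_entropy(logits, labels)
        kd = 0.0
        for j, peer in enumerate(logits_list):
            if j == i:
                continue
            # KL divergence on temperature-softened distributions.
            kd = kd + F.kl_div(
                F.log_softmax(logits / temperature, dim=-1),
                F.softmax(peer.detach() / temperature, dim=-1),
                reduction="batchmean",
            ) * temperature**2
        losses.append(ce + beta * kd / max(len(logits_list) - 1, 1))
    return losses
```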
no code implementations • 1 Dec 2022 • Bin Sun, Shaoxiong Feng, Yiwei Li, Weichao Wang, Fei Mi, Yitong Li, Kan Li
Complex dialogue mappings (CDM), including one-to-many and many-to-one mappings, tend to make dialogue models generate incoherent or dull responses, and modeling these mappings remains a huge challenge for neural dialogue systems.
no code implementations • 23 May 2022 • Yiwei Li, Bin Sun, Shaoxiong Feng, Kan Li
However, the discarded samples may score highly from other perspectives and can have a regularizing effect on model learning, which makes the performance improvement sensitive to the filtering ratio.
no code implementations • NAACL 2022 • Yiwei Li, Shaoxiong Feng, Bin Sun, Kan Li
Generative dialogue models suffer badly from the generic response problem, limiting their applications to a few toy scenarios.
no code implementations • Findings (ACL) 2022 • Shaoxiong Feng, Xuancheng Ren, Kan Li, Xu Sun
However, as online chit-chat scenarios continually increase, directly fine-tuning these models for each new task not only explodes the capacity of the dialogue system on embedded devices but also causes knowledge forgetting in the pre-trained models and knowledge interference among diverse dialogue tasks.
no code implementations • ACL 2021 • Bin Sun, Shaoxiong Feng, Yiwei Li, Jiamou Liu, Kan Li
Conditional Variational AutoEncoder (CVAE) effectively increases the diversity and informativeness of responses in open-ended dialogue generation tasks by enriching the context vector with sampled latent variables.
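The mechanism named here, enriching the context vector with a sampled latent, can be sketched as a recognition network producing a Gaussian posterior, a reparameterized sample, and concatenation with the context before decoding. Module names and dimensions below are illustrative, assuming a standard CVAE setup.

```python
# Sketch of CVAE context enrichment: map the dialogue context to a Gaussian,
# sample z via the reparameterization trick, and concatenate z with the
# context vector before decoding. Sizes and names are illustrative.
import torch
import torch.nn as nn

class CVAEContext(nn.Module):
    def __init__(self, ctx_dim=256, latent_dim=64):
        super().__init__()
        self.to_mu = nn.Linear(ctx_dim, latent_dim)
        self.to_logvar = nn.Linear(ctx_dim, latent_dim)

    def forward(self, context):                  # context: (batch, ctx_dim)
        mu, logvar = self.to_mu(context), self.to_logvar(context)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        enriched = torch.cat([context, z], dim=-1)  # fed to the decoder
        return enriched, mu, logvar

ctx = torch.randn(4, 256)
enriched, mu, logvar = CVAEContext()(ctx)
print(enriched.shape)  # torch.Size([4, 320])
```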
no code implementations • 28 May 2021 • Bin Sun, Shaoxiong Feng, Yiwei Li, Jiamou Liu, Kan Li
In this work, we propose a conversation model named "THINK" (Teamwork generation Hover around Impressive Noticeable Keywords) that makes the decoder more sophisticated so that it avoids generating duplicated and self-contradicting responses.
no code implementations • 22 Feb 2021 • Shaoxiong Feng, Xuancheng Ren, Kan Li, Xu Sun
Finding general knowledge is further hindered by unidirectional distillation: the student must obey the teacher and may discard knowledge that is truly general but is refuted by the teacher.
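For contrast with the mutual-learning sketch earlier in this list, standard unidirectional distillation fixes the teacher and lets gradients flow only to the student, which is the constraint this excerpt criticizes. The formulation below is the textbook KD loss, not this paper's proposed remedy.

```python
# Standard unidirectional knowledge distillation: the frozen teacher's
# softened distribution is the target; only the student is updated.
import torch
import torch.nn.functional as F

def unidirectional_kd(student_logits, teacher_logits, labels, T=2.0, a=0.5):
    ce = F.cross_entropy(student_logits, labels)
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits.detach() / T, dim=-1),  # teacher is fixed
        reduction="batchmean",
    ) * T**2
    return (1 - a) * ce + a * kd  # the student must follow the teacher
```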
no code implementations • EMNLP 2020 • Shaoxiong Feng, Xuancheng Ren, Hongshen Chen, Bin Sun, Kan Li, Xu Sun
Human dialogues are scenario-based and appropriate responses generally relate to the latent context knowledge entailed by the specific scenario.
no code implementations • 16 Sep 2020 • Shaoxiong Feng, Hongshen Chen, Xuancheng Ren, Zhuoye Ding, Kan Li, Xu Sun
Collaborative learning has successfully applied knowledge transfer to guide a pool of small student networks towards robust local minima.
no code implementations • 4 Mar 2020 • Shaoxiong Feng, Hongshen Chen, Kan Li, Dawei Yin
Neural conversational models learn to generate responses by taking into account the dialog history.