1 code implementation • EMNLP 2020 • Wanwei He, Min Yang, Rui Yan, Chengming Li, Ying Shen, Ruifeng Xu
Instead of adopting the classic student-teacher paradigm of forcing the student network's output to exactly mimic the soft targets produced by the teacher networks, we introduce two discriminators, as in a generative adversarial network (GAN), to transfer knowledge from the two teachers to the student.
Ranked #5 on Task-Oriented Dialogue Systems on KVRET
Generative Adversarial Network • Task-Oriented Dialogue Systems
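The adversarial transfer described above can be sketched in miniature: a logistic-regression discriminator learns to tell teacher outputs (label 1) from student outputs (label 0), and the student minimizes a non-saturating GAN loss against it. This is a toy numpy sketch under stated assumptions, not the paper's architecture; `discriminator_step` and `student_adversarial_loss` are illustrative names.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def discriminator_step(w, teacher_out, student_out, lr=0.5):
    # One logistic-regression update: teacher outputs are "real" (label 1),
    # student outputs are "fake" (label 0).
    X = np.vstack([teacher_out, student_out])
    y = np.concatenate([np.ones(len(teacher_out)), np.zeros(len(student_out))])
    p = sigmoid(X @ w)
    grad = X.T @ (p - y) / len(y)
    return w - lr * grad

def student_adversarial_loss(w, student_out):
    # Non-saturating GAN loss: the student is rewarded when the
    # discriminator scores its outputs as teacher-like.
    return -np.mean(np.log(sigmoid(student_out @ w) + 1e-9))
```

In the full setup the student would be updated to lower this adversarial loss while the discriminator keeps being refreshed, alternating the two steps.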
1 code implementation • 25 Mar 2024 • Feiteng Fang, Liang Zhu, Min Yang, Xi Feng, Jinchang Hou, Qixuan Zhao, Chengming Li, Xiping Hu, Ruifeng Xu
Reinforcement learning from human feedback (RLHF) is a crucial technique in aligning large language models (LLMs) with human preferences, ensuring these LLMs behave in beneficial and comprehensible ways to users.
1 code implementation • 26 Feb 2024 • Shiwen Ni, Minghuan Tan, Yuelin Bai, Fuqiang Niu, Min Yang, BoWen Zhang, Ruifeng Xu, Xiaojun Chen, Chengming Li, Xiping Hu, Ye Li, Jianping Fan
In this paper, we contribute a new benchmark, the first Multilingual-oriented quiZ on Intellectual Property (MoZIP), for the evaluation of LLMs in the IP domain.
no code implementations • 26 Feb 2024 • Shiwen Ni, Min Yang, Ruifeng Xu, Chengming Li, Xiping Hu
To solve the inconsistency between training and inference caused by the randomness of dropout, some studies use consistency training to regularize dropout at the output layer.
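Consistency training over dropout, as mentioned above, can be sketched as two stochastic forward passes through the same weights, penalized by the symmetric KL divergence between their output distributions. A minimal numpy sketch with hypothetical helper names:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def dropout(x, p, rng):
    # Inverted dropout: zero units with probability p, rescale survivors.
    if p == 0.0:
        return x
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)

def consistency_loss(hidden, w, p, rng):
    # Two stochastic forward passes through the same weights; the
    # regularizer is the symmetric KL between the two output distributions.
    p1 = softmax(dropout(hidden, p, rng) @ w)
    p2 = softmax(dropout(hidden, p, rng) @ w)
    kl = lambda a, b: np.sum(a * (np.log(a) - np.log(b)), axis=-1)
    return 0.5 * np.mean(kl(p1, p2) + kl(p2, p1))
```

With dropout disabled (`p=0`) the two passes coincide and the loss is exactly zero, which is the inconsistency the regularizer drives the model toward.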
no code implementations • 10 Feb 2024 • Zhibo Chu, Shiwen Ni, Zichong Wang, Xi Feng, Chengming Li, Xiping Hu, Ruifeng Xu, Min Yang, Wenbin Zhang
Language models serve as a cornerstone in natural language processing (NLP), utilizing mathematical methods to generalize language laws and knowledge for prediction and generation.
1 code implementation • 29 Jan 2024 • Jinchang Hou, Chang Ao, Haihong Wu, Xiangtao Kong, Zhigang Zheng, Daijia Tang, Chengming Li, Xiping Hu, Ruifeng Xu, Shiwen Ni, Min Yang
The integration of LLMs into education is becoming ever closer; however, there is currently no benchmark for evaluating LLMs that focuses on the Chinese K-12 education domain.
no code implementations • 14 Nov 2023 • Shiwen Ni, Dingwei Chen, Chengming Li, Xiping Hu, Ruifeng Xu, Min Yang
In this paper, we propose a new paradigm for fine-tuning called F-Learning (Forgetting before Learning), which employs parametric arithmetic to facilitate the forgetting of old knowledge and learning of new knowledge.
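One plausible reading of "parametric arithmetic" forgetting is to subtract a scaled copy of the old-knowledge parameter delta (old fine-tuned weights minus pretrained weights) before fine-tuning on the new knowledge. A hedged sketch; `forget_then_learn_init` is a hypothetical name, and the exact update rule in the paper may differ.

```python
import numpy as np

def forget_then_learn_init(pretrained, old_finetuned, lam=1.0):
    # The delta (old_finetuned - pretrained) is treated as the parametric
    # encoding of the old knowledge; subtracting a scaled copy of it
    # approximates "forgetting" before the model is fine-tuned again on
    # new knowledge.  Parameters are dicts of name -> array.
    return {name: old_finetuned[name] - lam * (old_finetuned[name] - pretrained[name])
            for name in old_finetuned}
```

With `lam=1.0` this recovers the pretrained weights exactly; intermediate values interpolate between keeping and erasing the old fine-tuning.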
1 code implementation • 24 Oct 2023 • Jing Xiong, Chengming Li, Min Yang, Xiping Hu, Bin Hu
To this end, we design an Expression Syntax Information Bottleneck method for MWP (called ESIB) based on the variational information bottleneck, which extracts the essential features of the expression syntax tree while filtering out latent-specific redundancy containing syntax-irrelevant features.
1 code implementation • 4 Oct 2023 • Jing Xiong, Zixuan Li, Chuanyang Zheng, Zhijiang Guo, Yichun Yin, Enze Xie, Zhicheng Yang, Qingxing Cao, Haiming Wang, Xiongwei Han, Jing Tang, Chengming Li, Xiaodan Liang
Dual Queries first query the LLM to obtain LLM-generated knowledge, such as a chain-of-thought (CoT) rationale, and then query the retriever to obtain the final exemplars using both the question and the knowledge.
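The two-step query can be sketched as a small pipeline in which `query_llm` and `query_retriever` are caller-supplied stand-ins for the real LLM and retriever (hypothetical names, not the paper's API):

```python
def dual_query_exemplars(question, query_llm, query_retriever, k=4):
    # First query: ask the LLM for intermediate knowledge about the
    # question, e.g. a chain-of-thought rationale.
    knowledge = query_llm(question)
    # Second query: retrieve exemplars conditioned on both the question
    # and the LLM-generated knowledge.
    return query_retriever(question + "\n" + knowledge, k)
```

The point of the second query is that the retriever sees the rationale as well as the question, so exemplars are ranked by reasoning similarity, not just surface similarity.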
1 code implementation • Findings of the Association for Computational Linguistics: EMNLP 2022 • Yunshui Li, Junhao Liu, Chengming Li, Min Yang
In this paper, we propose a self-distillation framework with meta-learning (MetaSD) for knowledge graph completion with dynamic pruning, which aims to learn compressed graph embeddings and tackle long-tail samples.
Ranked #4 on Link Prediction on FB15k-237
no code implementations • 27 Oct 2022 • Jing Xiong, Zhongwei Wan, Xiping Hu, Min Yang, Chengming Li
Specifically, we first obtain a sub-network by pruning a roberta2tree model, and then use the gap in output distribution between the original roberta2tree model and the pruned sub-network to expose spuriously correlated samples.
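The output-distribution gap can be scored per sample as the KL divergence between the full model's and the pruned sub-network's predicted distributions, with the largest gaps flagging likely spurious-correlation samples. A minimal numpy sketch; `spurious_sample_scores` is an illustrative name, not the paper's code.

```python
import numpy as np

def spurious_sample_scores(full_probs, pruned_probs, eps=1e-9):
    # Per-sample KL(full || pruned) over the class axis: samples on which
    # the pruned sub-network diverges most from the full model get the
    # highest scores and are flagged as likely spurious-correlation samples.
    full = np.clip(full_probs, eps, 1.0)
    pruned = np.clip(pruned_probs, eps, 1.0)
    return np.sum(full * (np.log(full) - np.log(pruned)), axis=-1)
```

Samples where the two models agree score near zero; ranking by this score gives the exposure order.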
no code implementations • 4 Jan 2022 • Jingjing Yang, Haifeng Lu, Chengming Li, Xiping Hu, Bin Hu
Gait analysis provides a non-contact, low-cost, and efficient early screening method for depression.
no code implementations • 15 Jul 2021 • Lei Chen, Fajie Yuan, Jiaxi Yang, Min Yang, Chengming Li
To realize such a goal, we propose AdaRec, a knowledge distillation (KD) framework that compresses the knowledge of a teacher model into a student model adaptively according to its recommendation scene, using differentiable Neural Architecture Search (NAS).
no code implementations • 20 Jun 2021 • Yijiang Li, Wentian Cai, Ying Gao, Chengming Li, Xiping Hu
Local, detailed features from shallower layers, such as boundaries and tissue texture, are particularly important in medical image segmentation compared with natural image segmentation.
no code implementations • 15 Jun 2021 • Lei Chen, Fajie Yuan, Jiaxi Yang, Xiangnan He, Chengming Li, Min Yang
Fine-tuning works as an effective transfer learning technique for this objective, which adapts the parameters of a pre-trained model from the source domain to the target domain.
no code implementations • The Thirty-Fifth AAAI Conference on Artificial Intelligence 2021 • Chunpu Xu, Min Yang, Chengming Li, Ying Shen, Xiang Ao, Ruifeng Xu
Finally, we integrate the imaginary concepts and relational knowledge to generate a human-like story based on the original semantics of the images.
Ranked #2 on Visual Storytelling on VIST
no code implementations • COLING 2020 • Chunpu Xu, Yu Li, Chengming Li, Xiang Ao, Min Yang, Jinwen Tian
In this paper, we propose an Interactive key-value Memory-augmented Attention model for image Paragraph captioning (IMAP) to keep track of the attention history (salient-object coverage information) along with the update chain of the decoder state, thereby avoiding repetitive or incomplete image descriptions.