Search Results for author: Chengming Li

Found 28 papers, 12 papers with code

Amalgamating Knowledge from Two Teachers for Task-oriented Dialogue System with Adversarial Training

1 code implementation EMNLP 2020 Wanwei He, Min Yang, Rui Yan, Chengming Li, Ying Shen, Ruifeng Xu

Instead of adopting classic student-teacher learning, which forces the output of a student network to exactly mimic the soft targets produced by the teacher networks, we introduce two discriminators, as in a generative adversarial network (GAN), to transfer knowledge from the two teachers to the student.

Generative Adversarial Network Task-Oriented Dialogue Systems
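The two-teacher transfer described above is easy to picture as a standard GAN objective. Below is a minimal sketch, assuming pre-computed teacher output distributions and simple MLP discriminators; both are my assumptions, not details from the paper:

```python
import torch
import torch.nn as nn

VOCAB = 1000  # hypothetical output dimensionality

class Discriminator(nn.Module):
    """Judges whether an output distribution came from a teacher (real) or the student (fake)."""
    def __init__(self, dim=VOCAB):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 256), nn.ReLU(), nn.Linear(256, 1))

    def forward(self, dist):
        return self.net(dist)  # raw logit; positive means "looks like the teacher"

bce = nn.BCEWithLogitsLoss()

def adversarial_kd_step(student_dist, teacher1_dist, teacher2_dist, d1, d2):
    """One GAN-style step: rather than exactly matching soft targets, the student
    only has to produce outputs each discriminator cannot tell from its teacher's."""
    real = torch.ones(student_dist.size(0), 1)
    fake = torch.zeros(student_dist.size(0), 1)

    # Discriminator losses: teacher outputs are "real", student outputs are "fake".
    d_loss = (bce(d1(teacher1_dist), real) + bce(d1(student_dist.detach()), fake)
              + bce(d2(teacher2_dist), real) + bce(d2(student_dist.detach()), fake))

    # Student (generator) loss: fool both discriminators at once.
    g_loss = bce(d1(student_dist), real) + bce(d2(student_dist), real)
    return d_loss, g_loss
```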

UncertaintyRAG: Span-Level Uncertainty Enhanced Long-Context Modeling for Retrieval-Augmented Generation

no code implementations 3 Oct 2024 Zixuan Li, Jing Xiong, Fanghua Ye, Chuanyang Zheng, Xun Wu, Jianqiao Lu, Zhongwei Wan, Xiaodan Liang, Chengming Li, Zhenan Sun, Lingpeng Kong, Ngai Wong

We present UncertaintyRAG, a novel approach for long-context Retrieval-Augmented Generation (RAG) that utilizes Signal-to-Noise Ratio (SNR)-based span uncertainty to estimate similarity between text chunks.

Chunking Language Modelling +2
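The abstract does not spell out how the SNR is computed, so the following is one plausible, purely illustrative reading: treat the mean of a span's token log-probabilities as signal and their standard deviation as noise, then call two chunks similar when their SNR values agree. The function names and the 1/(1+|Δ|) similarity are my own:

```python
import numpy as np

def span_snr(token_logprobs):
    """Signal-to-noise ratio of a span's token log-probabilities: mean (signal)
    over standard deviation (noise). Illustrative definition only; the paper's
    exact formulation may differ."""
    lp = np.asarray(token_logprobs, dtype=float)
    return lp.mean() / (lp.std() + 1e-8)

def chunk_similarity(logprobs_a, logprobs_b):
    """Treat two chunks as similar when their uncertainty profiles agree:
    higher score when SNR values are close."""
    return 1.0 / (1.0 + abs(span_snr(logprobs_a) - span_snr(logprobs_b)))

# Toy usage with made-up log-probabilities for two chunks.
print(chunk_similarity([-1.2, -0.9, -1.1], [-1.0, -1.3, -0.8]))
```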

PersonaMath: Enhancing Math Reasoning through Persona-Driven Data Augmentation

no code implementations 2 Oct 2024 Jing Luo, Run Luo, Longze Chen, Liang Zhu, Chang Ao, Jiaming Li, Yukun Chen, Xin Cheng, Wen Yang, Jiayuan Su, Chengming Li, Min Yang

To bridge this gap, we propose a data augmentation approach and introduce PersonaMathQA, a dataset derived from MATH and GSM8K, on which we train the PersonaMath models.

Data Augmentation Diversity +3
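As a rough illustration of persona-driven augmentation, the sketch below re-poses a seed problem through several personas via an LLM call. `llm_generate`, the persona list, and the prompt wording are all hypothetical placeholders, not the paper's actual pipeline:

```python
# Hypothetical sketch: each seed problem is re-solved through several personas
# to diversify the augmented training data.

PERSONAS = ["a patient primary-school teacher", "a competition coach",
            "an engineer who reasons with units"]

def llm_generate(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client here")

def augment(problem: str) -> list[str]:
    samples = []
    for persona in PERSONAS:
        prompt = (f"You are {persona}. Solve the following math problem "
                  f"step by step in your own style.\n\nProblem: {problem}")
        samples.append(llm_generate(prompt))
    return samples
```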

Learning to Generalize Unseen Domains via Multi-Source Meta Learning for Text Classification

no code implementations 20 Sep 2024 Yuxuan Hu, Chenwei Zhang, Min Yang, Xiaodan Liang, Chengming Li, Xiping Hu

In this paper, we study the multi-source Domain Generalization of text classification and propose a framework to use multiple seen domains to train a model that can achieve high accuracy in an unseen domain.

Domain Generalization Meta-Learning +2
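A common way to realize "train on multiple seen domains so the model transfers to an unseen one" is episodic meta-learning: hold one seen domain out per step, adapt on the rest, and optimize the adapted model's loss on the held-out domain. The first-order sketch below follows that generic recipe and is not the paper's exact algorithm:

```python
import random
import torch

def meta_step(model, loss_fn, domain_batches, inner_lr=1e-3):
    """One episodic step: hold one seen domain out as meta-test, adapt on the
    rest, and require the adapted parameters to also work on the held-out
    domain. First-order sketch; the paper's algorithm may differ."""
    test_domain = random.choice(list(domain_batches))
    train_domains = [d for d in domain_batches if d != test_domain]

    # Inner step: adapt a copy of the parameters on the meta-train domains.
    fast = {n: p.clone() for n, p in model.named_parameters()}
    for d in train_domains:
        x, y = domain_batches[d]
        loss = loss_fn(torch.func.functional_call(model, fast, (x,)), y)
        grads = torch.autograd.grad(loss, list(fast.values()))
        fast = {n: p - inner_lr * g for (n, p), g in zip(fast.items(), grads)}

    # Outer objective: adapted parameters must generalize to the unseen domain.
    x_t, y_t = domain_batches[test_domain]
    return loss_fn(torch.func.functional_call(model, fast, (x_t,)), y_t)
```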

Training on the Benchmark Is Not All You Need

1 code implementation 3 Sep 2024 Shiwen Ni, Xiangtao Kong, Chengming Li, Xiping Hu, Ruifeng Xu, Jia Zhu, Min Yang

The success of Large Language Models (LLMs) relies heavily on the vast amount of data learned during the pre-training phase.

Multiple-choice

Fine-Tuning and Deploying Large Language Models Over Edges: Issues and Approaches

no code implementations 20 Aug 2024 Yanjie Dong, Haijun Zhang, Chengming Li, Song Guo, Victor C. M. Leung, Xiping Hu

Additionally, large-scale foundation models have expanded to create images, audio, videos, and multi-modal contents, further emphasizing the need for efficient deployment.

Model Compression

Lower Layer Matters: Alleviating Hallucination via Multi-Layer Fusion Contrastive Decoding with Truthfulness Refocused

no code implementations 16 Aug 2024 Dingwei Chen, Feiteng Fang, Shiwen Ni, Feng Liang, Ruifeng Xu, Min Yang, Chengming Li

Large Language Models (LLMs) have demonstrated exceptional performance across various natural language processing tasks, yet they occasionally tend to yield content that is factually inaccurate or inconsistent with the expected output, a phenomenon empirically referred to as "hallucination".

Hallucination TruthfulQA
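The title's contrastive decoding can be sketched with a single lower layer standing in for the paper's multi-layer fusion: contrast the final layer's distribution against an early-exit distribution so tokens that only the mature layer supports are favored. The `alpha` weight and the single-layer simplification are my assumptions:

```python
import torch

def contrastive_logits(final_logits, lower_logits, alpha=1.0):
    """Contrast the final layer against an early-exit lower layer: continuations
    the shallow layer already likes are damped, which tends to suppress
    hallucinated boilerplate. Single-layer sketch of the multi-layer fusion
    idea; the paper's fusion and truthfulness-refocusing terms are omitted."""
    final_lp = torch.log_softmax(final_logits, dim=-1)
    lower_lp = torch.log_softmax(lower_logits, dim=-1)
    return final_lp - alpha * lower_lp

# Toy usage: pick the next token from contrasted random logits.
next_token = contrastive_logits(torch.randn(1, 32000), torch.randn(1, 32000)).argmax(-1)
```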

AgentCourt: Simulating Court with Adversarial Evolvable Lawyer Agents

1 code implementation 15 Aug 2024 Guhong Chen, Liyang Fan, Zihan Gong, Nan Xie, Zixuan Li, Ziqiang Liu, Chengming Li, Qiang Qu, Shiwen Ni, Min Yang

Our core goal is to enable lawyer agents to learn how to argue a case, as well as to improve their overall legal skills, through simulation of the courtroom process.

APTNESS: Incorporating Appraisal Theory and Emotion Support Strategies for Empathetic Response Generation

1 code implementation 23 Jul 2024 Yuxuan Hu, Minghuan Tan, Chenwei Zhang, Zixuan Li, Xiaodan Liang, Min Yang, Chengming Li, Xiping Hu

By incorporating emotional support strategies, we aim to enrich the model's capabilities in both cognitive and affective empathy, leading to a more nuanced and comprehensive empathetic response.

Empathetic Response Generation Response Generation +2

Resource Allocation and Workload Scheduling for Large-Scale Distributed Deep Learning: A Survey

no code implementations 12 Jun 2024 Feng Liang, Zhen Zhang, Haifeng Lu, Chengming Li, Victor C. M. Leung, Yanyi Guo, Xiping Hu

The large-scale environment with large volumes of datasets, models, and computational and communication resources raises various unique challenges for resource allocation and workload scheduling in distributed deep learning, such as scheduling complexity, resource and workload heterogeneity, and fault tolerance.

Deep Learning Scheduling +1

CPsyCoun: A Report-based Multi-turn Dialogue Reconstruction and Evaluation Framework for Chinese Psychological Counseling

2 code implementations 26 May 2024 Chenhao Zhang, Renhao Li, Minghuan Tan, Min Yang, Jingwei Zhu, Di Yang, Jiahao Zhao, Guancheng Ye, Chengming Li, Xiping Hu

To bridge the gap, we propose CPsyCoun, a report-based multi-turn dialogue reconstruction and evaluation framework for Chinese psychological counseling.

CPsyExam: A Chinese Benchmark for Evaluating Psychology using Examinations

1 code implementation 16 May 2024 Jiahao Zhao, Jingwei Zhu, Minghuan Tan, Min Yang, Di Yang, Chenhao Zhang, Guancheng Ye, Chengming Li, Xiping Hu

In this paper, we introduce a novel psychological benchmark, CPsyExam, constructed from questions sourced from Chinese language examinations.


CLHA: A Simple yet Effective Contrastive Learning Framework for Human Alignment

1 code implementation 25 Mar 2024 Feiteng Fang, Liang Zhu, Min Yang, Xi Feng, Jinchang Hou, Qixuan Zhao, Chengming Li, Xiping Hu, Ruifeng Xu

Reinforcement learning from human feedback (RLHF) is a crucial technique in aligning large language models (LLMs) with human preferences, ensuring these LLMs behave in ways that are beneficial and comprehensible to users.

Contrastive Learning reinforcement-learning +1
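As a generic illustration of a contrastive objective for human alignment (not CLHA's exact loss), one can push the log-likelihood of the human-preferred response above that of the rejected one by a margin:

```python
import torch
import torch.nn.functional as F

def clha_style_loss(logp_chosen, logp_rejected, margin=0.1):
    """Illustrative pairwise contrastive objective for alignment: the policy's
    log-likelihood of the preferred response must exceed the rejected one's
    by a margin. A generic ranking loss, not CLHA's exact formulation."""
    return F.relu(margin - (logp_chosen - logp_rejected)).mean()

# Toy usage with per-sequence log-likelihoods from a policy model.
loss = clha_style_loss(torch.tensor([-10.2, -8.1]), torch.tensor([-9.8, -12.0]))
```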

MoZIP: A Multilingual Benchmark to Evaluate Large Language Models in Intellectual Property

1 code implementation 26 Feb 2024 Shiwen Ni, Minghuan Tan, Yuelin Bai, Fuqiang Niu, Min Yang, BoWen Zhang, Ruifeng Xu, Xiaojun Chen, Chengming Li, Xiping Hu, Ye Li, Jianping Fan

In this paper, we contribute a new benchmark, the first Multilingual-oriented quiZ on Intellectual Property (MoZIP), for the evaluation of LLMs in the IP domain.

Language Modelling Large Language Model +2

Layer-wise Regularized Dropout for Neural Language Models

no code implementations 26 Feb 2024 Shiwen Ni, Min Yang, Ruifeng Xu, Chengming Li, Xiping Hu

To solve the inconsistency between training and inference caused by the randomness of dropout, some studies use consistency training to regularize dropout at the output layer.

Abstractive Text Summarization Machine Translation +1
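The output-layer consistency training mentioned above is typically implemented by running the same input through the network twice, so dropout yields two sub-models, and penalizing the divergence between their predictions. A sketch of that baseline follows; the paper's contribution extends the regularization layer-wise:

```python
import torch
import torch.nn.functional as F

def dropout_consistency(model, x, alpha=1.0):
    """Consistency regularization over dropout: two stochastic forward passes
    of the same input give two sub-models; penalize the symmetric KL between
    their output distributions. Output-layer-only sketch; the paper applies
    the regularization to intermediate layers as well."""
    logits1, logits2 = model(x), model(x)  # dropout differs between passes
    p = F.log_softmax(logits1, dim=-1)
    q = F.log_softmax(logits2, dim=-1)
    kl = F.kl_div(p, q, log_target=True, reduction="batchmean")
    kl += F.kl_div(q, p, log_target=True, reduction="batchmean")
    return alpha * kl / 2
```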

E-EVAL: A Comprehensive Chinese K-12 Education Evaluation Benchmark for Large Language Models

1 code implementation 29 Jan 2024 Jinchang Hou, Chang Ao, Haihong Wu, Xiangtao Kong, Zhigang Zheng, Daijia Tang, Chengming Li, Xiping Hu, Ruifeng Xu, Shiwen Ni, Min Yang

The integration of LLMs and education is becoming ever closer; however, there is currently no benchmark for evaluating LLMs that focuses on the Chinese K-12 education domain.

Ethics Multiple-choice

Forgetting before Learning: Utilizing Parametric Arithmetic for Knowledge Updating in Large Language Models

no code implementations 14 Nov 2023 Shiwen Ni, Dingwei Chen, Chengming Li, Xiping Hu, Ruifeng Xu, Min Yang

In this paper, we propose a new paradigm for fine-tuning called F-Learning (Forgetting before Learning), which employs parametric arithmetic to facilitate the forgetting of old knowledge and learning of new knowledge.
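The parametric arithmetic behind F-Learning can be pictured as subtracting the delta of a model fine-tuned on old knowledge before learning new knowledge. The sketch below shows that idea on raw state dicts; the scaling factor `lam` and the exact recipe are assumptions on my part:

```python
import torch

def f_learning_forget(base_sd, old_ft_sd, lam=1.0):
    """Parametric arithmetic in the spirit of F-Learning: subtract the
    'old knowledge' delta (parameters fine-tuned on old facts minus the base
    parameters) before fine-tuning on new facts. Sketch of the idea only;
    the paper's exact scaling and procedure may differ."""
    return {k: base_sd[k] - lam * (old_ft_sd[k] - base_sd[k]) for k in base_sd}

# Usage: model.load_state_dict(f_learning_forget(base_sd, old_ft_sd)),
# then fine-tune the resulting model on the new knowledge.
```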

Expression Syntax Information Bottleneck for Math Word Problems

1 code implementation 24 Oct 2023 Jing Xiong, Chengming Li, Min Yang, Xiping Hu, Bin Hu

To this end, we design an Expression Syntax Information Bottleneck method for MWP (called ESIB) based on the variational information bottleneck, which extracts the essential features of the expression syntax tree while filtering out latent-specific redundancy containing syntax-irrelevant features.

Math

DQ-LoRe: Dual Queries with Low Rank Approximation Re-ranking for In-Context Learning

1 code implementation 4 Oct 2023 Jing Xiong, Zixuan Li, Chuanyang Zheng, Zhijiang Guo, Yichun Yin, Enze Xie, Zhicheng Yang, Qingxing Cao, Haiming Wang, Xiongwei Han, Jing Tang, Chengming Li, Xiaodan Liang

Dual Queries first queries the LLM to obtain LLM-generated knowledge, such as chain-of-thought (CoT) reasoning, and then queries the retriever to obtain the final exemplars using both the question and that knowledge.

Dimensionality Reduction In-Context Learning +1
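The low-rank re-ranking stage can be sketched with a plain SVD projection: after the dual queries retrieve candidate exemplars, project their embeddings, together with the CoT-augmented query, into a low-rank subspace and score by cosine similarity there. The rank and the scoring below are illustrative, not the paper's exact re-ranker:

```python
import numpy as np

def low_rank_rerank(query_emb, exemplar_embs, rank=32, top_k=8):
    """Re-rank candidate exemplars in a low-rank subspace obtained via SVD.
    query_emb: (d,) embedding of the question plus LLM-generated knowledge.
    exemplar_embs: (n, d) embeddings of retrieved candidates."""
    X = np.vstack([query_emb[None, :], exemplar_embs])
    X = X - X.mean(axis=0)                      # center before factorizing
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    proj = X @ vt[:rank].T                      # project into the low-rank space
    q, e = proj[0], proj[1:]
    scores = e @ q / (np.linalg.norm(e, axis=1) * np.linalg.norm(q) + 1e-8)
    return np.argsort(-scores)[:top_k]          # indices of the chosen exemplars
```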

Self-Distillation with Meta Learning for Knowledge Graph Completion

1 code implementation Findings of the Association for Computational Linguistics: EMNLP 2022 Yunshui Li, Junhao Liu, Chengming Li, Min Yang

In this paper, we propose a self-distillation framework with meta-learning (MetaSD) for knowledge graph completion with dynamic pruning, which aims to learn compressed graph embeddings and tackle long-tail samples.

Knowledge Graph Completion Meta-Learning +1

Self-consistent Reasoning For Solving Math Word Problems

no code implementations 27 Oct 2022 Jing Xiong, Zhongwei Wan, Xiping Hu, Min Yang, Chengming Li

Specifically, we first obtain a sub-network by pruning a roberta2tree model, so that the gap in output distribution between the original roberta2tree model and the pruned sub-network can be used to expose spuriously correlated samples.

Math
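The selection signal described above, the output-distribution gap between the full model and its pruned sub-network, can be sketched as a per-sample KL divergence; samples with the largest gap are flagged as relying on spurious correlations. The KL choice is my assumption:

```python
import torch
import torch.nn.functional as F

def spurious_score(full_logits, pruned_logits):
    """Per-sample gap between the full model and its pruned sub-network:
    samples on which the two disagree most are treated as spuriously
    correlated. KL-based sketch of the selection signal described above."""
    p = F.log_softmax(full_logits, dim=-1)   # full roberta2tree model
    q = F.log_softmax(pruned_logits, dim=-1)  # pruned sub-network
    return F.kl_div(q, p, log_target=True, reduction="none").sum(-1)

# Rank a toy batch: higher score = more likely spurious.
scores = spurious_score(torch.randn(4, 10), torch.randn(4, 10))
```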

Scene-adaptive Knowledge Distillation for Sequential Recommendation via Differentiable Architecture Search

no code implementations 15 Jul 2021 Lei Chen, Fajie Yuan, Jiaxi Yang, Min Yang, Chengming Li

To realize such a goal, we propose AdaRec, a knowledge distillation (KD) framework which compresses knowledge of a teacher model into a student model adaptively according to its recommendation scene by using differentiable Neural Architecture Search (NAS).

Knowledge Distillation Neural Architecture Search +1

More than Encoder: Introducing Transformer Decoder to Upsample

no code implementations 20 Jun 2021 Yijiang Li, Wentian Cai, Ying Gao, Chengming Li, Xiping Hu

Local and detailed features from shallower layers, such as boundaries and tissue texture, are particularly important in medical segmentation compared with natural image segmentation.

Decoder Image Segmentation +4

User-specific Adaptive Fine-tuning for Cross-domain Recommendations

no code implementations 15 Jun 2021 Lei Chen, Fajie Yuan, Jiaxi Yang, Xiangnan He, Chengming Li, Min Yang

Fine-tuning is an effective transfer learning technique for this objective, adapting the parameters of a pre-trained model from the source domain to the target domain.

Recommendation Systems Transfer Learning

Interactive Key-Value Memory-augmented Attention for Image Paragraph Captioning

no code implementations COLING 2020 Chunpu Xu, Yu Li, Chengming Li, Xiang Ao, Min Yang, Jinwen Tian

In this paper, we propose an Interactive key-value Memory-augmented Attention model for image Paragraph captioning (IMAP) to keep track of the attention history (salient objects coverage information) along with the update-chain of the decoder state and therefore avoid generating repetitive or incomplete image descriptions.

Decoder Image Paragraph Captioning
