Search Results for author: Fanqi Wan

Found 16 papers, 13 papers with code

FuseRL: Dense Preference Optimization for Heterogeneous Model Fusion

no code implementations • 9 Apr 2025 • Longguang Zhong, Fanqi Wan, ZiYi Yang, Guosheng Liang, Tianyuan Shi, Xiaojun Quan

Heterogeneous model fusion enhances the performance of LLMs by integrating the knowledge and capabilities of multiple structurally diverse models.

FuseChat-3.0: Preference Optimization Meets Heterogeneous Model Fusion

1 code implementation • 6 Mar 2025 • ZiYi Yang, Fanqi Wan, Longguang Zhong, Canbin Huang, Guosheng Liang, Xiaojun Quan

The FuseChat-3.0 training pipeline consists of two key stages: (1) supervised fine-tuning (SFT) to align the target and source model distributions, and (2) Direct Preference Optimization (DPO) to apply preferences from multiple source LLMs to fine-tune the target model.

General Knowledge • Instruction Following • +1
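
As a rough illustration of stage (2), here is the standard DPO objective in PyTorch; the tensors below are random stand-ins for per-response log-probabilities, and the SFT stage (1) is ordinary cross-entropy training, omitted here.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Standard DPO objective (stage 2 of the pipeline): prefer the chosen
    response over the rejected one, measured against a frozen reference."""
    chosen_ratio = policy_chosen_logps - ref_chosen_logps
    rejected_ratio = policy_rejected_logps - ref_rejected_logps
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()

# Toy usage with random log-probabilities standing in for model outputs.
lp = torch.randn(4)
loss = dpo_loss(lp, lp - 0.5, torch.zeros(4), torch.zeros(4))
print(float(loss))
```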

Advantage-Guided Distillation for Preference Alignment in Small Language Models

1 code implementation • 25 Feb 2025 • Shiping Gao, Fanqi Wan, Jiajian Guo, Xiaojun Quan, Qifan Wang

Instead of directly applying existing alignment techniques to SLMs, we propose to utilize a well-aligned teacher LLM to guide the alignment process for these models, thereby facilitating the transfer of the teacher's knowledge of human preferences to the student model.

Knowledge Distillation
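
A minimal sketch of what teacher-guided preference distillation could look like, assuming a per-token advantage signal derived from the aligned teacher; the weighting scheme below is an illustrative assumption, not the paper's exact objective.

```python
import torch
import torch.nn.functional as F

def advantage_weighted_kd_loss(student_logits, teacher_logits, advantage, tau=1.0):
    """Hypothetical sketch: distill an aligned teacher into a small student,
    scaling the per-token KL by an advantage estimate derived from the teacher
    (higher advantage -> that token matters more for alignment).
    `advantage` is (batch, seq_len); logits are (batch, seq_len, vocab)."""
    t = F.log_softmax(teacher_logits / tau, dim=-1)
    s = F.log_softmax(student_logits / tau, dim=-1)
    kl = (t.exp() * (t - s)).sum(-1)      # per-token KL(teacher || student)
    w = torch.softmax(advantage, dim=-1)  # normalize weights over the sequence
    return (w * kl).sum(-1).mean() * tau ** 2

# Toy usage with random tensors standing in for model outputs.
B, L, V = 2, 5, 11
loss = advantage_weighted_kd_loss(torch.randn(B, L, V), torch.randn(B, L, V),
                                  torch.randn(B, L))
print(float(loss))
```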

Weighted-Reward Preference Optimization for Implicit Model Fusion

4 code implementations • 4 Dec 2024 • ZiYi Yang, Fanqi Wan, Longguang Zhong, Tianyuan Shi, Xiaojun Quan

To address distributional deviations between the source and target LLMs, WRPO introduces a progressive adaptation strategy that gradually shifts reliance on preferred examples from the target LLM to the source LLMs.

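A hedged sketch of the progressive adaptation idea in a DPO-style objective: an interpolation weight alpha anneals so the "chosen" reward shifts gradually from target-model responses to source-model responses. The exact schedule and weighting are assumptions for illustration, not the paper's equations.

```python
import torch
import torch.nn.functional as F

def wrpo_style_loss(src_chosen_lr, tgt_chosen_lr, rejected_lr, alpha, beta=0.1):
    """Sketch: the 'chosen' reward interpolates between a preferred response
    from a source LLM and one from the target LLM; `alpha` anneals from 0 to 1
    so reliance shifts gradually to the source models. The `*_lr` inputs are
    policy-vs-reference log-ratios."""
    fused_chosen = alpha * src_chosen_lr + (1 - alpha) * tgt_chosen_lr
    return -F.logsigmoid(beta * (fused_chosen - rejected_lr)).mean()

# Anneal alpha over training steps (a linear schedule, as one simple choice).
for step in range(0, 1001, 250):
    alpha = min(1.0, step / 1000)
    loss = wrpo_style_loss(torch.randn(4), torch.randn(4), torch.randn(4), alpha)
    print(step, float(loss))
```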

FuseChat: Knowledge Fusion of Chat Models

3 code implementations • 15 Aug 2024 • Fanqi Wan, Longguang Zhong, ZiYi Yang, Ruijun Chen, Xiaojun Quan

In this work, we propose a new framework for the knowledge fusion of chat LLMs through two main stages, resulting in FuseChat.

Instruction Following

ProFuser: Progressive Fusion of Large Language Models

no code implementations • 9 Aug 2024 • Tianyuan Shi, Fanqi Wan, Canbin Huang, Xiaojun Quan, Chenliang Li, Ming Yan, Ji Zhang

While fusing the capacities and advantages of various large language models (LLMs) offers a pathway to construct more powerful and versatile models, a fundamental challenge is to properly select advantageous models during training.


Self-Evolution Fine-Tuning for Policy Optimization

no code implementations • 16 Jun 2024 • Ruijun Chen, Jiehao Liang, Shiping Gao, Fanqi Wan, Xiaojun Quan

In this paper, we introduce self-evolution fine-tuning (SEFT) for policy optimization, with the aim of eliminating the need for annotated samples while retaining the stability and efficiency of SFT.
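
One plausible reading of annotation-free fine-tuning, sketched with toy stand-ins; the paper's actual mechanism may differ substantially. Everything here (the generator, scorer, and update functions) is illustrative.

```python
import random

def self_evolution_step(generate, score, fine_tune, prompts, k=4):
    """One round: sample k candidates per prompt, keep the highest scoring
    (no human annotation), and run a plain SFT update on the kept pairs."""
    batch = []
    for p in prompts:
        candidates = [generate(p) for _ in range(k)]
        batch.append((p, max(candidates, key=score)))
    fine_tune(batch)

# Toy stand-ins so the sketch runs end to end.
self_evolution_step(generate=lambda p: p + " -> " + str(random.random()),
                    score=len, fine_tune=print,
                    prompts=["hello", "world"])
```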

BlockPruner: Fine-grained Pruning for Large Language Models

1 code implementation • 15 Jun 2024 • Longguang Zhong, Fanqi Wan, Ruijun Chen, Xiaojun Quan, Liangzhi Li

While various layer pruning methods have been developed based on this insight, they generally overlook the finer-grained redundancies within the layers themselves.
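
A minimal sketch of pruning at a granularity finer than whole layers: score each sub-block (e.g., a layer's attention or MLP module) by how much bypassing it hurts a proxy metric, then drop the least important ones. The scoring below is an illustrative assumption.

```python
def prune_blocks(blocks, importance, keep):
    """blocks: list of (name, module) pairs; importance: name -> damage score
    (e.g., perplexity increase when the block is bypassed); keep: how many
    blocks to retain. Returns the surviving blocks, most important first."""
    ranked = sorted(blocks, key=lambda b: importance[b[0]], reverse=True)
    return ranked[:keep]

# Toy usage: each transformer layer contributes two prunable sub-blocks.
blocks = [(f"layer{i}.{part}", None) for i in range(4) for part in ("attn", "mlp")]
importance = {name: hash(name) % 100 for name, _ in blocks}  # stand-in scores
for name, _ in prune_blocks(blocks, importance, keep=5):
    print(name)
```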

Knowledge Fusion of Chat LLMs: A Preliminary Technical Report

2 code implementations • 25 Feb 2024 • Fanqi Wan, ZiYi Yang, Longguang Zhong, Xiaojun Quan, Xinting Huang, Wei Bi

Recently, FuseLLM introduced the concept of knowledge fusion to transfer the collective knowledge of multiple structurally varied LLMs into a target LLM through lightweight continual training.
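
A rough sketch of distribution-level knowledge fusion via a KL term during lightweight continual training, assuming the source vocabularies are already aligned (real systems must handle vocabulary alignment explicitly); the averaging below is one simple fusion choice, not necessarily the paper's.

```python
import torch
import torch.nn.functional as F

def fusion_loss(target_logits, source_logits_list, tau=1.0):
    """Average the source models' token distributions and pull the target's
    distribution toward the fused one with a KL term."""
    fused = torch.stack([F.softmax(s / tau, dim=-1)
                         for s in source_logits_list]).mean(0)
    logp = F.log_softmax(target_logits / tau, dim=-1)
    return F.kl_div(logp, fused, reduction="batchmean")

# Toy usage with two source models over a shared 16-token vocabulary.
B, V = 3, 16
loss = fusion_loss(torch.randn(B, V), [torch.randn(B, V), torch.randn(B, V)])
print(float(loss))
```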

Knowledge Verification to Nip Hallucination in the Bud

1 code implementation • 19 Jan 2024 • Fanqi Wan, Xinting Huang, Leyang Cui, Xiaojun Quan, Wei Bi, Shuming Shi

While large language models (LLMs) have demonstrated exceptional performance across various tasks following human alignment, they may still generate responses that sound plausible but contradict factual knowledge, a phenomenon known as hallucination.

Hallucination • World Knowledge

Knowledge Fusion of Large Language Models

3 code implementations • 19 Jan 2024 • Fanqi Wan, Xinting Huang, Deng Cai, Xiaojun Quan, Wei Bi, Shuming Shi

In this paper, we introduce the notion of knowledge fusion for LLMs, aimed at combining the capabilities of existing LLMs and transferring them into a single LLM.

Code Generation • Common Sense Reasoning • +6

PsyCoT: Psychological Questionnaire as Powerful Chain-of-Thought for Personality Detection

1 code implementation • 31 Oct 2023 • Tao Yang, Tianyuan Shi, Fanqi Wan, Xiaojun Quan, Qifan Wang, Bingzhe Wu, Jiaxiang Wu

Drawing inspiration from Psychological Questionnaires, which are carefully designed by psychologists to evaluate individual personality traits through a series of targeted items, we argue that these items can be regarded as a collection of well-structured chain-of-thought (CoT) processes.
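
A small sketch of the prompting idea: each questionnaire item becomes one step of the chain of thought before a final trait judgment. The items below are invented placeholders, not drawn from a validated questionnaire.

```python
ITEMS = [
    "The author seems to enjoy social gatherings.",
    "The author appears energized by interacting with others.",
]

def build_psycot_prompt(text, items=ITEMS, trait="Extraversion"):
    """Turn questionnaire items into a step-by-step rating prompt."""
    lines = [f"Text: {text}",
             "Rate each statement from 1 (disagree) to 5 (agree), reasoning step by step:"]
    lines += [f"{i}. {item}" for i, item in enumerate(items, 1)]
    lines.append(f"Finally, conclude whether the author is high or low in {trait}.")
    return "\n".join(lines)

print(build_psycot_prompt("Had a great time at the party last night!"))
```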

Explore-Instruct: Enhancing Domain-Specific Instruction Coverage through Active Exploration

3 code implementations • 13 Oct 2023 • Fanqi Wan, Xinting Huang, Tao Yang, Xiaojun Quan, Wei Bi, Shuming Shi

Instruction-tuning can be substantially optimized through enhanced diversity, resulting in models capable of handling a broader spectrum of tasks.

Diversity
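
A hedged sketch of active exploration over a domain task space: starting from a root task, repeatedly ask a proposer to generate finer-grained subtasks and keep novel ones, growing coverage breadth-first. The toy proposer stands in for an LLM call; the real method's search strategy may differ.

```python
from collections import deque

def explore(root, propose, max_tasks=10):
    """Breadth-first expansion of a task tree, deduplicating as it grows."""
    seen, queue, tasks = {root}, deque([root]), []
    while queue and len(tasks) < max_tasks:
        task = queue.popleft()
        tasks.append(task)
        for sub in propose(task):
            if sub not in seen:
                seen.add(sub)
                queue.append(sub)
    return tasks

# Toy proposer: deterministic children standing in for LLM-generated subtasks.
print(explore("rewriting", lambda t: [f"{t}/formal", f"{t}/concise"]))
```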

Retrieval-Generation Alignment for End-to-End Task-Oriented Dialogue System

1 code implementation • 13 Oct 2023 • Weizhou Shen, Yingqi Gao, Canbin Huang, Fanqi Wan, Xiaojun Quan, Wei Bi

The results demonstrate that when combined with meta knowledge, the response generator can effectively leverage high-quality knowledge records from the retriever and enhance the quality of generated responses.

Response Generation • Retrieval • +1
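
A rough sketch of pairing retrieved records with meta knowledge (provenance and retriever score) before generation, so the generator can weigh each record; the field names and format are invented for illustration.

```python
def build_generator_input(query, records):
    """Annotate each retrieved record with its domain and retriever score,
    then assemble the generator's input."""
    lines = [f"User: {query}", "Knowledge records:"]
    for r in records:
        lines.append(f"- [{r['domain']} | retriever score {r['score']:.2f}] {r['text']}")
    lines.append("Response:")
    return "\n".join(lines)

records = [
    {"domain": "hotel", "score": 0.92, "text": "Alpha Inn: 4 stars, city centre."},
    {"domain": "hotel", "score": 0.41, "text": "Beta Lodge: 2 stars, north side."},
]
print(build_generator_input("Find me a good hotel downtown.", records))
```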

Multi-Grained Knowledge Retrieval for End-to-End Task-Oriented Dialog

1 code implementation • 17 May 2023 • Fanqi Wan, Weizhou Shen, Ke Yang, Xiaojun Quan, Wei Bi

Retrieving proper domain knowledge from an external database lies at the heart of end-to-end task-oriented dialog systems to generate informative responses.

Attribute • Response Generation • +1
