Search Results for author: Xiaofeng Mou

Found 6 papers, 3 papers with code

EPSD: Early Pruning with Self-Distillation for Efficient Model Compression

no code implementations • 31 Jan 2024 • Dong Chen, Ning Liu, Yichen Zhu, Zhengping Che, Rui Ma, Fachao Zhang, Xiaofeng Mou, Yi Chang, Jian Tang

Rather than simply combining pruning with self-distillation (SD), EPSD makes the pruned network amenable to SD by retaining more distillable weights before training, which ensures better distillation of the pruned network (see the sketch after this entry).

Tasks: Knowledge Distillation, Network Pruning, +1
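
The prune-then-self-distill pattern the abstract refers to can be shown in a minimal sketch. EPSD's actual criterion for keeping "distillable" weights is not reproduced here; plain magnitude pruning and a standard temperature-scaled KD loss stand in for it, so treat this as an illustration of the generic pattern, not the paper's method.

```python
# Minimal sketch of prune-before-training plus self-distillation.
# Magnitude pruning and a vanilla KD loss are stand-ins for EPSD's
# distillability-aware criterion (an assumption, not the paper's code).
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

def magnitude_prune(model: nn.Module, sparsity: float = 0.5) -> dict:
    """Zero out the smallest-magnitude weights; return the binary masks."""
    masks = {}
    for name, p in model.named_parameters():
        if p.dim() > 1:  # prune weight matrices, keep biases dense
            k = int(p.numel() * sparsity)
            threshold = p.abs().flatten().kthvalue(k).values
            masks[name] = (p.abs() > threshold).float()
            p.data.mul_(masks[name])
    return masks

def self_distill_step(student, teacher, x, y, optimizer, masks, T=4.0, alpha=0.5):
    """One step: task loss + KL toward a dense copy of the same network."""
    with torch.no_grad():
        t_logits = teacher(x)
    s_logits = student(x)
    kd = F.kl_div(F.log_softmax(s_logits / T, dim=-1),
                  F.softmax(t_logits / T, dim=-1),
                  reduction="batchmean") * T * T
    loss = alpha * kd + (1 - alpha) * F.cross_entropy(s_logits, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # re-apply the pruning masks so pruned weights stay zero
    for name, p in student.named_parameters():
        if name in masks:
            p.data.mul_(masks[name])
    return loss.item()

# toy usage
net = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
teacher = copy.deepcopy(net)          # dense self-teacher
masks = magnitude_prune(net, 0.5)     # prune *before* training, as in EPSD
opt = torch.optim.SGD(net.parameters(), lr=0.1)
x, y = torch.randn(8, 16), torch.randint(0, 4, (8,))
self_distill_step(net, teacher, x, y, opt, masks)
```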

LLaVA-Phi: Efficient Multi-Modal Assistant with Small Language Model

1 code implementation • 4 Jan 2024 • Yichen Zhu, Minjie Zhu, Ning Liu, Zhicai Ou, Xiaofeng Mou, Jian Tang

In this paper, we introduce LLaVA-$\phi$ (LLaVA-Phi), an efficient multi-modal assistant that harnesses the power of the recently advanced small language model Phi-2 to facilitate multi-modal dialogues (a toy architecture sketch follows this entry).

Tasks: Language Modelling, Visual Question Answering
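
LLaVA-Phi follows the LLaVA recipe of projecting vision-encoder features into the language model's token embedding space and prepending them to the text tokens. The sketch below shows that wiring only; the dimensions, module names, and the tiny transformer standing in for Phi-2 are all illustrative assumptions, not the paper's configuration.

```python
# Toy LLaVA-style wiring: project visual features into the LM embedding
# space and prepend them to the text sequence. Sizes are illustrative.
import torch
import torch.nn as nn

class TinyLLaVA(nn.Module):
    def __init__(self, vision_dim=768, lm_dim=512, vocab=50_000):
        super().__init__()
        self.projector = nn.Linear(vision_dim, lm_dim)   # vision -> LM space
        self.embed = nn.Embedding(vocab, lm_dim)
        # tiny stand-in for the small language model (Phi-2 in the paper)
        layer = nn.TransformerEncoderLayer(lm_dim, nhead=8, batch_first=True)
        self.lm = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(lm_dim, vocab)

    def forward(self, image_feats, input_ids):
        img_tokens = self.projector(image_feats)          # (B, N_img, lm_dim)
        txt_tokens = self.embed(input_ids)                # (B, N_txt, lm_dim)
        seq = torch.cat([img_tokens, txt_tokens], dim=1)  # image tokens first
        return self.head(self.lm(seq))

model = TinyLLaVA()
feats = torch.randn(1, 16, 768)          # e.g. patch features from a ViT
ids = torch.randint(0, 50_000, (1, 12))
logits = model(feats, ids)               # (1, 28, 50_000)
```

In the LLaVA family the projector is the key trainable bridge: it is tuned so that visual tokens land in a region of the embedding space the language model can attend over like ordinary words.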

ScaleKD: Distilling Scale-Aware Knowledge in Small Object Detector

no code implementations • CVPR 2023 • Yichen Zhu, Qiqi Zhou, Ning Liu, Zhiyuan Xu, Zhicai Ou, Xiaofeng Mou, Jian Tang

Unlike existing works that struggle to balance the trade-off between inference speed and small-object detection (SOD) performance, we propose Scale-aware Knowledge Distillation (ScaleKD), which transfers knowledge from a complex teacher model to a compact student model (a generic feature-distillation sketch follows this entry).

Tasks: Knowledge Distillation, Object Detection, +2
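
Detector KD methods like ScaleKD are typically built on teacher-to-student feature distillation. The sketch below shows only that generic backbone; ScaleKD's scale-aware components are not reproduced, and the 1x1-conv adapter, channel counts, and MSE loss are illustrative assumptions.

```python
# Generic teacher->student feature distillation for detectors: adapt the
# student's feature map to the (frozen) teacher's channels and resolution,
# then penalize the mismatch. A stand-in for, not a copy of, ScaleKD.
import torch
import torch.nn as nn
import torch.nn.functional as F

def feature_kd_loss(student_feat, teacher_feat, adapter):
    """MSE between adapted student features and teacher features."""
    s = adapter(student_feat)                      # align channel dims
    if s.shape[-2:] != teacher_feat.shape[-2:]:    # align spatial dims
        s = F.interpolate(s, size=teacher_feat.shape[-2:],
                          mode="bilinear", align_corners=False)
    return F.mse_loss(s, teacher_feat)

# toy usage: student FPN level (128 ch) vs teacher FPN level (256 ch)
adapter = nn.Conv2d(128, 256, kernel_size=1)
s_feat = torch.randn(2, 128, 32, 32)
t_feat = torch.randn(2, 256, 64, 64)
loss = feature_kd_loss(s_feat, t_feat, adapter)
```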

Is MultiWOZ a Solved Task? An Interactive TOD Evaluation Framework with User Simulator

1 code implementation • 26 Oct 2022 • Qinyuan Cheng, Linyang Li, Guofeng Quan, Feng Gao, Xiaofeng Mou, Xipeng Qiu

In addition, we introduce a sentence-level and a session-level score to measure sentence fluency and session coherence in the interactive evaluation (an illustrative fluency scorer follows this entry).

Tasks: Sentence
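
One common way to realize a sentence-level fluency score is language-model perplexity (lower = more fluent). The sketch below uses GPT-2 as the scorer, which is an assumption for illustration, not necessarily the paper's choice of model or metric.

```python
# Illustrative sentence-level fluency score via LM perplexity.
# GPT-2 is used purely as an example scorer (an assumption).
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def fluency_ppl(sentence: str) -> float:
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        # labels=ids makes the model return mean token cross-entropy
        loss = model(ids, labels=ids).loss
    return torch.exp(loss).item()

print(fluency_ppl("I have booked a table for two at 7pm."))
```

A session-level coherence score could then aggregate per-turn scores over a whole dialogue, though the paper's actual definition may differ.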

Few Clean Instances Help Denoising Distant Supervision

1 code implementation • COLING 2022 • Yufang Liu, Ziyin Huang, Yijun Wang, Changzhi Sun, Man Lan, Yuanbin Wu, Xiaofeng Mou, Ding Wang

Existing distantly supervised relation extractors usually rely on noisy data for both model training and evaluation, which can produce garbage-in-garbage-out systems (a simplified clean-set-guided filter is sketched after this entry).

Tasks: Denoising
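
The role of a few clean instances can be illustrated with a simplified clean-set-guided filter: fit a scorer on the clean data, then drop noisy distantly supervised examples whose labels the scorer rejects. This is a generic stand-in, not the paper's method; the synthetic data, logistic-regression scorer, and 0.5 threshold below are all assumptions.

```python
# Simplified clean-set-guided denoising of distant supervision:
# a scorer trained on few clean instances filters the noisy pool.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_clean = rng.normal(size=(50, 16))      # few trusted instances
y_clean = rng.integers(0, 2, 50)
X_noisy = rng.normal(size=(500, 16))     # large distantly labeled pool
y_noisy = rng.integers(0, 2, 500)

scorer = LogisticRegression(max_iter=1000).fit(X_clean, y_clean)
# probability the scorer assigns to each noisy example's distant label
p = scorer.predict_proba(X_noisy)[np.arange(len(y_noisy)), y_noisy]
keep = p > 0.5                           # drop labels the clean model rejects
X_train = np.vstack([X_clean, X_noisy[keep]])
y_train = np.concatenate([y_clean, y_noisy[keep]])
print(f"kept {keep.sum()} of {len(y_noisy)} noisy examples")
```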
