Search Results for author: Yuqiao Wen

Found 7 papers, 4 papers with code

Revisiting Intermediate-Layer Matching in Knowledge Distillation: Layer-Selection Strategy Doesn't Matter (Much)

no code implementations • 6 Feb 2025 • Zony Yu, Yuqiao Wen, Lili Mou

Knowledge distillation (KD) is a popular method of transferring knowledge from a large "teacher" model to a small "student" model.

Knowledge Distillation

Exploring Model Invariance with Discrete Search for Ultra-Low-Bit Quantization

no code implementations • 6 Feb 2025 • Yuqiao Wen, Yanshuai Cao, Lili Mou

Large language models have been increasing in size due to their success in a wide range of applications.

Quantization

Tree-Averaging Algorithms for Ensemble-Based Unsupervised Discontinuous Constituency Parsing

1 code implementation • 29 Feb 2024 • Behzad Shayegh, Yuqiao Wen, Lili Mou

We address unsupervised discontinuous constituency parsing, where we observe a high variance in the performance of the only previous model in the literature.

All Constituency Parsing

EBBS: An Ensemble with Bi-Level Beam Search for Zero-Shot Machine Translation

no code implementations • 29 Feb 2024 • Yuqiao Wen, Behzad Shayegh, Chenyang Huang, Yanshuai Cao, Lili Mou

The ability of zero-shot translation emerges when we train a multilingual model with certain translation directions; the model can then directly translate in unseen directions.

Machine Translation, Translation

f-Divergence Minimization for Sequence-Level Knowledge Distillation

1 code implementation • 27 Jul 2023 • Yuqiao Wen, Zichao Li, Wenyu Du, Lili Mou

Experiments across four datasets show that our methods outperform existing KD approaches, and that our symmetric distilling losses can better force the student to learn from the teacher distribution.

Knowledge Distillation

An Equal-Size Hard EM Algorithm for Diverse Dialogue Generation

2 code implementations • 29 Sep 2022 • Yuqiao Wen, Yongchang Hao, Yanshuai Cao, Lili Mou

Open-domain dialogue systems aim to interact with humans through natural language texts in an open-ended fashion.

Decoder, Dialogue Generation

An Empirical Study on the Overlapping Problem of Open-Domain Dialogue Datasets

1 code implementation • LREC 2022 • Yuqiao Wen, Guoqing Luo, Lili Mou

Open-domain dialogue systems aim to converse with humans through text, and dialogue research has heavily relied on benchmark datasets.
