Search Results for author: Zepeng Lin

Found 1 paper, 0 papers with code

Exploring and Enhancing the Transfer of Distribution in Knowledge Distillation for Autoregressive Language Models

no code implementations · 19 Sep 2024 · Jun Rao, Xuebo Liu, Zepeng Lin, Liang Ding, Jing Li, DaCheng Tao, Min Zhang

Knowledge distillation (KD) compresses large teacher models by training smaller student models to mimic their outputs.
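For context, below is a minimal sketch of the classic soft-label distillation objective that this general description refers to (the standard KD recipe, not this paper's specific distribution-transfer method); the temperature value and tensor shapes are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-label distillation: KL divergence between the
    temperature-softened teacher and student output distributions."""
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    # Scale by T^2 so gradient magnitudes stay comparable across temperatures.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2

# Hypothetical example: a batch of 8 next-token predictions over a 32000-token vocabulary.
student_logits = torch.randn(8, 32000)
teacher_logits = torch.randn(8, 32000)
loss = kd_loss(student_logits, teacher_logits)
```

In practice this soft-label term is usually combined with the ordinary cross-entropy loss on the ground-truth tokens, with a weighting coefficient between the two.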

Knowledge Distillation
