Search Results for author: Wenwen Gong

Found 3 papers, 3 papers with code

GKD: A General Knowledge Distillation Framework for Large-scale Pre-trained Language Model

1 code implementation • 11 Jun 2023 • Shicheng Tan, Weng Lam Tam, Yuanchun Wang, Wenwen Gong, Yang Yang, Hongyin Tang, Keqing He, Jiahao Liu, Jingang Wang, Shu Zhao, Peng Zhang, Jie Tang

Reducing the parameter scale of large-scale pre-trained language models (PLMs) through knowledge distillation has greatly facilitated their deployment on a wide range of devices.

General Knowledge • Knowledge Distillation • +1
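
For context, a minimal sketch of the standard logit-based distillation objective that a general KD framework like GKD builds on. This is the classic soft/hard-label formulation, not the paper's specific method; the temperature T and weighting alpha are illustrative assumptions.

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
        # Soft target: KL divergence between temperature-scaled student and
        # teacher distributions, scaled by T^2 to keep gradients comparable.
        soft = F.kl_div(
            F.log_softmax(student_logits / T, dim=-1),
            F.softmax(teacher_logits / T, dim=-1),
            reduction="batchmean",
        ) * (T * T)
        # Hard target: ordinary cross-entropy against the gold labels.
        hard = F.cross_entropy(student_logits, labels)
        return alpha * soft + (1 - alpha) * hard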

Are Intermediate Layers and Labels Really Necessary? A General Language Model Distillation Method

1 code implementation • 11 Jun 2023 • Shicheng Tan, Weng Lam Tam, Yuanchun Wang, Wenwen Gong, Shu Zhao, Peng Zhang, Jie Tang

To address these problems, we propose a general language model distillation (GLMD) method that performs two-stage word prediction distillation and vocabulary compression, which is simple yet shows surprisingly strong performance.

Knowledge Distillation • Language Modelling
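
A minimal sketch of the word-prediction distillation idea described above: the student learns only from the teacher's output distribution over the vocabulary, with no intermediate-layer losses or gold labels. The keep_ids truncation is a hypothetical simplification standing in for the paper's vocabulary compression, not its actual procedure.

    import torch
    import torch.nn.functional as F

    def word_prediction_distill_loss(student_logits, teacher_logits, keep_ids=None):
        # Optionally restrict both distributions to a reduced vocabulary;
        # keep_ids is a hypothetical stand-in for GLMD's vocabulary compression.
        if keep_ids is not None:
            student_logits = student_logits[..., keep_ids]
            teacher_logits = teacher_logits[..., keep_ids]
        # Match the student's word-prediction distribution to the teacher's,
        # using only output-layer signals (no intermediate layers, no labels).
        return F.kl_div(
            F.log_softmax(student_logits, dim=-1),
            F.softmax(teacher_logits, dim=-1),
            reduction="batchmean",
        )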
