Search Results for author: Zha Sheng

Found 1 paper, 0 papers with code

Distiller: A Systematic Study of Model Distillation Methods in Natural Language Processing

no code implementations • EMNLP (sustainlp) 2021 • Haoyu He, Xingjian Shi, Jonas Mueller, Zha Sheng, Mu Li, George Karypis

We aim to identify how different components in the knowledge distillation (KD) pipeline, such as the data augmentation policy, the loss function, and the intermediate representation for transferring knowledge between teacher and student, affect the resulting performance, and how much the optimal KD pipeline varies across datasets and tasks.

Data Augmentation • Hyperparameter Optimization

