no code implementations • 23 Nov 2023 • Seonghak Kim, Gyeongdo Ham, Yucheol Cho, Daeshik Kim
Knowledge distillation (KD) improves the performance of efficient, lightweight models (i.e., the student model) by transferring knowledge from more complex models (i.e., the teacher model).
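As a point of reference, a minimal sketch of the classic KD objective (Hinton et al., 2015) is shown below: a temperature-softened KL term between teacher and student logits combined with ordinary cross-entropy on the labels. This is the standard formulation, not necessarily this paper's specific method; the function name `kd_loss` and the values of `T` and `alpha` are illustrative.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Standard KD objective: weighted sum of a soft distillation term
    and the hard-label cross-entropy (T and alpha are illustrative)."""
    # Soften both distributions with temperature T.
    soft_student = F.log_softmax(student_logits / T, dim=1)
    soft_teacher = F.softmax(teacher_logits / T, dim=1)
    # T^2 keeps the soft term's gradient magnitude comparable across T.
    distill = F.kl_div(soft_student, soft_teacher,
                       reduction="batchmean") * T * T
    # Hard-label supervision on the student's unsoftened logits.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * distill + (1 - alpha) * hard

# Usage: distill a 10-class teacher into a student on a random batch.
student_logits = torch.randn(8, 10, requires_grad=True)
teacher_logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
loss = kd_loss(student_logits, teacher_logits, labels)
loss.backward()
```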