3 Dec 2018 • Wei-Chun Chen, Chia-Che Chang, Chien-Yu Lu, Che-Rung Lee
One promising method is knowledge distillation (KD), which trains a compact, fast-to-execute student model to mimic the outputs of a large teacher network.
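The core of knowledge distillation can be sketched as a combined loss: a "soft" term that pushes the student's temperature-softened output distribution toward the teacher's, plus a "hard" term of ordinary cross-entropy with the ground-truth labels. A minimal NumPy sketch of this standard formulation (the function names, temperature `T`, and mixing weight `alpha` are illustrative, not from this paper):

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax; higher T yields a softer distribution.
    z = np.asarray(z, dtype=float) / T
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    # Soft term: cross-entropy between the teacher's and the student's
    # softened distributions, scaled by T^2 to keep gradients comparable.
    p_teacher = softmax(teacher_logits, T)
    log_p_student = np.log(softmax(student_logits, T))
    soft = -np.mean(np.sum(p_teacher * log_p_student, axis=-1)) * (T * T)
    # Hard term: standard cross-entropy with the ground-truth labels.
    log_p = np.log(softmax(student_logits))
    hard = -np.mean(log_p[np.arange(len(labels)), np.asarray(labels)])
    return alpha * soft + (1.0 - alpha) * hard

student = np.array([[2.0, 0.5, 0.1]])
teacher = np.array([[3.0, 0.2, 0.0]])
loss = distillation_loss(student, teacher, labels=[0])
```

In practice the teacher is frozen and only the student's parameters receive gradients from this loss; the temperature and mixing weight are tuned per task.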