27 Oct 2021 • Gaurav Kumar Nayak, Monish Keswani, Sharan Seshadri, Anirban Chakraborty
Knowledge Distillation (KD) utilizes training data as a transfer set to transfer knowledge from a complex network (Teacher) to a smaller network (Student).
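
As a rough illustration of the transfer described above, the following minimal sketch computes a distillation loss for one transfer-set sample. It assumes the commonly used temperature-softened KL-divergence formulation, which the text here does not specify. The function names, the temperature value, and the logits are all hypothetical and chosen only for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=4.0):
    """KL(teacher || student) on softened outputs, scaled by T^2.

    Assumption: this is the widely used soft-target loss, not necessarily
    the exact objective used by the authors.
    """
    p = softmax(teacher_logits, temperature)  # Teacher soft targets
    q = softmax(student_logits, temperature)  # Student soft predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Hypothetical logits for a single transfer-set sample (illustrative only).
teacher_logits = [5.0, 1.0, -2.0]
student_logits = [2.0, 1.5, 0.0]
loss = distillation_loss(teacher_logits, student_logits)
print(f"distillation loss: {loss:.4f}")
```

In practice this loss would be averaged over the transfer set and minimized with respect to the Student's parameters while the Teacher stays fixed. The loss is zero only when the Student's softened outputs match the Teacher's.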