1 code implementation • 14 Dec 2022 • Wenye Lin, Yifeng Ding, Zhixiong Cao, Hai-Tao Zheng
A common practice to address this problem is to introduce a pretrained contrastive teacher model and train the lightweight networks with distillation signals generated by the teacher.
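A minimal sketch of what such a teacher-generated distillation signal could look like. The excerpt does not say what form the signal takes, so everything here is an assumption made for illustration: the teacher and student embeddings are compared against a shared set of anchor embeddings, the resulting similarity scores are turned into distributions with a temperature softmax, and the student is trained to match the teacher's distribution via cross-entropy. The names `similarity_distribution`, `distillation_loss`, `anchors`, and `temperature` are hypothetical and do not come from the paper.

```python
import math


def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def similarity_distribution(embedding, anchors, temperature):
    """Distribution over anchors from dot-product similarity (assumed signal form)."""
    return softmax([dot(embedding, a) for a in anchors], temperature)


def distillation_loss(teacher_emb, student_emb, anchors, temperature=0.1):
    """Cross-entropy of the student's distribution against the teacher's."""
    p = similarity_distribution(teacher_emb, anchors, temperature)
    q = similarity_distribution(student_emb, anchors, temperature)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))


# Hypothetical 2-D embeddings, chosen only to exercise the functions.
anchors = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
teacher = [1.0, 0.0]
aligned = distillation_loss(teacher, [1.0, 0.0], anchors)
misaligned = distillation_loss(teacher, [0.0, 1.0], anchors)
print(aligned < misaligned)
```

In this sketch the teacher's distribution acts as a fixed soft target, so the loss is smallest when the student ranks the anchors the same way the teacher does.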
1 code implementation • 9 Mar 2022 • Wenye Lin, Yangming Li, Lemao Liu, Shuming Shi, Hai-Tao Zheng
Specifically, we transfer knowledge from a teacher model to its student model by locally matching their predictions on all sub-structures, rather than matching them over the whole output space.
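The local matching idea can be sketched as follows, under stated assumptions. The excerpt does not define the sub-structures or the matching criterion, so this example assumes a sequence-labeling setting where each sub-structure is one token position with its own label distribution, and it uses KL divergence from teacher to student as the per-sub-structure criterion. The function names are hypothetical.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same labels."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def local_distillation_loss(teacher_probs, student_probs):
    """Sum of KL(teacher || student) over sub-structures.

    Each argument is a list with one label distribution per sub-structure
    (assumed here to be one per token position).
    """
    if len(teacher_probs) != len(student_probs):
        raise ValueError("teacher and student must cover the same sub-structures")
    return sum(kl_divergence(p, q) for p, q in zip(teacher_probs, student_probs))


# Hypothetical distributions for a 2-token input with 3 labels.
teacher = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
student = [[0.5, 0.3, 0.2], [0.1, 0.8, 0.1]]
loss = local_distillation_loss(teacher, student)
print(loss > 0.0)
```

Under these assumptions, a sequence of n positions with k labels needs only n * k probability comparisons, whereas matching over the whole output space would involve all k^n label sequences.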