Most teacher-student frameworks based on knowledge distillation (KD) rely on a strong congruence constraint at the instance level. However, they usually ignore the correlation between multiple instances, which is also valuable for knowledge transfer. In this work, we propose a new framework, correlation congruence for knowledge distillation (CCKD), which transfers not only instance-level information but also the correlation between instances.
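As a rough illustration (not the paper's exact formulation), the correlation-congruence idea can be sketched as follows: compute a pairwise correlation matrix over the mini-batch embeddings of both teacher and student, then penalize the discrepancy between the two matrices. The cosine-similarity correlation and the function names below are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def correlation_matrix(features: torch.Tensor) -> torch.Tensor:
    """Pairwise correlation (here: cosine similarity) between instances.

    `features` has shape (batch, dim); the result has shape (batch, batch).
    """
    normed = F.normalize(features, p=2, dim=1)
    return normed @ normed.t()

def correlation_congruence_loss(student_feat: torch.Tensor,
                                teacher_feat: torch.Tensor) -> torch.Tensor:
    """Mean squared difference between student and teacher correlation matrices."""
    c_s = correlation_matrix(student_feat)
    c_t = correlation_matrix(teacher_feat)
    return F.mse_loss(c_s, c_t)
```

In practice this term would be added to the usual instance-level KD loss with a balancing weight.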
Furthermore, a generalized kernel method based on a Taylor series expansion is proposed to better capture the correlation between instances.
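A minimal sketch of such a kernel-based correlation is given below, assuming L2-normalized embeddings and a Gaussian RBF kernel whose exponential is truncated to a low-order Taylor series; the hyperparameter values and the function name are assumptions for illustration, not the paper's exact settings.

```python
import math
import torch
import torch.nn.functional as F

def taylor_rbf_correlation(features: torch.Tensor,
                           gamma: float = 0.4,
                           order: int = 2) -> torch.Tensor:
    """Approximate a Gaussian RBF correlation matrix via a truncated Taylor series.

    For unit-norm features, ||x - y||^2 = 2 - 2<x, y>, so
    k(x, y) = exp(-gamma * ||x - y||^2) = exp(-2*gamma) * exp(2*gamma * <x, y>),
    and exp(2*gamma * <x, y>) is expanded as sum_p (2*gamma)^p / p! * <x, y>^p.
    """
    normed = F.normalize(features, p=2, dim=1)
    inner = normed @ normed.t()  # pairwise inner products, shape (batch, batch)
    corr = torch.zeros_like(inner)
    for p in range(order + 1):
        corr = corr + (2 * gamma) ** p / math.factorial(p) * inner ** p
    return math.exp(-2 * gamma) * corr
```

The truncation keeps the correlation function cheap to evaluate on a whole batch while still capturing higher-order interactions than a plain inner product.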
Empirical experiments and ablation studies on image classification tasks (CIFAR-100, ImageNet-1K) and metric learning tasks (person re-identification and face recognition) show that the proposed CCKD substantially outperforms the original KD and achieves state-of-the-art accuracy compared with other KD-based methods. CCKD can be easily integrated into most teacher-student frameworks, such as KD and hint-based learning methods.