Reverse Knowledge Distillation: Training a Large Model using a Small One for Retinal Image Matching on Limited Data

20 Jul 2023  ·  Sahar Almahfouz Nasser, Nihar Gupte, Amit Sethi

Retinal image matching plays a crucial role in monitoring disease progression and treatment response. However, datasets with matched keypoints between temporally separated pairs of images are not available in abundance to train transformer-based models. We propose a novel approach based on reverse knowledge distillation to train large models with limited data while preventing overfitting. First, we propose architectural modifications to a CNN-based semi-supervised method called SuperRetina that improve its results on a publicly available dataset. Then, we train a computationally heavier model based on a vision transformer encoder using the lighter CNN-based model, which is counter-intuitive in knowledge-distillation research, where training lighter models from heavier ones is the norm. Surprisingly, such reverse knowledge distillation improves generalization even further. Our experiments suggest that fitting high-dimensional representations, rather than training directly to match the final output, may prevent overfitting. We also provide a public dataset with annotations for retinal image keypoint detection and matching to help the research community develop algorithms for retinal image applications.
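The sketch below illustrates the reverse knowledge-distillation idea from the abstract: a lighter, pretrained CNN teacher (e.g., a SuperRetina-style detector) supervises a heavier ViT-based student by matching features in representation space rather than final outputs. It is a minimal illustration under stated assumptions, not the authors' implementation; the module names, feature dimensions, and the MSE feature loss are all assumptions.

```python
# Minimal sketch of reverse knowledge distillation via representation matching.
# Assumptions (not from the paper): module names, dimensions, and the MSE loss.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ViTStudentEncoder(nn.Module):
    """Hypothetical transformer-based encoder returning a dense feature map."""

    def __init__(self, embed_dim: int = 256, patch: int = 16):
        super().__init__()
        self.patch_embed = nn.Conv2d(3, embed_dim, kernel_size=patch, stride=patch)
        layer = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)

    def forward(self, x):
        tokens = self.patch_embed(x)                    # (B, C, H/16, W/16)
        b, c, h, w = tokens.shape
        seq = tokens.flatten(2).transpose(1, 2)         # (B, H*W, C) token sequence
        seq = self.encoder(seq)
        return seq.transpose(1, 2).reshape(b, c, h, w)  # back to a feature map


class RepresentationDistiller(nn.Module):
    """Projects student features to the teacher's width and compares them in feature space."""

    def __init__(self, student_dim: int = 256, teacher_dim: int = 128):
        super().__init__()
        self.proj = nn.Conv2d(student_dim, teacher_dim, kernel_size=1)

    def forward(self, student_feats, teacher_feats):
        s = self.proj(student_feats)
        # Align spatial resolution before the representation-space loss.
        t = F.interpolate(teacher_feats, size=s.shape[-2:], mode="bilinear",
                          align_corners=False)
        return F.mse_loss(s, t)


# Usage sketch: the lighter CNN teacher is assumed pretrained and kept frozen.
# student, distiller = ViTStudentEncoder(), RepresentationDistiller()
# for images in loader:
#     with torch.no_grad():
#         t_feats = cnn_teacher(images)   # teacher feature map (assumed interface)
#     s_feats = student(images)           # heavier ViT-based student features
#     loss = distiller(s_feats, t_feats)
#     loss.backward(); optimizer.step(); optimizer.zero_grad()
```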


Datasets


Introduced in the Paper:

MeDAL Retina Dataset

Used in the Paper:

FIRE

Results from the Paper


Task                Dataset  Model     Metric  Value  Global Rank
Image Registration  FIRE     LKRetina  mAUC    0.761  #1
