no code implementations • 22 Sep 2023 • Abhishek Singh Sambyal, Usma Niyaz, Narayanan C. Krishnan, Deepti R. Bathula
We considered fully supervised training, which is the prevailing approach in the community, as well as a rotation-based self-supervised method with and without transfer learning, across various datasets and architecture sizes.
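As a rough illustration of the rotation-based self-supervised setup mentioned above, the following sketch pretrains a backbone to predict which of four rotations (0/90/180/270 degrees) was applied to an image. The `encoder`, `feat_dim`, and head design are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_rotation_batch(images):
    """Create four rotated copies of each image plus matching rotation labels (0/90/180/270)."""
    rotated = torch.cat([torch.rot90(images, k, dims=(2, 3)) for k in range(4)], dim=0)
    labels = torch.arange(4).repeat_interleave(images.size(0)).to(images.device)
    return rotated, labels

class RotationPretrainer(nn.Module):
    def __init__(self, encoder, feat_dim):
        super().__init__()
        self.encoder = encoder                  # any backbone returning (B, feat_dim) features
        self.rot_head = nn.Linear(feat_dim, 4)  # 4-way rotation classifier

    def forward(self, images):
        rotated, labels = make_rotation_batch(images)
        logits = self.rot_head(self.encoder(rotated))
        return F.cross_entropy(logits, labels)  # self-supervised pretext loss
```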
no code implementations • 6 Dec 2022 • Usma Niyaz, Abhishek Singh Sambyal, Deepti R. Bathula
These experimental results demonstrate that knowledge diversification in a combined KD and ML framework outperforms conventional KD or ML techniques (with similar network configurations) that use only predictions, with an average improvement of 2%.
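One way to read "knowledge diversification" is that the networks exchange more than just softened predictions, e.g. intermediate features as well. The sketch below combines a mutual-learning KL term between two peers with a feature-matching term; the interface (`logits`, `feat`), loss weights, and temperature are assumptions for illustration, not the paper's exact formulation.

```python
import torch.nn.functional as F

def diversified_mutual_loss(logits_a, logits_b, feat_a, feat_b, labels,
                            T=4.0, alpha=0.5, beta=0.1):
    # Supervised loss for each peer network.
    ce = F.cross_entropy(logits_a, labels) + F.cross_entropy(logits_b, labels)
    # Prediction-level mutual learning: each peer mimics the other's softened output.
    kl_ab = F.kl_div(F.log_softmax(logits_a / T, dim=1),
                     F.softmax(logits_b.detach() / T, dim=1),
                     reduction="batchmean") * T * T
    kl_ba = F.kl_div(F.log_softmax(logits_b / T, dim=1),
                     F.softmax(logits_a.detach() / T, dim=1),
                     reduction="batchmean") * T * T
    # Feature-level term as one example of diversifying the exchanged knowledge.
    feat = F.mse_loss(feat_a, feat_b)
    return ce + alpha * (kl_ab + kl_ba) + beta * feat
```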
no code implementations • 21 Oct 2021 • Usma Niyaz, Deepti R. Bathula
Knowledge distillation (KD) is an effective model compression technique where a compact student network is taught to mimic the behavior of a complex and highly trained teacher network.
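For reference, a minimal sketch of the standard KD objective: the student matches the teacher's temperature-softened predictions alongside the usual supervised loss. The weight `alpha` and temperature `T` are illustrative choices, not values taken from the paper.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    # Teacher targets are detached: the teacher is fixed during distillation.
    soft_targets = F.softmax(teacher_logits.detach() / T, dim=1)
    soft_student = F.log_softmax(student_logits / T, dim=1)
    distill = F.kl_div(soft_student, soft_targets, reduction="batchmean") * T * T
    hard = F.cross_entropy(student_logits, labels)
    return alpha * distill + (1.0 - alpha) * hard
```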