Self-Knowledge Distillation

24 papers with code • 0 benchmarks • 0 datasets

Self-knowledge distillation trains a network using supervisory signals distilled from the network itself, for example from its own predictions on augmented views, auxiliary branches, or earlier training snapshots, instead of from a separate, larger teacher model.

Most implemented papers

MixSKD: Self-Knowledge Distillation from Mixup for Image Recognition

winycg/self-kd-lib 11 Aug 2022

MixSKD mutually distills feature maps and probability distributions between random pairs of original images and their mixup counterparts, so the network serves as its own teacher.
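A minimal PyTorch sketch of the probability-distribution branch of this idea (the feature-map branch is omitted); names such as `mixskd_style_loss` and the defaults `alpha`, `tau` are assumptions for illustration, not the authors' implementation from winycg/self-kd-lib:

```python
import torch
import torch.nn.functional as F

def mixskd_style_loss(model, x1, x2, alpha=0.2, tau=3.0):
    # Mix a random pair of images with a Beta-sampled coefficient.
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    x_mix = lam * x1 + (1 - lam) * x2

    logits1, logits2 = model(x1), model(x2)
    logits_mix = model(x_mix)

    # The mixture of the predictions on the two original images and the
    # prediction on the mixed image act as soft teachers for each other.
    q_mix = lam * F.softmax(logits1 / tau, dim=1) \
          + (1 - lam) * F.softmax(logits2 / tau, dim=1)
    p_mix = F.softmax(logits_mix / tau, dim=1)

    # Mutual distillation: each side is detached when it plays teacher.
    loss_a = F.kl_div(torch.log(p_mix + 1e-8), q_mix.detach(), reduction="batchmean")
    loss_b = F.kl_div(torch.log(q_mix + 1e-8), p_mix.detach(), reduction="batchmean")
    return (tau ** 2) * (loss_a + loss_b)
```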

Multimodality Multi-Lead ECG Arrhythmia Classification using Self-Supervised Learning

uark-aicv/ecg_ssl_12lead 30 Sep 2022

The electrocardiogram (ECG) signal is one of the most effective sources of information for diagnosing and predicting cardiovascular diseases (CVDs) associated with abnormalities in heart rhythm.

Graph-based Knowledge Distillation: A survey and experimental evaluation

liujing1023/graph-based-knowledge-distillation 27 Feb 2023

The survey then provides a comprehensive summary of three types of graph-based knowledge distillation methods: Graph-based Knowledge Distillation for deep neural networks (DKD), Graph-based Knowledge Distillation for GNNs (GKD), and Self-Knowledge Distillation based Graph-based Knowledge Distillation (SKD).

DualFair: Fair Representation Learning at Both Group and Individual Levels via Contrastive Self-supervision

sungwon-han/dualfair 15 Mar 2023

Algorithmic fairness has become an important machine learning problem, especially for mission-critical Web applications.

From Knowledge Distillation to Self-Knowledge Distillation: A Unified Approach with Normalized Loss and Customized Soft Labels

yzd-v/cls_KD ICCV 2023

We decompose the KD loss and find that its non-target part forces the student's non-target predictions to match the teacher's; however, the two sets of non-target probabilities sum to different values, so they can never be identical.
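A minimal PyTorch sketch of the normalization step this observation motivates (the function name and the defaults are assumptions for illustration, not the loss from yzd-v/cls_KD): the target class is masked out and both non-target distributions are renormalized to sum to one before being matched.

```python
import torch
import torch.nn.functional as F

def normalized_nontarget_loss(logits_s, logits_t, target, tau=1.0):
    p_s = F.softmax(logits_s / tau, dim=1)
    p_t = F.softmax(logits_t.detach() / tau, dim=1)  # teacher provides targets only

    # Zero out the target class, then renormalize each side so the
    # non-target probabilities sum to 1 and *can* match exactly.
    idx = target.unsqueeze(1)
    mask = torch.ones_like(p_s).scatter_(1, idx, 0.0)
    ns = p_s * mask
    ns = ns / ns.sum(dim=1, keepdim=True)
    nt = p_t * mask
    nt = nt / nt.sum(dim=1, keepdim=True)

    # Cross-entropy between the two normalized non-target distributions.
    return -(nt * torch.log(ns + 1e-8)).sum(dim=1).mean()
```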

Lightweight Self-Knowledge Distillation with Multi-source Information Fusion

xucong-parsifal/lightskd 16 May 2023

We introduce a Distillation with Reverse Guidance (DRG) method that exploits the different levels of information the model extracts, including the edges, shapes, and details of the input data, to construct a more informative teacher.

Incorporating Graph Information in Transformer-based AMR Parsing

sapienzanlp/leakdistill 23 Jun 2023

Abstract Meaning Representation (AMR) is a semantic parsing formalism that aims to provide a semantic graph abstraction of a given text.

Robust Spatiotemporal Traffic Forecasting with Reinforced Dynamic Adversarial Training

usail-hkust/rdat 25 Jun 2023

Improving the adversarial robustness of spatiotemporal traffic forecasting models is therefore crucial for intelligent transportation systems (ITS).

Effective Whole-body Pose Estimation with Two-stages Distillation

idea-research/dwpose 29 Jul 2023

Unlike previous self-knowledge distillation approaches, this second stage fine-tunes the student's head as a plug-and-play training strategy, using only 20% of the original training time.
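As a rough illustration of such a head-only self-distillation stage (assuming a PyTorch model exposing `backbone` and `head` submodules, an optimizer built over the head's parameters, and heatmap outputs; a sketch of the general pattern, not the idea-research/dwpose code): the trained student is frozen as its own teacher and only the head keeps learning.

```python
import copy
import torch
import torch.nn.functional as F

def head_self_distill(student, loader, optimizer, steps=1000):
    teacher = copy.deepcopy(student).eval()    # frozen snapshot of the student
    for p in teacher.parameters():
        p.requires_grad_(False)
    for p in student.backbone.parameters():    # backbone stays fixed;
        p.requires_grad_(False)                # only the head is fine-tuned

    for _, images in zip(range(steps), loader):
        with torch.no_grad():
            t_out = teacher(images)            # teacher's keypoint heatmaps
        s_out = student.head(student.backbone(images))
        loss = F.mse_loss(s_out, t_out)        # match the frozen teacher
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```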

Promoting Generalized Cross-lingual Question Answering in Few-resource Scenarios via Self-knowledge Distillation

ccasimiro88/self-distillation-gxlt-qa 29 Sep 2023

Our approach seeks to enhance cross-lingual QA transfer using a high-performing multilingual model trained on a large-scale dataset, complemented by a few thousand aligned QA examples across languages.