Self-Knowledge Distillation
24 papers with code • 0 benchmarks • 0 datasets
Most implemented papers
MixSKD: Self-Knowledge Distillation from Mixup for Image Recognition
MixSKD mutually distills feature maps and probability distributions between a random pair of original images and their mixup image.
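The core idea can be illustrated with a minimal NumPy sketch: the model's prediction on a mixup image is encouraged to agree with the mixup of its predictions on the two original images, via a symmetric KL loss. The random linear "model", batch shapes, and loss weighting below are made up for illustration and are not MixSKD's actual architecture or full objective (which also distills feature maps).

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_div(p, q, eps=1e-12):
    # KL(p || q), averaged over the batch.
    return np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1))

# Stand-in "model": a fixed random linear classifier over flattened images.
W = rng.normal(size=(3 * 8 * 8, 10))

def model_logits(x):
    return x.reshape(x.shape[0], -1) @ W

# Two random mini-batches of 8x8 RGB images and a mixup coefficient.
x1 = rng.normal(size=(4, 3, 8, 8))
x2 = rng.normal(size=(4, 3, 8, 8))
lam = rng.beta(1.0, 1.0)

x_mix = lam * x1 + (1 - lam) * x2        # mixup in image space
p_mix = softmax(model_logits(x_mix))     # prediction on the mixup image
# Mixup of the predictions on the two original images.
p_interp = lam * softmax(model_logits(x1)) + (1 - lam) * softmax(model_logits(x2))

# Mutual (symmetric) distillation between the two probability distributions.
loss = 0.5 * (kl_div(p_mix, p_interp) + kl_div(p_interp, p_mix))
print(float(loss))
```

Because both branches come from the same network, the scheme needs no separate teacher: each view supervises the other.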
Multimodality Multi-Lead ECG Arrhythmia Classification using Self-Supervised Learning
The electrocardiogram (ECG) signal is one of the most effective sources of information for diagnosing and predicting cardiovascular diseases (CVDs) associated with abnormalities in heart rhythm.
Graph-based Knowledge Distillation: A survey and experimental evaluation
It then provides a comprehensive summary of three types of Graph-based Knowledge Distillation methods, namely Graph-based Knowledge Distillation for deep neural networks (DKD), Graph-based Knowledge Distillation for GNNs (GKD), and Self-Knowledge Distillation based Graph-based Knowledge Distillation (SKD).
DualFair: Fair Representation Learning at Both Group and Individual Levels via Contrastive Self-supervision
Algorithmic fairness has become an important machine learning problem, especially for mission-critical Web applications.
From Knowledge Distillation to Self-Knowledge Distillation: A Unified Approach with Normalized Loss and Customized Soft Labels
We decompose the KD loss and find that its non-target component forces the student's non-target logits to match the teacher's; however, the student's and teacher's non-target logits sum to different values, so they can never be identical.
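A minimal NumPy sketch of the mismatch the snippet describes: after softmax, the non-target probability masses of teacher and student differ (each is one minus its own target probability), so the raw non-target distributions cannot match exactly; renormalizing each to sum to one removes that obstacle. The 5-class logits below are made up for illustration, and the actual paper's normalized loss also includes a target term and temperature scaling.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical teacher/student logits for a 5-class problem; class 0 is the target.
teacher_logits = np.array([4.0, 1.0, 0.5, 0.2, -1.0])
student_logits = np.array([2.0, 1.2, 0.3, 0.1, -0.5])
target = 0

p_t = softmax(teacher_logits)
p_s = softmax(student_logits)

# Raw non-target probabilities: their total masses differ, so exact matching
# is impossible no matter how the student's non-target logits are adjusted.
nt_t = np.delete(p_t, target)
nt_s = np.delete(p_s, target)
print(nt_t.sum(), nt_s.sum())

# Normalizing each non-target distribution to sum to 1 removes the obstacle:
# a KD loss on the normalized distributions can now be driven to zero.
nt_t_norm = nt_t / nt_t.sum()
nt_s_norm = nt_s / nt_s.sum()
loss_nt = np.sum(nt_t_norm * (np.log(nt_t_norm) - np.log(nt_s_norm)))
print(float(loss_nt))
```

The same normalization applies when the "teacher" is the model itself, which is what makes the unified knowledge/self-knowledge distillation view possible.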
Lightweight Self-Knowledge Distillation with Multi-source Information Fusion
Specifically, we introduce a Distillation with Reverse Guidance (DRG) method that considers the different levels of information extracted by the model, including the edges, shapes, and details of the input data, to construct a more informative teacher.
Incorporating Graph Information in Transformer-based AMR Parsing
Abstract Meaning Representation (AMR) is a semantic parsing formalism that aims to provide a semantic graph abstraction representing a given text.
Robust Spatiotemporal Traffic Forecasting with Reinforced Dynamic Adversarial Training
Improving the adversarial robustness of these models is therefore crucial for intelligent transportation systems (ITS).
Effective Whole-body Pose Estimation with Two-stages Distillation
Unlike previous self-knowledge distillation methods, this stage fine-tunes the student's head as a plug-and-play training strategy using only 20% of the training time.
Promoting Generalized Cross-lingual Question Answering in Few-resource Scenarios via Self-knowledge Distillation
Our approach seeks to enhance cross-lingual QA transfer using a high-performing multilingual model trained on a large-scale dataset, complemented by a few thousand aligned QA examples across languages.