Self-Knowledge Distillation
16 papers with code • 0 benchmarks • 0 datasets
Most implemented papers
ProSelfLC: Progressive Self Label Correction for Training Robust Deep Neural Networks
Keywords: entropy minimisation, maximum entropy, confidence penalty, self knowledge distillation, label correction, label noise, semi-supervised learning, output regularisation
Revisiting Knowledge Distillation via Label Smoothing Regularization
Without any extra computation cost, Tf-KD achieves up to 0.65% improvement on ImageNet over well-established baseline models, which is superior to label smoothing regularization.
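As a rough illustration of the teacher-free idea, the PyTorch-style sketch below pairs ordinary cross-entropy with a KL term toward a hand-crafted, label-smoothing-like "virtual teacher". The function name and hyperparameter defaults are illustrative assumptions, not the paper's reference implementation.

```python
import torch
import torch.nn.functional as F

def tf_kd_reg_loss(logits, labels, alpha=0.9, temperature=20.0, correct_prob=0.99):
    """Teacher-free KD regularizer in the spirit of Tf-KD (illustrative sketch).

    A hand-crafted "virtual teacher" puts `correct_prob` mass on the true class
    and spreads the rest uniformly; the student is pulled toward this softened
    distribution in addition to the usual cross-entropy loss.
    """
    num_classes = logits.size(1)
    ce = F.cross_entropy(logits, labels)

    # Build the virtual-teacher distribution (uniform off-target mass).
    teacher = torch.full_like(logits, (1.0 - correct_prob) / (num_classes - 1))
    teacher.scatter_(1, labels.unsqueeze(1), correct_prob)

    # KL(teacher || student) on temperature-softened student logits.
    log_student = F.log_softmax(logits / temperature, dim=1)
    kd = F.kl_div(log_student, teacher, reduction="batchmean") * temperature ** 2

    return (1.0 - alpha) * ce + alpha * kd
```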
Regularizing Class-wise Predictions via Self-knowledge Distillation
Deep neural networks with millions of parameters may suffer from poor generalization due to overfitting.
Self-Knowledge Distillation with Progressive Refinement of Targets
Hence, it can be interpreted within the framework of knowledge distillation, as the student becomes its own teacher.
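A minimal sketch of this "student as its own teacher" scheme, assuming PyTorch and illustrative names (`ps_kd_loss`, `past_probs`, `alpha` are not from the paper's code), mixes the hard label with the model's own earlier prediction to form the training target:

```python
import torch
import torch.nn.functional as F

def ps_kd_loss(logits, hard_labels, past_probs, alpha):
    """Progressive self-KD target mixing (sketch of the general idea).

    `past_probs` are the softmax outputs the same model produced for these
    samples in an earlier epoch; `alpha` (typically annealed upward during
    training) controls how much the model trusts its own past predictions.
    """
    num_classes = logits.size(1)
    one_hot = F.one_hot(hard_labels, num_classes).float()

    # Soft target: convex combination of the ground truth and the model's
    # own earlier prediction -- the student distills from itself.
    soft_target = (1.0 - alpha) * one_hot + alpha * past_probs

    # Cross-entropy against the soft target.
    log_probs = F.log_softmax(logits, dim=1)
    return -(soft_target * log_probs).sum(dim=1).mean()
```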
Noisy Self-Knowledge Distillation for Text Summarization
In this paper we apply self-knowledge distillation to text summarization, which we argue can alleviate problems with maximum-likelihood training on single-reference and noisy datasets.
Even your Teacher Needs Guidance: Ground-Truth Targets Dampen Regularization Imposed by Self-Distillation
Knowledge distillation is classically a procedure where a neural network is trained on the output of another network along with the original targets in order to transfer knowledge between the architectures.
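For reference, a minimal PyTorch-style sketch of this classical objective (temperature and weighting chosen purely for illustration) combines cross-entropy on the hard targets with a temperature-softened KL term toward the teacher's outputs:

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, weight=0.5):
    """Classical knowledge-distillation objective (illustrative sketch).

    The student fits the hard labels and, at a softened temperature, the
    teacher's output distribution; `temperature` and `weight` are typical
    but arbitrary choices here.
    """
    # Standard supervised term on the ground-truth labels.
    ce = F.cross_entropy(student_logits, labels)

    # Soft term: match the teacher's temperature-softened distribution.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=1)
    kd = F.kl_div(log_soft_student, soft_teacher,
                  reduction="batchmean") * temperature ** 2

    return (1.0 - weight) * ce + weight * kd
```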
Refine Myself by Teaching Myself: Feature Refinement via Self-Knowledge Distillation
Knowledge distillation is a method of transferring the knowledge from a pretrained complex teacher model to a student model, so a smaller network can replace a large teacher network at the deployment stage.
Preservation of the Global Knowledge by Not-True Distillation in Federated Learning
In federated learning, a strong global model is collaboratively learned by aggregating clients' locally trained models.
Robust and Accurate Object Detection via Self-Knowledge Distillation
In this paper, we propose Unified Decoupled Feature Alignment (UDFA), a novel fine-tuning paradigm that achieves better performance than existing methods by fully exploring the combination of self-knowledge distillation and adversarial training for object detection.
Sequential Recommendation with Bidirectional Chronological Augmentation of Transformer
Sequential recommendation can capture user chronological preferences from their historical behaviors, yet learning from short sequences (the cold-start problem) in many benchmark datasets remains an open challenge.