Self-Knowledge Distillation

35 papers with code • 0 benchmarks • 0 datasets

Self-knowledge distillation trains a network using soft targets derived from the network itself, for example its own earlier predictions, auxiliary branches, or predictions on other samples of the same class, rather than from a separate pretrained teacher.

Most implemented papers

ProSelfLC: Progressive Self Label Correction for Training Robust Deep Neural Networks

XinshaoAmosWang/ProSelfLC-CVPR2021 CVPR 2021

Keywords: entropy minimisation, maximum entropy, confidence penalty, self knowledge distillation, label correction, label noise, semi-supervised learning, output regularisation
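
A minimal sketch of the progressive label-correction idea: the target gradually shifts from the (possibly noisy) one-hot label toward the model's own prediction as training proceeds. The linear trust schedule and `eps_max` value below are illustrative assumptions; the paper's schedule also accounts for prediction confidence.

```python
import torch
import torch.nn.functional as F

def progressive_corrected_targets(logits, labels, epoch, total_epochs,
                                  num_classes, eps_max=0.5):
    """Blend one-hot labels with the model's own predictions.

    Trust in the model's prediction (eps) ramps up over training, so early
    on the loss is close to standard cross-entropy and later the (possibly
    noisy) labels are progressively corrected. NOTE: the linear ramp is an
    assumption; ProSelfLC's schedule also uses per-example confidence.
    """
    with torch.no_grad():
        probs = F.softmax(logits, dim=1)
        eps = eps_max * (epoch / max(total_epochs, 1))
        onehot = F.one_hot(labels, num_classes).float()
        return (1.0 - eps) * onehot + eps * probs

def soft_cross_entropy(logits, soft_targets):
    # cross-entropy against soft (non one-hot) targets
    return -(soft_targets * F.log_softmax(logits, dim=1)).sum(dim=1).mean()

# toy usage
logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
targets = progressive_corrected_targets(logits, labels, epoch=30,
                                         total_epochs=100, num_classes=10)
loss = soft_cross_entropy(logits, targets)
```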

BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation

flagopen/flagembedding 5 Feb 2024

It can simultaneously perform the three common retrieval functionalities of embedding models: dense retrieval, multi-vector retrieval, and sparse retrieval, providing a unified model foundation for real-world IR applications.
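
The three retrieval modes can be sketched as three scoring functions over a query and a document. The toy code below is a conceptual illustration only; the function names, dimensions, and lexical-weight format are assumptions, not the FlagEmbedding API.

```python
import torch

def dense_score(q_vec, d_vec):
    # single-vector embeddings: cosine similarity between pooled vectors
    return (q_vec @ d_vec) / (q_vec.norm() * d_vec.norm())

def sparse_score(q_weights, d_weights):
    # lexical matching: sum of products of term weights for shared tokens
    return sum(w * d_weights[t] for t, w in q_weights.items() if t in d_weights)

def multi_vector_score(q_tokens, d_tokens):
    # late-interaction matching: each query token keeps its best document match
    sim = q_tokens @ d_tokens.T              # (num_q_tokens, num_d_tokens)
    return sim.max(dim=1).values.mean()

# toy usage with random vectors / weights
q_vec, d_vec = torch.randn(768), torch.randn(768)
q_tok, d_tok = torch.randn(5, 768), torch.randn(40, 768)
q_w, d_w = {"neural": 1.2, "retrieval": 0.8}, {"retrieval": 0.9, "index": 0.4}
print(dense_score(q_vec, d_vec), sparse_score(q_w, d_w),
      multi_vector_score(q_tok, d_tok))
```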

Preservation of the Global Knowledge by Not-True Distillation in Federated Learning

Lee-Gihun/FedNTD 6 Jun 2021

In federated learning, a strong global model is collaboratively learned by aggregating clients' locally trained models.
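
A hedged sketch of the not-true distillation term: the local model is distilled toward the global model only over the classes other than the ground truth, so local training keeps fitting its own labels while the global model's knowledge about the remaining classes is preserved. The temperature and reduction choices are assumptions.

```python
import torch
import torch.nn.functional as F

def not_true_distillation_loss(local_logits, global_logits, labels, tau=1.0):
    """KL divergence between local and global predictions over the
    not-true classes only (sketch in the spirit of FedNTD)."""
    batch, num_classes = local_logits.shape
    not_true = F.one_hot(labels, num_classes) == 0        # mask of non-GT classes
    # keep only the not-true logits, reshaped to (batch, num_classes - 1)
    local_nt = local_logits[not_true].view(batch, num_classes - 1)
    global_nt = global_logits[not_true].view(batch, num_classes - 1)
    log_p_local = F.log_softmax(local_nt / tau, dim=1)
    p_global = F.softmax(global_nt / tau, dim=1)
    return F.kl_div(log_p_local, p_global, reduction='batchmean') * tau ** 2
```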

Revisiting Knowledge Distillation via Label Smoothing Regularization

yuanli2333/Teacher-free-Knowledge-Distillation CVPR 2020

Without any extra computation cost, Tf-KD achieves up to 0.65% improvement on ImageNet over well-established baseline models, which is superior to label smoothing regularization.
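
One of the paper's teacher-free variants replaces the real teacher with a hand-designed "virtual" distribution; the sketch below illustrates that idea. The correct-class probability `a` and temperature `tau` are illustrative values, and the extra temperature scaling the paper applies to the virtual teacher is omitted.

```python
import torch
import torch.nn.functional as F

def virtual_teacher_kd_loss(logits, labels, a=0.9, tau=20.0):
    """Teacher-free KD with a hand-crafted virtual teacher that puts
    probability `a` on the correct class and spreads the remainder
    uniformly, acting like a stronger label smoothing regularizer."""
    num_classes = logits.size(1)
    uniform = (1.0 - a) / (num_classes - 1)
    teacher = torch.full_like(logits, uniform)
    teacher.scatter_(1, labels.unsqueeze(1), a)          # set GT class to `a`
    log_p_student = F.log_softmax(logits / tau, dim=1)
    return F.kl_div(log_p_student, teacher, reduction='batchmean') * tau ** 2
```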

FedSOL: Stabilized Orthogonal Learning with Proximal Restrictions in Federated Learning

Lee-Gihun/FedSOL CVPR 2024

FedSOL is designed to identify gradients of local objectives that are inherently orthogonal to directions affecting the proximal objective.
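
The orthogonality idea can be illustrated with an explicit gradient projection, as sketched below; note that FedSOL itself realizes it through a perturbed, sharpness-aware style weight update rather than this literal projection, so this is only an illustration of the concept.

```python
import torch

def project_orthogonal(local_grads, prox_grads, eps=1e-12):
    """Remove from the local-objective gradient its component along the
    proximal-objective gradient, leaving a direction that does not
    affect the proximal objective to first order (illustration only)."""
    flat_l = torch.cat([g.flatten() for g in local_grads])
    flat_p = torch.cat([g.flatten() for g in prox_grads])
    coeff = (flat_l @ flat_p) / (flat_p @ flat_p + eps)
    flat_orth = flat_l - coeff * flat_p
    # reshape back to the original per-parameter shapes
    out, idx = [], 0
    for g in local_grads:
        n = g.numel()
        out.append(flat_orth[idx:idx + n].view_as(g))
        idx += n
    return out
```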

Regularizing Class-wise Predictions via Self-knowledge Distillation

alinlab/cs-kd CVPR 2020

Deep neural networks with millions of parameters may suffer from poor generalization due to overfitting.
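
A minimal sketch of the class-wise regularizer: the (detached) prediction for one sample of a class serves as a soft target for another sample of the same class, penalizing over-confident, sample-specific predictions. The temperature is an assumed value.

```python
import torch
import torch.nn.functional as F

def class_wise_self_kd_loss(logits_a, logits_b, tau=4.0):
    """Class-wise self-knowledge distillation term.

    `logits_a` and `logits_b` come from two different samples of the same
    class; the first sample's detached prediction is the soft target for
    the second, matching predictive distributions within each class."""
    log_p_b = F.log_softmax(logits_b / tau, dim=1)
    p_a = F.softmax(logits_a.detach() / tau, dim=1)
    return F.kl_div(log_p_b, p_a, reduction='batchmean') * tau ** 2
```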

Self-Knowledge Distillation with Progressive Refinement of Targets

lgcnsai/ps-kd-pytorch ICCV 2021

Hence, it can be interpreted within the framework of knowledge distillation, with the student becoming its own teacher.
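
A sketch of progressive self-knowledge distillation: soft targets blend the hard labels with the model's own predictions cached from the previous epoch, with the blending weight growing over training. The cache layout and toy shapes below are implementation assumptions.

```python
import torch
import torch.nn.functional as F

# cache of each example's softmax output from the previous epoch,
# indexed by dataset position (toy sizes: 1000 examples, 10 classes)
past_preds = torch.full((1000, 10), 1.0 / 10)

def pskd_targets(indices, labels, epoch, total_epochs,
                 alpha_T=0.8, num_classes=10):
    """Soft targets that mix hard labels with the model's own past
    predictions; the weight grows linearly toward alpha_T over training."""
    alpha_t = alpha_T * (epoch / max(total_epochs, 1))
    onehot = F.one_hot(labels, num_classes).float()
    return (1.0 - alpha_t) * onehot + alpha_t * past_preds[indices]

def update_prediction_cache(indices, logits):
    # store this epoch's predictions for use as targets in the next epoch
    past_preds[indices] = F.softmax(logits.detach(), dim=1)
```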

Noisy Self-Knowledge Distillation for Text Summarization

nlpyang/NoisySumm NAACL 2021

In this paper, we apply self-knowledge distillation to text summarization, which we argue can alleviate problems with maximum-likelihood training on single-reference and noisy datasets.

Even your Teacher Needs Guidance: Ground-Truth Targets Dampen Regularization Imposed by Self-Distillation

Kennethborup/self_distillation NeurIPS 2021

Knowledge distillation is classically a procedure where a neural network is trained on the output of another network along with the original targets in order to transfer knowledge between the architectures.
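
The classic objective described above can be written as a weighted sum of cross-entropy on the ground-truth targets and a KL term toward the softened teacher outputs; `alpha` and `tau` below are illustrative values.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, alpha=0.5, tau=4.0):
    """Classic distillation objective: cross-entropy on the original targets
    plus KL divergence to the teacher's temperature-softened outputs. The
    paper studies how the ground-truth term dampens the regularization that
    repeated self-distillation imposes."""
    ce = F.cross_entropy(student_logits, labels)
    log_p_s = F.log_softmax(student_logits / tau, dim=1)
    p_t = F.softmax(teacher_logits / tau, dim=1)
    kl = F.kl_div(log_p_s, p_t, reduction='batchmean') * tau ** 2
    return alpha * ce + (1.0 - alpha) * kl
```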

Refine Myself by Teaching Myself: Feature Refinement via Self-Knowledge Distillation

MingiJi/FRSKD CVPR 2021

Knowledge distillation is a method of transferring the knowledge from a pretrained complex teacher model to a student model, so a smaller network can replace a large teacher network at the deployment stage.
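
A hedged sketch of the feature-level term: refined feature maps from the auxiliary self-teacher branch supervise the backbone's own features. Plain L2 matching is assumed here; FRSKD combines this with soft-label distillation.

```python
import torch
import torch.nn.functional as F

def feature_self_distillation_loss(student_feats, refined_feats):
    """Match the backbone's feature maps to the refined feature maps
    produced by the auxiliary self-teacher branch (L2 matching assumed;
    both arguments are lists of same-shaped feature tensors)."""
    loss = 0.0
    for f_s, f_t in zip(student_feats, refined_feats):
        loss = loss + F.mse_loss(f_s, f_t.detach())
    return loss / len(student_feats)
```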