Search Results for author: Nakyil Kim

Found 3 papers, 1 paper with code

Comparing Kullback-Leibler Divergence and Mean Squared Error Loss in Knowledge Distillation

1 code implementation · 19 May 2021 · Taehyeon Kim, Jaehoon Oh, Nakyil Kim, Sangwook Cho, Se-Young Yun

From this observation, we consider an intuitive KD loss function, the mean squared error (MSE) between the logit vectors, so that the student model can directly learn the logit of the teacher model.
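The following is a minimal sketch (not the authors' released implementation) of the two losses this paper compares: the standard temperature-scaled KL-divergence KD loss and the direct MSE between the student's and teacher's logit vectors. The function names, temperature value, and toy tensors are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def kd_kl_loss(student_logits, teacher_logits, tau=4.0):
    """Hinton-style KD: KL divergence between temperature-softened distributions."""
    log_p_student = F.log_softmax(student_logits / tau, dim=1)
    p_teacher = F.softmax(teacher_logits / tau, dim=1)
    # tau**2 keeps gradient magnitudes comparable across temperatures
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * tau ** 2

def kd_mse_loss(student_logits, teacher_logits):
    """Logit matching: the student directly regresses the teacher's raw logits."""
    return F.mse_loss(student_logits, teacher_logits)

# Toy usage with random logits for a batch of 8 examples and 10 classes.
student = torch.randn(8, 10, requires_grad=True)
teacher = torch.randn(8, 10)
print(kd_kl_loss(student, teacher).item(), kd_mse_loss(student, teacher).item())
```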

Knowledge Distillation · Learning with noisy labels

Understanding Knowledge Distillation

no code implementations · 1 Jan 2021 · Taehyeon Kim, Jaehoon Oh, Nakyil Kim, Sangwook Cho, Se-Young Yun

To verify this conjecture, we test an extreme logit learning model, where the KD is implemented with Mean Squared Error (MSE) between the student's logit and the teacher's logit.

Knowledge Distillation

Adaptive Local Bayesian Optimization Over Multiple Discrete Variables

no code implementations · 7 Dec 2020 · Taehyeon Kim, Jaeyeon Ahn, Nakyil Kim, Seyoung Yun

In machine learning algorithms, the choice of hyperparameters is often more of an art than a science, requiring labor-intensive search guided by expert experience.
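Below is a minimal sketch of Bayesian optimization over a small discrete hyperparameter grid, using a Gaussian-process surrogate and an expected-improvement acquisition. It illustrates the general idea only; it is not the adaptive local method proposed in the paper, and the objective function and hyperparameter grid are made up for the example.

```python
import itertools
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

# Hypothetical discrete hyperparameters: batch size and number of layers.
grid = np.array(list(itertools.product([16, 32, 64, 128], [1, 2, 3, 4])), dtype=float)

def objective(x):
    # Stand-in for an expensive validation-loss evaluation.
    batch, layers = x
    return (np.log2(batch) - 5.0) ** 2 + (layers - 3.0) ** 2

rng = np.random.default_rng(0)
idx = list(rng.choice(len(grid), size=3, replace=False))      # random initial design
X, y = grid[idx], np.array([objective(x) for x in grid[idx]])

for _ in range(10):                                           # BO iterations
    gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)
    mu, std = gp.predict(grid, return_std=True)
    best = y.min()
    z = (best - mu) / np.maximum(std, 1e-9)
    ei = (best - mu) * norm.cdf(z) + std * norm.pdf(z)        # expected improvement
    ei[[np.where((grid == x).all(axis=1))[0][0] for x in X]] = -1  # skip evaluated points
    x_next = grid[int(np.argmax(ei))]
    X = np.vstack([X, x_next])
    y = np.append(y, objective(x_next))

print("best configuration:", X[int(np.argmin(y))], "loss:", y.min())
```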

Bayesian Optimization · BIG-bench Machine Learning · +1
