Gradient-based Hyperparameter Optimization without Validation Data for Learning from Limited Labels

29 Sep 2021 · Ryuichiro Hataya, Hideki Nakayama

Optimizing the hyperparameters of machine learning algorithms is important but difficult when labeled data are limited, because obtaining enough validation data is then practically impossible. Bayesian model selection enables hyperparameter optimization \emph{without validation data}, but it requires Hessian log determinants, which are computationally demanding for deep neural networks. We study methods to efficiently approximate Hessian log determinants and empirically demonstrate that approximated Bayesian model selection can effectively tune the hyperparameters of deep semi-supervised learning and learning from noisy labels.
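The abstract does not specify which approximation to the Hessian log determinant is used, so the following is only a generic illustration, not the paper's method: it compares an exact Cholesky-based log determinant of a small synthetic positive-definite "Hessian" with the cheapest common surrogate, a diagonal approximation. The matrix `H` here is a damped Gauss-Newton-style matrix invented for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic positive-definite "Hessian" (a damped Gauss-Newton-style matrix);
# real deep-net Hessians are far too large for exact factorization.
d = 50
J = rng.standard_normal((200, d))
H = J.T @ J / 200 + 1e-2 * np.eye(d)  # damping keeps H positive definite

# Exact log determinant via Cholesky: log det H = 2 * sum(log(diag(L))).
L = np.linalg.cholesky(H)
logdet_exact = 2.0 * np.sum(np.log(np.diag(L)))

# Cheap diagonal approximation: log det H ≈ sum_i log(H_ii).
# By Hadamard's inequality this overestimates the true value for PD matrices.
logdet_diag = np.sum(np.log(np.diag(H)))

print(f"exact: {logdet_exact:.4f}, diagonal approx: {logdet_diag:.4f}")
```

The gap between the two numbers shows why approximation quality matters: for a deep network, only surrogates of this kind (diagonal, Kronecker-factored, or stochastic trace estimators) are affordable, and the model-selection objective inherits their error.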


