no code implementations • 28 Feb 2024 • Bedionita Soro, Bruno Andreis, Hayeon Lee, Song Chong, Frank Hutter, Sung Ju Hwang
By learning the distribution of a neural network on a variety pretrained models, our approach enables adaptive sampling weights for unseen datasets achieving faster convergence and reaching competitive performance.
no code implementations • 6 Nov 2023 • Hayeon Lee, Rui Hou, Jongpil Kim, Davis Liang, Hongbo Zhang, Sung Ju Hwang, Alexander Min
2) The enhanced performance of the larger model further boosts the performance of the smaller model.
1 code implementation • 26 May 2023 • Hayeon Lee, Sohyun An, Minseon Kim, Sung Ju Hwang
Previous DaNAS methods have mostly tackled the search for the neural architecture for fixed datasets and the teacher, which are not generalized well on a new task consisting of an unseen dataset and an unseen teacher, thus need to perform a costly search for any new combination of the datasets and the teachers.
1 code implementation • 26 May 2023 • Hayeon Lee, Rui Hou, Jongpil Kim, Davis Liang, Sung Ju Hwang, Alexander Min
Distillation from Weak Teacher (DWT) is a method of transferring knowledge from a smaller, weaker teacher model to a larger student model to improve its performance.
1 code implementation • 26 May 2023 • Sohyun An, Hayeon Lee, Jaehyeong Jo, Seanie Lee, Sung Ju Hwang
To tackle such limitations of existing NAS methods, we propose a paradigm shift from NAS to a novel conditional Neural Architecture Generation (NAG) framework based on diffusion models, dubbed DiffusionNAG.
no code implementations • 8 Apr 2022 • Stephen Cha, Taehyeon Kim, Hayeon Lee, Se-Young Yun
The survey analyses supernet optimization methods based on their approaches to spatial and temporal optimization.
1 code implementation • NeurIPS 2021 • Hayeon Lee, Sewoong Lee, Song Chong, Sung Ju Hwang
To overcome such limitations, we propose Hardware-adaptive Efficient Latency Predictor (HELP), which formulates the device-specific latency estimation problem as a meta-learning problem, such that we can estimate the latency of a model's performance for a given task on an unseen device with a few samples.
no code implementations • ICLR 2022 • Hae Beom Lee, Hayeon Lee, Jaewoong Shin, Eunho Yang, Timothy Hospedales, Sung Ju Hwang
Many gradient-based meta-learning methods assume a set of parameters that do not participate in inner-optimization, which can be considered as hyperparameters.
1 code implementation • ICLR 2021 • Hayeon Lee, Eunyoung Hyung, Sung Ju Hwang
Despite the success of recent Neural Architecture Search (NAS) methods on various tasks which have shown to output networks that largely outperform human-designed networks, conventional NAS methods have mostly tackled the optimization of searching for the network architecture for a single task (dataset), which does not generalize well across multiple tasks (datasets).
1 code implementation • 16 Jun 2021 • Hayeon Lee, Sewoong Lee, Song Chong, Sung Ju Hwang
To overcome such limitations, we propose Hardware-adaptive Efficient Latency Predictor (HELP), which formulates the device-specific latency estimation problem as a meta-learning problem, such that we can estimate the latency of a model's performance for a given task on an unseen device with a few samples.
1 code implementation • NeurIPS 2021 • Wonyong Jeong, Hayeon Lee, Gun Park, Eunyoung Hyung, Jinheon Baek, Sung Ju Hwang
To address such limitations, we introduce a novel problem of \emph{Neural Network Search} (NNS), whose goal is to search for the optimal pretrained network for a novel dataset and constraints (e. g. number of parameters), from a model zoo.
no code implementations • 5 Aug 2019 • Hayeon Lee, Donghyun Na, Hae Beom Lee, Sung Ju Hwang
To tackle this issue, we propose a simple yet effective meta-learning framework for metricbased approaches, which we refer to as learning to generalize (L2G), that explicitly constrains the learning on a sampled classification task to reduce the classification error on a randomly sampled unseen classification task with a bilevel optimization scheme.
1 code implementation • ICLR 2020 • Hae Beom Lee, Hayeon Lee, Donghyun Na, Saehoon Kim, Minseop Park, Eunho Yang, Sung Ju Hwang
While tasks could come with varying the number of instances and classes in realistic settings, the existing meta-learning approaches for few-shot classification assume that the number of instances per task and class is fixed.