Finding neural network weights that generalize well from small datasets is difficult. A promising approach is to (meta-)learn a weight initialization from a collection of tasks, such that a small number of weight changes results in low generalization error. We show that this form of meta-learning can be improved by letting the learning algorithm decide which weights to change, i.e., by learning where to learn. We find that patterned sparsity emerges from this process. Lower-level features tend to be frozen, while weights close to the output remain plastic. This selective sparsity enables running longer sequences of weight updates without overfitting, resulting in better generalization on the miniImageNet benchmark. Our findings shed light on an ongoing debate on whether meta-learning can discover adaptable features, and suggest that sparse learning can outperform simpler feature reuse schemes.
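To make the core idea concrete, here is a minimal NumPy sketch of a MAML-style inner loop in which a per-parameter mask decides which weights are plastic and which stay frozen at their meta-learned initialization. This is an illustration under simplifying assumptions, not the paper's algorithm: the toy linear model, data, fixed random mask, and hyperparameters (inner_lr, n_inner_steps) are all hypothetical, and in the actual method both the initialization and the sparsity pattern are themselves meta-learned across tasks.

```python
# Sketch only: masked ("learn where to learn") inner-loop adaptation.
# In the paper, W_init and the mask are meta-learned; here they are random stand-ins.
import numpy as np

rng = np.random.default_rng(0)

# Toy linear model y = x @ W with a per-parameter plasticity mask.
dim_in, dim_out = 8, 2
W_init = rng.normal(scale=0.1, size=(dim_in, dim_out))        # stand-in for meta-learned init
mask = (rng.random((dim_in, dim_out)) > 0.5).astype(float)    # 1 = plastic, 0 = frozen

def loss_and_grad(W, x, y):
    """Mean squared-error loss (per example) and its gradient w.r.t. W."""
    err = x @ W - y
    loss = 0.5 * np.sum(err ** 2) / len(x)
    grad = x.T @ err / len(x)
    return loss, grad

# Small support set for one task (illustrative data).
x_support = rng.normal(size=(16, dim_in))
y_support = rng.normal(size=(16, dim_out))

# Inner-loop adaptation: only masked (plastic) weights are updated,
# so frozen entries reuse the initialization while the rest adapt to the task.
inner_lr, n_inner_steps = 0.1, 5
W = W_init.copy()
for _ in range(n_inner_steps):
    _, grad = loss_and_grad(W, x_support, y_support)
    W = W - inner_lr * (mask * grad)   # sparse update: frozen entries stay at W_init

final_loss, _ = loss_and_grad(W, x_support, y_support)
print(f"support loss after adaptation: {final_loss:.4f}")
```

In the full method, the outer (meta) loop would backpropagate the post-adaptation query loss through this inner loop to update both the initialization and the mask, which is what produces the layer-wise sparsity pattern described above.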
