Optimal allocation of data across training tasks in meta-learning

1 Jan 2021 · Georgios Batzolis, Alberto Bernacchia, Da-Shan Shiu, Michael Bromberg, Alexandru Cioba ·

Meta-learning models transfer the knowledge acquired from previous tasks to quickly learn new ones. They are tested on benchmarks with a fixed number of data-points for each training task, and this number is usually arbitrary, for example, 5 instances per class in few-shot classification. It is unknown how the performance of meta-learning is affected by the distribution of data across training tasks. Since labelling of data is expensive, finding the optimal allocation of labels across training tasks may reduce costs. Given a fixed budget b of labels to distribute across tasks, should we use a small number of highly labelled tasks, or many tasks with few labels each? In MAML applied to mixed linear regression, we prove that the optimal number of tasks follows the scaling law sqrt{b}. We develop an online algorithm for data allocation across tasks, and show that the same scaling law applies to nonlinear regression. We also show preliminary experiments on few-shot image classification. Our work provides a theoretical guide for allocating labels across tasks in meta-learning, which we believe will prove useful in a large number of applications.

PDF Abstract