Learning Neural Processes on the Fly

29 Sep 2021 · Younghwa Jung, Zhenyuan Yuan, Seung-Woo Seo, Minghui Zhu, Seong-Woo Kim

Deep neural networks (DNNs) have performed impressively on a wide range of tasks, but they usually require a large number of training samples to achieve good performance. As a result, DNNs do not work well in low-data regimes: they tend to overfit small datasets and make poor predictions. In contrast, shallow neural networks (SNNs) are generally robust to overfitting in low-data regimes and converge more quickly than DNNs, but they struggle to represent very complex systems. DNNs and SNNs therefore have a complementary relationship, and combining their benefits can provide fast-learning capability with high asymptotic performance, as meta-learning does. However, aggregating heterogeneous methods with opposite properties is not trivial, as it can yield a combined method inferior to either base method. In this paper, we propose a new algorithm called anytime neural processes that combines DNNs and SNNs and can work in both low-data and high-data regimes. To combine heterogeneous models effectively, we propose a novel aggregation method based on a generalized product-of-experts and a winner-take-all gate network. Moreover, we discuss the theoretical basis of the proposed method. Experiments on a public dataset show that the proposed method achieves performance comparable to other state-of-the-art methods.
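The paper's exact formulation is not reproduced here, but the sketch below illustrates the general idea of fusing heterogeneous predictors with a generalized product-of-experts under a winner-take-all gate, assuming each expert outputs a Gaussian predictive mean and variance; the expert values and gate scores are hypothetical placeholders.

```python
import numpy as np

def gpoe_fuse(mus, vars_, alphas):
    """Generalized product-of-experts fusion of Gaussian predictions.

    Each expert i contributes N(mu_i, var_i) weighted by alpha_i >= 0.
    The fused precision is sum_i alpha_i / var_i, and the fused mean is
    the precision-weighted average of the expert means.
    """
    precisions = alphas / vars_
    fused_var = 1.0 / np.sum(precisions)
    fused_mu = fused_var * np.sum(precisions * mus)
    return fused_mu, fused_var

def winner_take_all_gate(scores):
    """One-hot weights selecting the expert with the highest gate score."""
    weights = np.zeros_like(scores)
    weights[np.argmax(scores)] = 1.0
    return weights

# Hypothetical example: fuse a shallow network's prediction with a
# deep network's prediction (order: [shallow, deep]).
mus = np.array([0.8, 1.1])          # predictive means
vars_ = np.array([0.5, 0.1])        # predictive variances
gate_scores = np.array([0.2, 1.3])  # hypothetical gate outputs

alphas = winner_take_all_gate(gate_scores)
mu, var = gpoe_fuse(mus, vars_, alphas)
print(f"fused mean={mu:.3f}, variance={var:.3f}")
```

With a one-hot gate, the fusion simply passes through the winning expert's prediction; softer gate weights would blend the experts in precision space instead.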
