Few-shot Learning with Big Prototypes

29 Sep 2021 · Ning Ding, Yulin Chen, Xiaobin Wang, Hai-Tao Zheng, Zhiyuan Liu, Pengjun Xie

Using dense vectors, i.e., prototypes, to represent abstract class-level information has become a common approach in low-data machine learning scenarios. Typically, a prototype is the mean of the output embeddings of the instances in a class. In this case, prototypes have the same dimension as the example embeddings, and such tensors can be regarded as "points" in the feature space from a geometrical perspective. But because the few available instances are a biased sample, these points may fail to express the class-level information as a whole. In this paper, we propose to use tensor fields ("areas") to model prototypes and thereby enhance the expressivity of class-level information. Specifically, we present big prototypes, where prototypes are represented by hyperspheres with dynamic sizes. A big prototype can be effectively modeled by two sets of learnable parameters: the center of the hypersphere, which is an embedding with the same dimension as the training examples, and the radius of the hypersphere, which is a single scalar. Compared with irregular manifolds with complex boundaries, a hypersphere is far easier to represent with parameters. Moreover, big prototypes make metric-based classification in few-shot learning convenient, since we only need to calculate the distance from a data point to the surface of the hypersphere. Extensive experiments on few-shot learning tasks across NLP and CV demonstrate the effectiveness of big prototypes.
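The point-to-surface distance mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `distance_to_surface` and `classify` are hypothetical, the Euclidean metric and the signed convention (negative inside the sphere) are assumptions the abstract does not specify, and the centers and radii are fixed by hand here rather than learned.

```python
import math
from typing import Sequence


def distance_to_surface(x: Sequence[float],
                        center: Sequence[float],
                        radius: float) -> float:
    """Signed Euclidean distance from x to the hypersphere surface.

    Assumption: the value is negative when x lies inside the sphere.
    """
    return math.dist(x, center) - radius


def classify(x: Sequence[float],
             centers: Sequence[Sequence[float]],
             radii: Sequence[float]) -> int:
    """Return the index of the class whose surface is closest to x."""
    scores = [distance_to_surface(x, c, r) for c, r in zip(centers, radii)]
    return min(range(len(scores)), key=scores.__getitem__)


# Two hand-picked (not learned) hypersphere prototypes in 2-D.
centers = [(0.0, 0.0), (5.0, 0.0)]
radii = [1.0, 3.0]
query = (2.0, 0.0)

# With point prototypes (all radii zero) the query is nearer to class 0.
point_prediction = classify(query, centers, [0.0, 0.0])
# With radii taken into account the query lies on the surface of class 1.
sphere_prediction = classify(query, centers, radii)
```

In this toy setting the query is 2.0 from the first center and 3.0 from the second, so a point prototype picks class 0, while the larger radius of the second hypersphere brings its surface to the query and the prediction flips to class 1.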
