Revisiting Linear Decision Boundaries for Few-Shot Learning with Transformer Hypernetworks

Few-shot learning (FSL) methods aim to generalize a model to new, unseen classes using only a small number of support examples. In image classification settings, many FSL approaches use an architecture similar to standard supervised learning: a feature extractor followed by a linear classifier head. A common choice for the classifier is ProtoNet-style nearest neighbor, but this may be suboptimal because it is context-independent. As an alternative, some methods train a parametric classifier (e.g., logistic regression, support vector machine) on embeddings from novel classes. However, task-specific training requires time and resources, and poses optimization challenges such as overfitting on only a few samples. Instead, we propose to generate linear classifiers for new classes using a transformer-based hypernetwork that performs context aggregation in a permutation-invariant manner. A transformer hypernetwork allows us to instantiate a new task-specific classifier without any additional training on novel tasks. Experiments on 1-shot 5-way and 5-shot 5-way MiniImageNet, TieredImageNet, and CIFAR-FS demonstrate that transformer hypernetworks can generate classifiers that achieve up to 1.4% higher accuracy than other commonly used linear classifiers. Among methods that offer optimization-free meta-inference, we achieve a new state of the art in most cases.
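
To make the idea concrete, below is a minimal PyTorch sketch of a transformer hypernetwork that maps support-set embeddings to per-class linear classifier weights without task-specific optimization. This is not the authors' released code; the class and parameter names (`ClassifierHyperNet`, `embed_dim`, `to_weight`, `to_bias`) and the mean-over-shots aggregation are illustrative assumptions about one plausible instantiation.

```python
import torch
import torch.nn as nn


class ClassifierHyperNet(nn.Module):
    """Sketch: generate a linear classifier (W, b) for an N-way task
    from support embeddings, with no per-task optimization."""

    def __init__(self, embed_dim: int, n_layers: int = 2, n_heads: int = 4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=n_heads, batch_first=True
        )
        # No positional encodings are added, so self-attention over the
        # support tokens is permutation-equivariant; the per-class mean
        # below then makes the generated weights permutation-invariant.
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.to_weight = nn.Linear(embed_dim, embed_dim)  # one row of W per class
        self.to_bias = nn.Linear(embed_dim, 1)            # one bias per class

    def forward(self, support: torch.Tensor):
        # support: (n_way, k_shot, embed_dim) embeddings from the
        # feature extractor.
        n_way, k_shot, d = support.shape
        tokens = support.reshape(1, n_way * k_shot, d)
        ctx = self.encoder(tokens).reshape(n_way, k_shot, d)
        per_class = ctx.mean(dim=1)               # aggregate shots per class
        W = self.to_weight(per_class)             # (n_way, embed_dim)
        b = self.to_bias(per_class).squeeze(-1)   # (n_way,)
        return W, b


# Usage: classify query embeddings with the generated linear head.
hypernet = ClassifierHyperNet(embed_dim=64)
support = torch.randn(5, 1, 64)   # 1-shot 5-way support embeddings (assumed dims)
queries = torch.randn(15, 64)     # query embeddings
W, b = hypernet(support)
logits = queries @ W.t() + b      # (15, 5) class scores
```

Because the classifier is produced by a single forward pass of the hypernetwork, meta-inference on a novel task requires no gradient steps, in contrast to fitting logistic regression or an SVM on the support embeddings.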
