The zero-shot paradigm exploits vector-based word representations extracted from text corpora with unsupervised methods to learn general mapping functions from other feature spaces onto word space, where the words associated to the nearest neighbours of the mapped vectors are used as their linguistic labels.
Given semantic descriptions of object classes, zero-shot learning aims to accurately recognize objects of the unseen classes, from which no examples are available at the training stage, by associating them to the seen classes, from which labeled examples are provided.
In other cases the semantic embedding space is established by an independent natural language processing task, and then the image transformation into that space is learned in a second stage.
Once trained, a RN is able to classify images of new classes by computing relation scores between query images and the few examples of each new class without further updating the network.
We propose prototypical networks for the problem of few-shot classification, where a classifier must generalize to new classes not seen in the training set, given only a small number of examples of each new class.
Graph convolutional neural networks have recently shown great potential for the task of zero-shot learning.
In this paper, we propose a novel style transfer architecture, which can also be extended to generate voices even for target speakers whose data were not used in the training (i. e., case of zero-shot learning).