Specifically, given images of birds with free-text descriptions of their species, we learn to classify images of previously unseen species based on species descriptions.
Real-world data is predominantly unbalanced and long-tailed, and deep models struggle to recognize rare classes in the presence of frequent classes.
Ranked #1 on Generalized Few-Shot Learning on SUN
Here we describe a new approach to learning from fewer samples by using additional information that is provided with each sample.
Specifically, our model consists of three classifiers: a "gating" model that makes a soft decision about whether a sample is from a "seen" class, and two experts: a ZSL expert and an expert model for seen classes.
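One way such a gate and two experts might be combined is sketched below. This is a minimal illustration, not the paper's method: the function name `combine_experts`, the numbers, and the mixing rule (weighting each expert's class distribution by the gate probability) are assumptions made for the example.

```python
def combine_experts(p_seen_gate, seen_probs, unseen_probs):
    """Mix two experts' class distributions using a soft gate.

    p_seen_gate  -- gate's probability that the sample is from a seen class
    seen_probs   -- seen-class expert's distribution over seen classes
    unseen_probs -- ZSL expert's distribution over unseen classes

    Assumption: the experts are mixed by the law of total probability.
    The source text does not specify the combination rule.
    """
    if not 0.0 <= p_seen_gate <= 1.0:
        raise ValueError("gate probability must lie in [0, 1]")
    # Weight each expert by how much the gate trusts its domain.
    seen_part = [p_seen_gate * p for p in seen_probs]
    unseen_part = [(1.0 - p_seen_gate) * p for p in unseen_probs]
    # Concatenate into one distribution over seen + unseen classes.
    return seen_part + unseen_part


# Hypothetical example: 2 seen classes, 2 unseen classes.
combined = combine_experts(0.8, [0.7, 0.3], [0.6, 0.4])
predicted = max(range(len(combined)), key=combined.__getitem__)
```

Because the gate is soft, a confident ZSL expert can still win when the gate is uncertain, which a hard seen/unseen decision would not allow.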
The soft group structure can be learned from data jointly as part of the model, and can also readily incorporate prior knowledge about groups if available.
Recurrent neural networks have recently been used for learning to describe images using natural language.