Towards addressing this problem, we propose an iterative matrix square root normalization method for fast end-to-end training of global covariance pooling networks.
Ranked #4 on Fine-Grained Image Classification on CUB-200-2011
In this paper, we propose a novel "Destruction and Construction Learning" (DCL) method to enhance the difficulty of fine-grained recognition and exercise the classification model to acquire expert knowledge.
Ranked #11 on Fine-Grained Image Classification on FGVC Aircraft
Learning subtle yet discriminative features (e. g., beak and eyes for a bird) plays a significant role in fine-grained image recognition.
Ranked #21 on Fine-Grained Image Classification on CUB-200-2011
The obtained object images not only contain almost the entire structure of the object, but also contains more details, part images have many different scales and more fine-grained features, and the raw images contain the complete object.
Ranked #1 on Fine-Grained Image Classification on FGVC Aircraft
However, the computational cost to learn pairwise interactions between deep feature channels is prohibitively expensive, which restricts this powerful transformation to be used in deep neural networks.
In this work, we investigate simultaneously predicting categories of different levels in the hierarchy and integrating this structured correlation information into the deep neural network by developing a novel Hierarchical Semantic Embedding (HSE) framework.
Ranked #20 on Fine-Grained Image Classification on CUB-200-2011
Fine-grained recognition poses the unique challenge of capturing subtle inter-class differences under considerable intra-class variances (e. g., beaks for bird species).
Ranked #11 on Fine-Grained Image Classification on Stanford Cars
Attention-based learning for fine-grained image recognition remains a challenging task, where most of the existing methods treat each object part in isolation, while neglecting the correlations among them.
Ranked #19 on Fine-Grained Image Classification on Stanford Cars
Attention has been typically implemented in neural networks by selecting the most informative regions of the image that improve classification.