Contrastively-reinforced Attention Convolutional Neural Network for Fine-grained Image Recognition

BMVC 2020  ·  Dichao Liu, Yu Wang, Jien Kato, Kenji Mase

Fine-grained visual classification is inherently challenging because of its high inter-class similarity and intra-class variance. However, by contrasting images with the same or different labels, a human can instinctively notice that the key clues lie in certain objects while the others can be ignored. Inspired by this, we propose the Contrastively-reinforced Attention Convolutional Neural Network (CRA-CNN), which reinforces the attention awareness of deep activations. CRA-CNN consists mainly of two parts: a classification stream and an attention regularization stream. The former classifies the input image and simultaneously divides the visual information of the input into attention and redundancy. The latter evaluates the attention/redundancy proposal by classifying the attention and contrasting the attention/redundancy of different inputs. The evaluation information is backpropagated and forces the classification stream to improve its awareness of visual attention, which benefits classification. Experimental results on CUB-Birds and Stanford Cars show that CRA-CNN distinctly outperforms the baselines and is comparable with state-of-the-art methods despite its simplicity.
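The abstract describes the two-stream design only at a high level. The sketch below illustrates one plausible way to wire up such an attention/redundancy split in PyTorch; the ResNet-50 backbone, the 1x1-convolution mask head, the names `CRACNNSketch` and `cra_loss`, and the particular contrastive term are all illustrative assumptions, not the authors' exact architecture or loss.

```python
# Minimal sketch of the two-stream idea described in the abstract.
# All design choices below (backbone, mask head, loss form) are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet50


class CRACNNSketch(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        backbone = resnet50(weights=None)
        # Keep only the convolutional feature extractor (drop avgpool and fc).
        self.features = nn.Sequential(*list(backbone.children())[:-2])
        # Assumed 1x1-conv head that proposes a soft spatial attention mask.
        self.mask_head = nn.Conv2d(2048, 1, kernel_size=1)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.cls_full = nn.Linear(2048, num_classes)  # classification stream
        self.cls_attn = nn.Linear(2048, num_classes)  # attention regularization stream

    def forward(self, x):
        f = self.features(x)                  # B x 2048 x H x W feature maps
        m = torch.sigmoid(self.mask_head(f))  # soft mask in [0, 1]
        attn = f * m                          # "attention" part of the activations
        redun = f * (1.0 - m)                 # "redundancy" part
        logits_full = self.cls_full(self.pool(f).flatten(1))
        logits_attn = self.cls_attn(self.pool(attn).flatten(1))
        z_attn = F.normalize(self.pool(attn).flatten(1), dim=1)
        z_redun = F.normalize(self.pool(redun).flatten(1), dim=1)
        return logits_full, logits_attn, z_attn, z_redun


def cra_loss(logits_full, logits_attn, z_attn, z_redun, labels):
    """Classify both the full image and the attention, and contrast
    attention against redundancy (one simple illustrative contrastive term)."""
    ce = F.cross_entropy(logits_full, labels) + F.cross_entropy(logits_attn, labels)
    # Penalize similarity between attention and redundancy embeddings so the
    # mask is pushed to separate discriminative from ignorable information.
    contrast = F.cosine_similarity(z_attn, z_redun).clamp(min=0).mean()
    return ce + contrast


if __name__ == "__main__":
    model = CRACNNSketch(num_classes=200)
    x = torch.randn(4, 3, 224, 224)
    labels = torch.randint(0, 200, (4,))
    loss = cra_loss(*model(x), labels)
    loss.backward()
    print(loss.item())
```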

Task                              | Dataset       | Model   | Metric   | Value | Global Rank
Fine-Grained Image Classification | CUB-200-2011  | CRA-CNN | Accuracy | 88.3% | #46
Fine-Grained Image Classification | Stanford Cars | CRA-CNN | Accuracy | 94.8% | #25

Methods


No methods listed for this paper.