Learning Multi-Attention Convolutional Neural Network for Fine-Grained Image Recognition

ICCV 2017  ·  Heliang Zheng, Jianlong Fu, Tao Mei, Jiebo Luo

Recognizing fine-grained categories (e.g., bird species) relies heavily on discriminative part localization and part-based fine-grained feature learning. Existing approaches predominantly solve these challenges independently, neglecting the fact that part localization (e.g., head of a bird) and fine-grained feature learning (e.g., head shape) are correlated. In this paper, we propose a novel part learning approach based on a multi-attention convolutional neural network (MA-CNN), where part generation and feature learning can reinforce each other. MA-CNN consists of convolution, channel grouping and part classification sub-networks. The channel grouping network takes as input feature channels from convolutional layers, and generates multiple parts by clustering, weighting and pooling from spatially-correlated channels. The part classification network further classifies an image by each individual part, through which more discriminative fine-grained features can be learned. Two losses are proposed to guide the multi-task learning of channel grouping and part classification; they encourage MA-CNN to generate more discriminative parts from feature channels and to learn better fine-grained features from parts in a mutually reinforcing way. MA-CNN does not need bounding box or part annotations and can be trained end-to-end. We combine the learned parts from MA-CNN with part-CNN for recognition, and show the best performance on three challenging published fine-grained datasets, namely CUB-Birds, FGVC-Aircraft and Stanford-Cars.
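
The channel grouping step described in the abstract (clustering, weighting and pooling from spatially-correlated channels) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the use of k-means on per-channel peak positions, the hard 0/1 channel weights, and the sigmoid attention map are all assumptions made for illustration, and nothing here is trained.

```python
import numpy as np


def peak_positions(features):
    """Return the (row, col) of the strongest response in each channel.

    features: array of shape (C, H, W).
    """
    c, h, w = features.shape
    flat_idx = features.reshape(c, -1).argmax(axis=1)
    return np.stack(np.unravel_index(flat_idx, (h, w)), axis=1).astype(float)


def kmeans(points, k, iters=20, seed=0):
    """Tiny k-means, for illustration only; returns one label per point."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave empty clusters where they are
                centers[j] = points[labels == j].mean(axis=0)
    return labels


def group_channels(features, num_parts):
    """Hypothetical grouping: cluster channels, weight them, pool per part."""
    c = features.shape[0]

    # Clustering: channels whose peaks fall close together share a group.
    labels = kmeans(peak_positions(features), num_parts)

    # Weighting: hard 0/1 membership weights (an assumption of this sketch).
    weights = np.zeros((num_parts, c))
    weights[labels, np.arange(c)] = 1.0

    # One attention map per part: sigmoid of the weighted channel sum.
    summed = np.einsum("pc,chw->phw", weights, features)
    attention = 1.0 / (1.0 + np.exp(-summed))

    # Pooling: attention-weighted spatial average of every channel.
    normed = attention / attention.sum(axis=(1, 2), keepdims=True)
    part_features = np.einsum("phw,chw->pc", normed, features)
    return weights, attention, part_features
```

Calling `group_channels(features, 4)` on a `(C, H, W)` array returns a `(4, C)` weight matrix, four `(H, W)` attention maps, and a `(4, C)` matrix of pooled part features. In a trainable model the hard weights would presumably be replaced by learned, differentiable ones so that the grouping can be optimized jointly with classification.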

Task                               Dataset        Model  Metric Name  Metric Value  Global Rank
Fine-Grained Image Classification  CUB-200-2011   MACNN  Accuracy     86.5          #22
Fine-Grained Image Classification  FGVC Aircraft  MACNN  Accuracy     89.9          #45
Fine-Grained Image Classification  Stanford Cars  MACNN  Accuracy     92.8          #61
