MultiGrain: a unified image embedding for classes and instances

14 Feb 2019  ·  Maxim Berman, Hervé Jégou, Andrea Vedaldi, Iasonas Kokkinos, Matthijs Douze

MultiGrain is a network architecture producing compact vector representations that are suited to both image classification and particular object retrieval. It builds on a standard classification trunk. The top of the network produces an embedding containing coarse and fine-grained information, so that images can be recognized by object class, by particular object, or as distorted copies of one another. Our joint training is simple: we minimize a cross-entropy loss for classification and a ranking loss that determines whether two images are identical up to data augmentation, with no need for additional labels. A key component of MultiGrain is a pooling layer that takes advantage of high-resolution images with a network trained at a lower resolution. When fed to a linear classifier, the learned embeddings provide state-of-the-art classification accuracy. For instance, we obtain 79.4% top-1 accuracy with a ResNet-50 trained on ImageNet, a 1.8% absolute improvement over the AutoAugment method. When compared using cosine similarity, the same embeddings perform on par with the state of the art for image retrieval at moderate resolutions.
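The three ingredients named in the abstract (a pooling layer, a joint loss, and cosine-similarity retrieval) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. It assumes the pooling layer is generalized-mean (GeM) pooling with exponent `p`, and that the two losses are combined as a weighted sum with a hypothetical `weight` parameter; the abstract specifies neither detail. All function names are illustrative.

```python
import math


def gem_pool(feature_map, p=3.0, eps=1e-6):
    """Generalized-mean pooling over spatial positions (assumed pooling layer).

    feature_map: list of spatial positions, each a list of C activations.
    p = 1 gives average pooling; a large p approaches max pooling.
    """
    n = len(feature_map)
    channels = len(feature_map[0])
    pooled = []
    for c in range(channels):
        # Clamp to eps so that fractional powers stay well defined.
        total = sum(max(pos[c], eps) ** p for pos in feature_map)
        pooled.append((total / n) ** (1.0 / p))
    return pooled


def l2_normalize(v):
    """Scale a vector to unit Euclidean norm (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine_similarity(a, b):
    """Cosine similarity between two embeddings."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def rank_by_cosine(query, database):
    """Return database indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query, e) for e in database]
    return sorted(range(len(database)), key=lambda i: scores[i], reverse=True)


def joint_loss(class_loss, ranking_loss, weight=0.5):
    """Weighted sum of the two training losses (the weighting is an assumption)."""
    return weight * class_loss + (1.0 - weight) * ranking_loss
```

In this sketch the same pooled vector would feed both a linear classifier and `rank_by_cosine`. Raising `p` makes the pooled value lean toward the strongest activations, which is one plausible way a pooling layer could benefit from higher-resolution inputs at test time.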


Results from the Paper


Task                  Dataset         Model                               Metric Name     Metric Value  Global Rank
Image Classification  ImageNet        MultiGrain R50-AA-224               Top 1 Accuracy  78.2%         # 544
Image Classification  ImageNet        MultiGrain R50-AA-224               Top 5 Accuracy  93.9%         # 181
Image Classification  ImageNet        MultiGrain SENet154 (500px)         Top 1 Accuracy  82.7%         # 301
Image Classification  ImageNet        MultiGrain PNASNet (500px)          Top 1 Accuracy  83.6%         # 230
Image Classification  ImageNet        MultiGrain PNASNet (500px)          Top 5 Accuracy  96.7%         # 57
Image Classification  ImageNet        MultiGrain PNASNet (300px)          Top 1 Accuracy  81.3%         # 398
Image Classification  ImageNet        MultiGrain R50-AA-500               Top 1 Accuracy  79.4%         # 474
Image Classification  ImageNet        MultiGrain R50-AA-500               Top 5 Accuracy  94.8%         # 139
Image Classification  ImageNet        MultiGrain PNASNet (400px)          Top 1 Accuracy  82.6%         # 307
Image Classification  ImageNet        MultiGrain SENet154 (450px)         Top 1 Accuracy  83.1%         # 270
Image Classification  ImageNet        MultiGrain PNASNet (450px)          Top 1 Accuracy  83.2%         # 259
Image Classification  ImageNet        MultiGrain NASNet-A-Mobile (350px)  Top 1 Accuracy  75.1%         # 632
Image Classification  ImageNet        MultiGrain NASNet-A-Mobile (350px)  Top 5 Accuracy  92.5%         # 221
Image Classification  ImageNet        MultiGrain SENet154 (400px)         Top 1 Accuracy  83.0%         # 277
Image Classification  ImageNet        MultiGrain SENet154 (400px)         Top 5 Accuracy  96.5%         # 66
Image Retrieval       INRIA Holidays  MultiGrain R50 @ 500                Mean mAP        91.8%         # 2
Image Retrieval       INRIA Holidays  MultiGrain R50 @ 800                Mean mAP        92.5%         # 1

Methods