MixVPR: Feature Mixing for Visual Place Recognition

3 Mar 2023 · Amar Ali-bey, Brahim Chaib-Draa, Philippe Giguère

Visual Place Recognition (VPR) is a crucial part of mobile robotics and autonomous driving as well as other computer vision tasks. It refers to the process of identifying a place depicted in a query image using only computer vision. At large scale, repetitive structures, weather and illumination changes pose a real challenge, as appearances can drastically change over time. Along with tackling these challenges, an efficient VPR technique must also be practical in real-world scenarios where latency matters. To address this, we introduce MixVPR, a new holistic feature aggregation technique that takes feature maps from pre-trained backbones as a set of global features. Then, it incorporates a global relationship between elements in each feature map in a cascade of feature mixing, eliminating the need for local or pyramidal aggregation as done in NetVLAD or TransVPR. We demonstrate the effectiveness of our technique through extensive experiments on multiple large-scale benchmarks. Our method outperforms all existing techniques by a large margin while having less than half the number of parameters compared to CosPlace and NetVLAD. We achieve a new all-time high recall@1 score of 94.6% on Pitts250k-test, 88.0% on MapillarySLS, and more importantly, 58.4% on Nordland. Finally, our method outperforms two-stage retrieval techniques such as Patch-NetVLAD, TransVPR and SuperGLUE all while being orders of magnitude faster. Our code and trained models are available at https://github.com/amaralibey/MixVPR.
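The "cascade of feature mixing" described above can be illustrated with a small sketch. The abstract does not specify the mixing operation, so this example assumes each step is a residual two-layer MLP with shared weights, applied independently to every flattened feature map. The names `mixer_block` and `mix_features` and the weight layout are hypothetical, and any normalization or final projection is omitted. It is not the authors' implementation, which is available at the repository linked above.

```python
# Minimal sketch of cascaded feature mixing (assumed design, pure Python).
# feature_maps: list of c maps, each flattened to a list of n = h * w values.

def relu(v):
    return [max(0.0, x) for x in v]

def matvec(W, v):
    # W is a list of rows; returns the matrix-vector product W @ v.
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def mixer_block(feature_maps, W1, W2):
    """Apply one residual MLP (shared weights) to every flattened map."""
    out = []
    for fmap in feature_maps:
        hidden = relu(matvec(W1, fmap))   # mixes all spatial positions of one map
        mixed = matvec(W2, hidden)        # projects back to n values
        out.append([x + m for x, m in zip(fmap, mixed)])  # skip connection
    return out

def mix_features(feature_maps, blocks):
    """Run the maps through a cascade of (W1, W2) mixing blocks."""
    for W1, W2 in blocks:
        feature_maps = mixer_block(feature_maps, W1, W2)
    return feature_maps
```

Because every output element of a block depends on all n positions of its input map, each map is treated as a global feature rather than being pooled locally; the number and size of maps are unchanged by the cascade.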


Results from the Paper


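Task                      Dataset               Model   Metric Name  Metric Value  Global Rank
Visual Place Recognition  Gardens Point         MixVPR  Recall@1     91.50         # 2
Visual Place Recognition  Hawkins               MixVPR  Recall@1     25.42         # 7
Visual Place Recognition  Mapillary test        MixVPR  Recall@1     64.0          # 3
Visual Place Recognition  Mapillary test        MixVPR  Recall@5     75.9          # 4
Visual Place Recognition  Mapillary test        MixVPR  Recall@10    80.6          # 4
Visual Place Recognition  Mapillary val         MixVPR  Recall@1     88.2          # 3
Visual Place Recognition  Mapillary val         MixVPR  Recall@5     93.1          # 3
Visual Place Recognition  Mapillary val         MixVPR  Recall@10    94.3          # 3
Visual Place Recognition  Nardo-Air R           MixVPR  Recall@1     76.06         # 5
Visual Place Recognition  Nordland              MixVPR  Recall@1     76.0          # 3
Visual Place Recognition  Nordland              MixVPR  Recall@5     89.2          # 3
Visual Place Recognition  Nordland              MixVPR  Recall@10    92.0          # 3
Visual Place Recognition  Pittsburgh-250k-test  MixVPR  Recall@1     94.6          # 3
Visual Place Recognition  Pittsburgh-250k-test  MixVPR  Recall@5     98.3          # 3
Visual Place Recognition  Pittsburgh-250k-test  MixVPR  Recall@10    99.0          # 2
Visual Place Recognition  Pittsburgh-30k-test   MixVPR  Recall@1     91.52         # 3
Visual Place Recognition  Pittsburgh-30k-test   MixVPR  Recall@5     95.9          # 2
Visual Place Recognition  Pittsburgh-30k-test   MixVPR  Recall@10    96.7          # 2
Visual Place Recognition  SPED                  MixVPR  Recall@1     85.2          # 2
Visual Place Recognition  SPED                  MixVPR  Recall@5     92.1          # 2
Visual Place Recognition  SPED                  MixVPR  Recall@10    94.6          # 2
Visual Place Recognition  VP-Air                MixVPR  Recall@1     10.31         # 5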
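For readers unfamiliar with the metric, Recall@K in place recognition is conventionally the percentage of queries for which at least one of the top K retrieved reference images is a correct match. The sketch below shows that computation under this standard definition; the function name `recall_at_k` and the input layout are illustrative assumptions, and each benchmark defines its own criterion for what counts as a correct match.

```python
# Minimal sketch of Recall@K for retrieval-based place recognition.

def recall_at_k(ranked_ids, ground_truth, ks=(1, 5, 10)):
    """Return {k: recall in percent}.

    ranked_ids[q]   : reference ids for query q, ordered best match first.
    ground_truth[q] : set of reference ids showing the same place as query q.
    """
    hits = {k: 0 for k in ks}
    for preds, positives in zip(ranked_ids, ground_truth):
        for k in ks:
            # A query counts as a hit if any of its top-k results is correct.
            if any(p in positives for p in preds[:k]):
                hits[k] += 1
    n = len(ranked_ids)
    return {k: 100.0 * hits[k] / n for k in ks}
```

Since a hit at rank K is also a hit at every larger K, Recall@1 <= Recall@5 <= Recall@10 for any method, which matches the pattern in the reported values.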

Results from Other Papers


Task                      Dataset                  Model   Metric Name  Metric Value  Rank
Visual Place Recognition  17 Places                MixVPR  Recall@1     63.79         # 2
Visual Place Recognition  Baidu Mall               MixVPR  Recall@1     64.44         # 2
Visual Place Recognition  Laurel Caverns           MixVPR  Recall@1     29.46         # 6
Visual Place Recognition  Mid-Atlantic Ridge       MixVPR  Recall@1     25.74         # 3
Visual Place Recognition  Nardo-Air                MixVPR  Recall@1     32.39         # 5
Visual Place Recognition  Oxford RobotCar Dataset  MixVPR  Recall@1     90.05         # 3
Visual Place Recognition  St Lucia                 MixVPR  Recall@1     99.66         # 2
