GSV-Cities: Toward Appropriate Supervised Visual Place Recognition

19 Oct 2022 · Amar Ali-bey, Brahim Chaib-Draa, Philippe Giguère

This paper investigates representation learning for large-scale visual place recognition, which consists of determining the location depicted in a query image by referring to a database of reference images. This is a challenging task due to the large-scale environmental changes that can occur over time (e.g., weather, illumination, season, traffic, occlusion). Progress is currently challenged by the lack of large databases with accurate ground truth. To address this challenge, we introduce GSV-Cities, a new image dataset providing the widest geographic coverage to date with highly accurate ground truth, covering more than 40 cities across all continents over a 14-year period. We subsequently explore the full potential of recent advances in deep metric learning to train networks specifically for place recognition, and evaluate how different loss functions influence performance. In addition, we show that the performance of existing methods substantially improves when they are trained on GSV-Cities. Finally, we introduce a new fully convolutional aggregation layer that outperforms existing techniques, including GeM, NetVLAD and CosPlace, and establish a new state-of-the-art on large-scale benchmarks, such as Pittsburgh, Mapillary-SLS, SPED and Nordland. The dataset and code are available for research purposes at

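Task                      Dataset               Model    Metric Name  Metric Value  Global Rank
Visual Place Recognition  Mapillary val         Conv-AP  Recall@1     83.4          #2
Visual Place Recognition  Mapillary val         Conv-AP  Recall@5     90.5          #2
Visual Place Recognition  Mapillary val         Conv-AP  Recall@10    92.3          #2
Visual Place Recognition  Nordland              Conv-AP  Recall@1     38.2          #2
Visual Place Recognition  Nordland              Conv-AP  Recall@5     54.8          #1
Visual Place Recognition  Nordland              Conv-AP  Recall@10    61.2          #1
Visual Place Recognition  Pittsburgh-250k-test  Conv-AP  Recall@1     92.4          #1
Visual Place Recognition  Pittsburgh-250k-test  Conv-AP  Recall@5     97.6          #1
Visual Place Recognition  Pittsburgh-250k-test  Conv-AP  Recall@10    98.6          #1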
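
The Recall@K values in the results above are retrieval metrics: a query counts as correct if at least one of its K nearest reference images is a true match. The listing does not spell out the evaluation protocol, so the following is only a minimal sketch under stated assumptions: descriptors are non-zero vectors compared by cosine similarity, and the ground-truth matches for each query are given as a list of reference indices. The function name `recall_at_k`, the helper `l2_normalize`, and the toy data are hypothetical and are not taken from the authors' code.

```python
import math


def l2_normalize(vec):
    """Scale a non-zero vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def recall_at_k(query_desc, ref_desc, positives, ks=(1, 5, 10)):
    """Percentage of queries with at least one true match among the top-k references.

    query_desc: list of query descriptors (lists of floats)
    ref_desc:   list of reference descriptors (lists of floats)
    positives:  positives[i] holds the reference indices that match query i
                (assumed to be supplied by the benchmark's ground truth)
    """
    queries = [l2_normalize(q) for q in query_desc]
    refs = [l2_normalize(r) for r in ref_desc]
    hits = {k: 0 for k in ks}
    for q, pos in zip(queries, positives):
        # Cosine similarity between this query and every reference image.
        sims = [sum(a * b for a, b in zip(q, r)) for r in refs]
        # Reference indices ordered from most to least similar.
        ranking = sorted(range(len(refs)), key=lambda j: sims[j], reverse=True)
        pos = set(pos)
        for k in ks:
            if pos.intersection(ranking[:k]):
                hits[k] += 1
    return {k: 100.0 * hits[k] / len(queries) for k in ks}


# Toy data: four reference descriptors and two queries.
refs = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
queries = [[1, 0.1, 0, 0], [0.1, 1, 0.5, 0]]
positives = [[0], [2]]  # query 0 matches reference 0, query 1 matches reference 2

result = recall_at_k(queries, refs, positives, ks=(1, 2))
print(result)  # {1: 50.0, 2: 100.0}
```

In the toy data, the first query retrieves its match at rank 1, while the second retrieves it only at rank 2, so Recall@1 is 50% and Recall@2 is 100%. Real benchmarks usually define the true matches by a distance or frame-offset threshold; that threshold is benchmark-specific and is not given in this listing.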

