TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Visual Place Recognition	Mapillary test	DINOv2 SALAD	Recall@1	75	# 1
Visual Place Recognition	Mapillary test	DINOv2 SALAD	Recall@5	88.8	# 1
Visual Place Recognition	Mapillary test	DINOv2 SALAD	Recall@10	91.3	# 1
Visual Place Recognition	Mapillary val	DINOv2 SALAD	Recall@1	92.2	# 1
Visual Place Recognition	Mapillary val	DINOv2 SALAD	Recall@5	96.4	# 1
Visual Place Recognition	Mapillary val	DINOv2 SALAD	Recall@10	97	# 2
Visual Place Recognition	Nordland	DINOv2 SALAD (1-frame threshold)	Recall@1	85.2	# 2
Visual Place Recognition	Nordland	DINOv2 SALAD (1-frame threshold)	Recall@5	98.5	# 1
Visual Place Recognition	Nordland	DINOv2 SALAD (1-frame threshold)	Recall@10	95.5	# 2
Visual Place Recognition	Pittsburgh-250k-test	DINOv2 SALAD	Recall@1	95.1	# 2
Visual Place Recognition	Pittsburgh-250k-test	DINOv2 SALAD	Recall@5	98.5	# 2
Visual Place Recognition	Pittsburgh-250k-test	DINOv2 SALAD	Recall@10	99.1	# 1
Visual Place Recognition	SPED	DINOv2 SALAD	Recall@1	92.1	# 1
Visual Place Recognition	SPED	DINOv2 SALAD	Recall@5	96.2	# 1
Visual Place Recognition	SPED	DINOv2 SALAD	Recall@10	96.5	# 1
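
Every row above reports Recall@K, the percentage of queries for which at least one correct reference appears among the K most similar database images. The following is a minimal sketch of that computation, not the benchmark's evaluation code. The function name `recall_at_k`, the use of cosine similarity, and the `positives` argument are assumptions made for illustration. Which references count as correct is dataset-specific (for example, the 1-frame threshold noted for Nordland) and must be supplied by the caller.

```python
import numpy as np

def recall_at_k(query_desc, db_desc, positives, ks=(1, 5, 10)):
    """Percentage of queries with at least one correct reference in the top k.

    query_desc: (Q, D) array of global descriptors for the queries.
    db_desc:    (N, D) array of global descriptors for the database.
    positives:  length-Q sequence; positives[i] holds the database indices
                that count as correct matches for query i (hypothetical input).
    """
    # L2-normalise so that the dot product equals cosine similarity.
    q = query_desc / np.linalg.norm(query_desc, axis=1, keepdims=True)
    d = db_desc / np.linalg.norm(db_desc, axis=1, keepdims=True)
    sims = q @ d.T

    max_k = max(ks)
    # Indices of the max_k most similar references, best first.
    top = np.argsort(-sims, axis=1)[:, :max_k]

    # hits[i, r] is True when the reference at rank r is correct for query i.
    hits = np.zeros(top.shape, dtype=bool)
    for i, pos in enumerate(positives):
        hits[i] = np.isin(top[i], list(pos))

    return {k: 100.0 * hits[:, :k].any(axis=1).mean() for k in ks}
```

Because a hit at rank r also counts for every larger K, Recall@K computed this way can never decrease as K grows.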

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/optimal-transport-aggregation-for-visual/visual-place-recognition-on-mapillary-test)](https://paperswithcode.com/sota/visual-place-recognition-on-mapillary-test?p=optimal-transport-aggregation-for-visual)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/optimal-transport-aggregation-for-visual/visual-place-recognition-on-mapillary-val)](https://paperswithcode.com/sota/visual-place-recognition-on-mapillary-val?p=optimal-transport-aggregation-for-visual)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/optimal-transport-aggregation-for-visual/visual-place-recognition-on-sped)](https://paperswithcode.com/sota/visual-place-recognition-on-sped?p=optimal-transport-aggregation-for-visual)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/optimal-transport-aggregation-for-visual/visual-place-recognition-on-nordland)](https://paperswithcode.com/sota/visual-place-recognition-on-nordland?p=optimal-transport-aggregation-for-visual)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/optimal-transport-aggregation-for-visual/visual-place-recognition-on-pittsburgh-250k)](https://paperswithcode.com/sota/visual-place-recognition-on-pittsburgh-250k?p=optimal-transport-aggregation-for-visual)`

Optimal Transport Aggregation for Visual Place Recognition

27 Nov 2023 · Sergio Izquierdo, Javier Civera

The task of Visual Place Recognition (VPR) aims to match a query image against references from an extensive database of images from different places, relying solely on visual cues. State-of-the-art pipelines focus on the aggregation of features extracted from a deep backbone, in order to form a global descriptor for each image. In this context, we introduce SALAD (Sinkhorn Algorithm for Locally Aggregated Descriptors), which reformulates NetVLAD's soft-assignment of local features to clusters as an optimal transport problem. In SALAD, we consider both feature-to-cluster and cluster-to-feature relations and we also introduce a 'dustbin' cluster, designed to selectively discard features deemed non-informative, enhancing the overall descriptor quality. Additionally, we leverage and fine-tune DINOv2 as a backbone, which provides enhanced description power for the local features, and dramatically reduces the required training time. As a result, our single-stage method not only surpasses single-stage baselines in public VPR datasets, but also surpasses two-stage methods that add a re-ranking with significantly higher cost. Code and models are available at https://github.com/serizba/salad.
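
The optimal transport assignment with a dustbin described in the abstract can be sketched as follows. This is a simplified NumPy illustration under stated assumptions, not the code from the official repository. The function names, the constant `dustbin_score`, the choice of marginals (mass 1 per feature, mass 1 per cluster, and the remaining mass n - m on the dustbin), and the iteration count are all hypothetical. In the paper the scores are predicted from DINOv2 local features; here they are simply an input array.

```python
import numpy as np

def logsumexp(x, axis):
    """Numerically stable log(sum(exp(x))) along one axis."""
    m = np.max(x, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(x - m), axis=axis))

def sinkhorn_with_dustbin(scores, dustbin_score=1.0, num_iters=50):
    """Soft-assign n features to m clusters with log-domain Sinkhorn iterations.

    scores: (n, m) feature-to-cluster affinities, with n > m.
    Returns an (n, m) assignment matrix; mass sent to the dustbin is dropped.
    """
    n, m = scores.shape
    assert n > m, "this sketch assumes more features than clusters"

    # Append a dustbin column that can absorb non-informative features.
    S = np.concatenate([scores, np.full((n, 1), dustbin_score)], axis=1)

    # Assumed marginals: each feature has mass 1, each cluster receives
    # mass 1, and the dustbin receives the remaining n - m.
    log_a = np.zeros(n)
    log_b = np.concatenate([np.zeros(m), [np.log(n - m)]])

    u = np.zeros(n)
    v = np.zeros(m + 1)
    for _ in range(num_iters):
        # Alternate row and column normalisation in log space.
        u = log_a - logsumexp(S + v[None, :], axis=1)
        v = log_b - logsumexp(S + u[:, None], axis=0)

    P = np.exp(S + u[:, None] + v[None, :])
    return P[:, :m]  # discard the dustbin column

def aggregate(features, scores, **kwargs):
    """Build a global descriptor as assignment-weighted sums per cluster."""
    P = sinkhorn_with_dustbin(scores, **kwargs)  # (n, m)
    V = P.T @ features                           # (m, d): one vector per cluster
    desc = V.reshape(-1)
    return desc / np.linalg.norm(desc)           # L2-normalised descriptor
```

Normalising over both rows and columns is what makes the assignment account for feature-to-cluster and cluster-to-feature relations at once, whereas a per-feature softmax normalises over clusters only.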

Code

serizba/salad official

104

Tasks

Re-Ranking

Visual Place Recognition

Datasets

Mapillary Vistas Dataset

MSLS

Nordland

Results from the Paper

Ranked #1 on Visual Place Recognition on SPED

Methods

Focus
