SVDNet for Pedestrian Retrieval

This paper proposes SVDNet for retrieval problems, with a focus on person re-identification (re-ID). We view each weight vector within a fully connected (FC) layer of a convolutional neural network (CNN) as a projection basis. We observe that these weight vectors are usually highly correlated. This correlation carries over to the entries of the FC descriptor and compromises retrieval performance based on the Euclidean distance. To address the problem, this paper proposes to optimize the deep representation learning process with Singular Value Decomposition (SVD). Specifically, with the restraint and relaxation iteration (RRI) training scheme, we iteratively integrate the orthogonality constraint into CNN training, yielding the so-called SVDNet. We conduct experiments on the Market-1501, CUHK03, and DukeMTMC-reID datasets, and show that RRI effectively reduces the correlation among the projection vectors, produces more discriminative FC descriptors, and significantly improves re-ID accuracy. On the Market-1501 dataset, for instance, rank-1 accuracy improves from 55.3% to 80.5% for CaffeNet, and from 73.8% to 82.3% for ResNet-50.
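The abstract does not spell out the decomposition step itself. The following is a minimal NumPy sketch, assuming the FC weight matrix W (one projection vector per column) is factored as W = U S V^T and then replaced by U S. Under that assumption the new columns are mutually orthogonal, and because V is orthogonal the pairwise Euclidean distances between descriptors are unchanged. The function names, the toy dimensions, and the cosine-based correlation measure are illustrative choices, not taken from the paper's code, and the RRI training loop is not reproduced here.

```python
import numpy as np


def mean_abs_cosine(W):
    """Mean absolute cosine similarity between distinct columns of W."""
    cols = W / np.linalg.norm(W, axis=0, keepdims=True)
    G = cols.T @ cols                      # Gram matrix of unit-length columns
    k = G.shape[0]
    off_diag = G[~np.eye(k, dtype=bool)]   # drop the diagonal (self-similarity)
    return float(np.abs(off_diag).mean())


def decorrelate(W):
    """Assumed decorrelation step: replace W = U S V^T with U S."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U * s                           # scales column j of U by s[j]


def pairwise_dist(F):
    """Euclidean distance between every pair of rows of F."""
    diff = F[:, None, :] - F[None, :, :]
    return np.linalg.norm(diff, axis=-1)


rng = np.random.default_rng(0)
d, k = 64, 16                              # toy input dim and descriptor dim
shared = rng.normal(size=(d, 1))           # common component -> correlated columns
W = shared + 0.3 * rng.normal(size=(d, k))
W_dec = decorrelate(W)

X = rng.normal(size=(5, d))                # five toy input features
F_before = X @ W                           # descriptors with correlated weights
F_after = X @ W_dec                        # descriptors with orthogonal weights

print("correlation before:", mean_abs_cosine(W))
print("correlation after: ", mean_abs_cosine(W_dec))
print("distances preserved:",
      np.allclose(pairwise_dist(F_before), pairwise_dist(F_after)))
```

Since this single step leaves the distances unchanged, it cannot by itself alter retrieval results; in this sketch any accuracy gain would have to come from continuing training with the orthogonal weights in place, which is the role the abstract assigns to RRI.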

ICCV 2017
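| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Person Re-Identification | CUHK03 detected | SVDNet-CaffeNet | mAP | 24.9 | # 16 |
| Person Re-Identification | CUHK03 detected | SVDNet-CaffeNet | Rank-1 | 27.7 | # 16 |
| Person Re-Identification | CUHK03 detected | SVDNet-ResNet50 | mAP | 37.3 | # 14 |
| Person Re-Identification | CUHK03 detected | SVDNet-ResNet50 | Rank-1 | 41.5 | # 14 |
| Person Re-Identification | DukeMTMC-reID | SVDNet | Rank-1 | 76.7 | # 67 |
| Person Re-Identification | DukeMTMC-reID | SVDNet | mAP | 56.8 | # 72 |
| Person Re-Identification | Market-1501 | SVDNet | Rank-1 | 82.3 | # 97 |
| Person Re-Identification | Market-1501 | SVDNet | mAP | 62.1 | # 107 |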
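
For readers unfamiliar with the two metrics reported above, here is a simplified sketch of how Rank-1 accuracy and mAP can be computed from a query-to-gallery distance matrix. It assumes plain non-interpolated average precision and omits the camera-based gallery filtering that standard re-ID protocols apply, so it will not reproduce the benchmark numbers exactly. The function name and the toy data are hypothetical.

```python
import numpy as np


def rank1_and_map(dist, query_ids, gallery_ids):
    """dist[i, j] is the distance between query i and gallery item j."""
    gallery_ids = np.asarray(gallery_ids)
    rank1_hits, aps = [], []
    for i, qid in enumerate(query_ids):
        order = np.argsort(dist[i])                  # nearest gallery items first
        matches = gallery_ids[order] == qid          # True where identity matches
        if not matches.any():
            continue                                 # skip queries with no true match
        rank1_hits.append(bool(matches[0]))          # is the top result correct?
        hit_ranks = np.flatnonzero(matches) + 1      # 1-based ranks of true matches
        precisions = np.arange(1, len(hit_ranks) + 1) / hit_ranks
        aps.append(precisions.mean())                # average precision for this query
    return float(np.mean(rank1_hits)), float(np.mean(aps))


# Toy data: two queries of identity 1, three gallery items.
dist = np.array([[0.1, 0.5, 0.3],
                 [0.4, 0.2, 0.9]])
rank1, mean_ap = rank1_and_map(dist, query_ids=[1, 1], gallery_ids=[1, 2, 1])
print(f"Rank-1 = {rank1:.3f}, mAP = {mean_ap:.3f}")
```

In the toy data the first query ranks both true matches at the top, while the second ranks a wrong identity first, so Rank-1 counts only one of the two queries as a hit and mAP penalizes the second query less severely than Rank-1 does.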

Methods