Learning to Compose Hypercolumns for Visual Correspondence

ECCV 2020 · Juhong Min, Jongmin Lee, Jean Ponce, Minsu Cho

Feature representation plays a crucial role in visual correspondence, and recent methods for image matching resort to deeply stacked convolutional layers. These models, however, are both monolithic and static in the sense that they typically use a specific level of features, e.g., the output of the last layer, and adhere to it regardless of the images to match. In this work, we introduce a novel approach to visual correspondence that dynamically composes effective features by leveraging relevant layers conditioned on the images to match. Inspired by both multi-layer feature composition in object detection and adaptive inference architectures in classification, the proposed method, dubbed Dynamic Hyperpixel Flow, learns to compose hypercolumn features on the fly by selecting a small number of relevant layers from a deep convolutional neural network. We demonstrate the effectiveness on the task of semantic correspondence, i.e., establishing correspondences between images depicting different instances of the same object or scene category. Experiments on standard benchmarks show that the proposed method greatly improves matching performance over the state of the art in an adaptive and efficient manner.
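The layer-selection idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names, the per-layer `gate_scores`, and the rule "keep a layer when its score is positive" are all assumptions made for illustration. In the paper the selection is learned and conditioned on the image pair, which is not modelled here. The sketch only shows how a hypercolumn is formed once a subset of layers has been chosen: each selected feature map is resized to a common resolution and the per-pixel feature vectors are concatenated.

```python
# Illustrative sketch only; names and the gating rule are hypothetical.

def upsample_nearest(fmap, out_h, out_w):
    """Resize an H x W grid of feature vectors with nearest-neighbour sampling."""
    in_h, in_w = len(fmap), len(fmap[0])
    return [
        [fmap[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def compose_hypercolumn(layer_maps, gate_scores, out_h, out_w):
    """Concatenate, per pixel, the features of every layer whose gate is on.

    layer_maps  : list of H_l x W_l grids; each cell is a list of C_l floats
    gate_scores : one assumed relevance score per layer; a layer is kept
                  when its score is positive (a hard on/off decision)
    """
    selected = [i for i, score in enumerate(gate_scores) if score > 0]
    resized = [upsample_nearest(layer_maps[i], out_h, out_w) for i in selected]
    hypercolumn = [
        [sum((r[y][x] for r in resized), []) for x in range(out_w)]
        for y in range(out_h)
    ]
    return selected, hypercolumn


# Three toy "layers" with decreasing resolution and increasing channel count.
layer_maps = [
    [[[float(y * 4 + x)] for x in range(4)] for y in range(4)],   # 4x4, 1 channel
    [[[10.0 * y, 10.0 * x] for x in range(2)] for y in range(2)],  # 2x2, 2 channels
    [[[7.0, 8.0, 9.0]]],                                           # 1x1, 3 channels
]
gate_scores = [0.8, -0.3, 1.2]  # made-up scores: the middle layer is skipped

selected, hypercolumn = compose_hypercolumn(layer_maps, gate_scores, 4, 4)
print(selected)           # indices of the layers that were kept
print(hypercolumn[3][3])  # concatenated feature vector at the bottom-right pixel
```

Skipping a layer shortens every per-pixel vector, which is where the efficiency of selecting only a few layers would come from in a setup like this one.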

Task                     Dataset      Model  Metric Name    Metric Value  Global Rank
Semantic correspondence  Caltech-101  DHPF   LT-ACC         87            #1
Semantic correspondence  Caltech-101  DHPF   IoU            62            #2
Semantic correspondence  Caltech-101  DHPF   LT-ACC (weak)  86            #1
Semantic correspondence  Caltech-101  DHPF   IoU (weak)     61            #1
Semantic correspondence  PF-PASCAL    DHPF   PCK            90.7          #9
Semantic correspondence  PF-PASCAL    DHPF   PCK (weak)     82.1          #1
Semantic correspondence  PF-WILLOW    DHPF   PCK            77.6          #7
Semantic correspondence  PF-WILLOW    DHPF   PCK (weak)     80.2          #1
Semantic correspondence  SPair-71k    DHPF   PCK            37.3          #15
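
For readers unfamiliar with the PCK metric reported above, the following is a minimal sketch of how a percentage of correct keypoints is commonly computed. It rests on stated assumptions: a predicted keypoint counts as correct when it lies within `alpha * max(height, width)` of its ground-truth position. Whether the reference size is the image or the object bounding box, and which `alpha` is used, differs between benchmarks and is not stated in this listing, so the values below are placeholders and the function name is hypothetical.

```python
import math


def pck(pred, gt, height, width, alpha=0.1):
    """Percentage of predicted keypoints within alpha * max(height, width)
    of their ground-truth location (reference size and alpha are assumptions)."""
    threshold = alpha * max(height, width)
    correct = sum(
        1
        for (px, py), (gx, gy) in zip(pred, gt)
        if math.hypot(px - gx, py - gy) <= threshold
    )
    return 100.0 * correct / len(gt)


# Toy example with made-up coordinates on a 100 x 100 reference box.
pred = [(10, 10), (50, 50), (90, 20)]
gt = [(12, 11), (70, 50), (90, 25)]
score = pck(pred, gt, height=100, width=100, alpha=0.1)
print(round(score, 2))  # two of the three keypoints fall within the threshold
```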
