1 code implementation • 20 Mar 2021 • Aviad Moreshet, Yosi Keller
We propose an attention-based approach for multimodal image patch matching using a Transformer encoder attending to the feature maps of a multiscale Siamese CNN.
Ranked #1 on Multimodal Patch Matching on VisNir
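
The architecture named in the abstract can be illustrated with a minimal PyTorch sketch. It is not the authors' implementation: the class names (`MultiscaleEncoder`, `PatchDescriptor`), the layer sizes, the two-scale design, the pooling to a 4x4 token grid, the learned positional embedding, and the mean-pooled, L2-normalized descriptor are all assumptions made for illustration. What it shows is the general idea: one shared-weight (Siamese) CNN produces feature maps at several scales, and a Transformer encoder attends over the tokens from all of them.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiscaleEncoder(nn.Module):
    """Hypothetical CNN branch that returns feature maps at two scales."""

    def __init__(self, dim: int = 64):
        super().__init__()
        self.stage1 = nn.Sequential(
            nn.Conv2d(1, dim, kernel_size=3, stride=2, padding=1), nn.ReLU())
        self.stage2 = nn.Sequential(
            nn.Conv2d(dim, dim, kernel_size=3, stride=2, padding=1), nn.ReLU())

    def forward(self, x: torch.Tensor) -> list:
        f1 = self.stage1(x)    # (B, dim, H/2, W/2)
        f2 = self.stage2(f1)   # (B, dim, H/4, W/4)
        return [f1, f2]


class PatchDescriptor(nn.Module):
    """Hypothetical descriptor network: CNN feature maps -> tokens -> Transformer."""

    def __init__(self, dim: int = 64, heads: int = 4, layers: int = 2):
        super().__init__()
        self.cnn = MultiscaleEncoder(dim)
        # Assumption: pool every scale to a 4x4 grid, i.e. 16 tokens per scale.
        self.pool = nn.AdaptiveAvgPool2d(4)
        # Assumption: a learned positional embedding for the 2 * 16 = 32 tokens.
        self.pos = nn.Parameter(torch.zeros(1, 32, dim))
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dim_feedforward=2 * dim, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Turn each feature map into a token sequence of shape (B, 16, dim).
        tokens = [self.pool(f).flatten(2).transpose(1, 2) for f in self.cnn(x)]
        tokens = torch.cat(tokens, dim=1) + self.pos   # (B, 32, dim)
        attended = self.transformer(tokens)            # self-attention across scales
        # Mean-pool the tokens and L2-normalize to get one descriptor per patch.
        return F.normalize(attended.mean(dim=1), dim=1)


def patch_distance(model: nn.Module, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Siamese use: the same weights embed both modalities; smaller means more similar."""
    return (model(a) - model(b)).norm(dim=1)
```

Under these assumptions, a batch of single-channel 64x64 visible-spectrum patches and the corresponding near-infrared patches would be passed as `patch_distance(model, vis, nir)`, and the resulting distances thresholded or ranked to decide which pairs match. Training such a network would additionally need a metric-learning loss, which the abstract does not specify and which is left out here.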