1 code implementation • 17 Apr 2024 • Sherry X. Chen, Yaron Vaxman, Elad Ben Baruch, David Asulin, Aviad Moreshet, Kuo-Chin Lien, Misha Sra, Pradeep Sen
Previous approaches have focused on either fine-tuning pre-trained T2I models on specific datasets to generate certain kinds of images (e.g., with a specific object or person), or on optimizing the weights, text prompts, and/or learning features for each input image in an attempt to coax the image generator to produce the desired result.
1 code implementation • 20 Mar 2021 • Aviad Moreshet, Yosi Keller
We propose an attention-based approach for multimodal image patch matching using a Transformer encoder attending to the feature maps of a multiscale Siamese CNN.
Ranked #1 on Multimodal Patch Matching on VisNir