no code implementations • ICCV 2023 • Chull Hwan Song, Taebaek Hwang, Jooyoung Yoon, Shunghyun Choi, Yeong Hyeon Gu
Many studies in vision tasks have aimed to create effective embedding spaces for single-label object prediction within an image.
1 code implementation • 21 Oct 2022 • Chull Hwan Song, Jooyoung Yoon, Shunghyun Choi, Yannis Avrithis
(4) We enhance the locality of interactions at the deeper layers of the encoder; locality is a relative weakness of vision transformers.
no code implementations • 16 Jul 2021 • Chull Hwan Song, Hye Joo Han, Yannis Avrithis
Apart from the backbone, training pipelines, and loss functions, popular approaches have focused on different spatial pooling and attention mechanisms, which are at the core of learning a powerful global image representation.