1 code implementation • 29 Jan 2024 • Qingpei Guo, Furong Xu, Hanxiao Zhang, Wang Ren, Ziping Ma, Lin Ju, Jian Wang, Jingdong Chen, Ming Yang
Vision-language foundation models like CLIP have revolutionized the field of artificial intelligence.
Ranked #1 on Zero-shot Image Retrieval on Flickr30k-CN (using extra training data)
Zero-Shot Cross-Modal Retrieval, Zero-shot Image Retrieval, +3
no code implementations • 4 Jan 2024 • Ziping Ma, Furong Xu, Jian Liu, Ming Yang, Qingpei Guo
To achieve multimodal alignment from both global and local perspectives, this paper proposes Symmetrizing Contrastive Captioners (SyCoCa), which introduces bidirectional interactions on images and texts across the global and local representation levels.
no code implementations • 7 Dec 2023 • Xuelin Zhu, Jiuxin Cao, Jian Liu, Dongqi Tang, Furong Xu, Weijia Liu, Jiawei Ge, Bo Liu, Qingpei Guo, Tianyi Zhang
Pre-trained vision-language models have notably accelerated progress in open-world concept recognition.
no code implementations • 20 Sep 2023 • Chen Jiang, Kaiming Huang, Sifeng He, Xudong Yang, Wei Zhang, Xiaobo Zhang, Yuan Cheng, Lei Yang, Qing Wang, Furong Xu, Tan Pan, Wei Chu
SSAN is based on two newly proposed modules in video retrieval: (1) An efficient Self-supervised Keyframe Extraction (SKE) module to reduce redundant frame features, (2) A robust Similarity Pattern Detection (SPD) module for temporal alignment.
1 code implementation • CVPR 2023 • Tan Pan, Furong Xu, Xudong Yang, Sifeng He, Chen Jiang, Qingpei Guo, Feng Qian, Xiaobo Zhang, Yuan Cheng, Lei Yang, Wei Chu
For traditional model upgrades, the old model will not be replaced by the new one until the embeddings of all the images in the database are re-computed by the new model, which takes days or weeks for a large amount of data.
1 code implementation • 28 Feb 2023 • Wen Li, Cheng Zou, Meng Wang, Furong Xu, Jianan Zhao, Ruobing Zheng, Yuan Cheng, Wei Chu
In this paper, we propose a Diverse and Compact Transformer (DC-Former) that can achieve a similar effect by splitting the embedding space into multiple diverse and compact subspaces.
no code implementations • 4 Jul 2022 • Cheng Zou, Furong Xu, Meng Wang, Wen Li, Yuan Cheng
Automatic snake species recognition is important because it has vast potential to help lower deaths and disabilities caused by snakebites.
1 code implementation • CVPR 2022 • Sifeng He, Xudong Yang, Chen Jiang, Gang Liang, Wei Zhang, Tan Pan, Qing Wang, Furong Xu, Chunguang Li, Jingxiong Liu, Hui Xu, Kaiming Huang, Yuan Cheng, Feng Qian, Xiaobo Zhang, Lei Yang
In this paper, we introduce VCSL (Video Copy Segment Localization), a new comprehensive segment-level annotated video copy dataset.
no code implementations • 9 Dec 2021 • Wen Li, Furong Xu, Jianan Zhao, Ruobing Zheng, Cheng Zou, Meng Wang, Yuan Cheng
Triplet loss is a widely adopted loss function in ReID tasks; it pulls the hardest positive pairs close and pushes the hardest negative pairs far apart.
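As a rough illustration of the hardest-pair idea described above, the following is a minimal sketch of the generic batch-hard triplet loss, not the method proposed in this paper. The function names, the Euclidean distance, and the margin value of 0.3 are all assumptions chosen for the example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def batch_hard_triplet_loss(embeddings, labels, margin=0.3):
    """Hypothetical helper: mean over anchors of
    max(0, margin + d(anchor, hardest positive) - d(anchor, hardest negative)).

    The hardest positive is the farthest same-label sample; the hardest
    negative is the closest different-label sample. Anchors with no positive
    or no negative in the batch are skipped.
    """
    losses = []
    for i, (anchor, label) in enumerate(zip(embeddings, labels)):
        pos = [
            euclidean(anchor, other)
            for j, (other, other_label) in enumerate(zip(embeddings, labels))
            if j != i and other_label == label
        ]
        neg = [
            euclidean(anchor, other)
            for other, other_label in zip(embeddings, labels)
            if other_label != label
        ]
        if not pos or not neg:
            continue
        losses.append(max(0.0, margin + max(pos) - min(neg)))
    return sum(losses) / len(losses) if losses else 0.0


# Toy batch: two identities, two samples each (values are made up).
toy_embeddings = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 0.0]]
toy_labels = [0, 0, 1, 1]
toy_loss = batch_hard_triplet_loss(toy_embeddings, toy_labels)
```

In the toy batch, the two samples of identity 0 already satisfy the margin and contribute zero loss, while the two samples of identity 1 sit farther from each other than from identity 0 and are penalized.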
no code implementations • CVPR 2021 • Furong Xu, Meng Wang, Wei Zhang, Yuan Cheng, Wei Chu
Therefore, there is a need for a training mechanism that enforces the discriminativeness of all the elements in the feature to capture more of the subtle visual cues.