no code implementations • 18 Feb 2024 • Longhuang Wu, Shangxuan Tian, Youxin Wang, Pengfei Xiong
Existing methods for scene text detection can be divided into two paradigms: segmentation-based and anchor-based.
no code implementations • 1 Aug 2023 • Bolun Cai, Pengfei Xiong, Shangxuan Tian
In this paper, we propose a novel metric learning function called Center Contrastive Loss, which maintains a class-wise center bank and compares the category centers with the query data points using a contrastive loss.
Ranked #5 on Metric Learning on CUB-200-2011
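The center-bank idea in the excerpt above can be illustrated with a minimal sketch. This is not the paper's formulation: the exponential-moving-average update, the cosine similarity, the softmax-style contrastive term, and the names `update_center`, `center_contrastive_loss`, `momentum`, and `temperature` are all assumptions made for illustration.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def update_center(bank, label, embedding, momentum=0.9):
    """Hypothetical center-bank update: exponential moving average per class.

    `bank` maps a class label to its current (unit-length) center.
    """
    e = l2_normalize(embedding)
    if label not in bank:
        bank[label] = e
    else:
        mixed = [momentum * c + (1.0 - momentum) * x
                 for c, x in zip(bank[label], e)]
        bank[label] = l2_normalize(mixed)


def center_contrastive_loss(bank, embedding, label, temperature=0.1):
    """Assumed contrastive form: softmax cross-entropy over the cosine
    similarities between the query and every class center, so the query is
    pulled toward its own center and pushed away from the others."""
    q = l2_normalize(embedding)
    logits = {k: sum(a * b for a, b in zip(q, c)) / temperature
              for k, c in bank.items()}
    m = max(logits.values())  # subtract the max for numerical stability
    log_denom = m + math.log(sum(math.exp(v - m) for v in logits.values()))
    return log_denom - logits[label]


# Usage: two classes with orthogonal centers.
bank = {}
update_center(bank, 0, [1.0, 0.0])
update_center(bank, 1, [0.0, 1.0])
loss_near = center_contrastive_loss(bank, [0.9, 0.1], 0)  # query near its center
loss_far = center_contrastive_loss(bank, [0.1, 0.9], 0)   # query near the other center
```

In this sketch a query close to its own class center yields a smaller loss than one close to another class's center, which is the behavior a center-based contrastive objective is meant to encourage.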
4 code implementations • CVPR 2023 • Peng Jin, Jinfa Huang, Pengfei Xiong, Shangxuan Tian, Chang Liu, Xiangyang Ji, Li Yuan, Jie Chen
Contrastive learning-based video-language representation learning approaches, e.g., CLIP, which pursue semantic interaction over pre-defined video-text pairs, have achieved outstanding performance.
Ranked #8 on Video Question Answering on MSRVTT-QA
no code implementations • 16 Aug 2021 • Qinghong Lin, Xiaojun Chen, Qin Zhang, Shangxuan Tian, Yudong Chen
Secondly, we measure the priorities of data pairs with PIC and assign adaptive weights to them, which relies on the assumption that more dissimilar data pairs contain more discriminative information for hash learning.
no code implementations • 25 Nov 2018 • Dinh NguyenVan, Shijian Lu, Shangxuan Tian, Nizar Ouarti, Mounir Mokhtari
Automatically reading text in scenes has attracted increasing interest in recent years, as text often carries rich semantic information that is useful for scene understanding.
no code implementations • ICCV 2017 • Shangxuan Tian, Shijian Lu, Chongshou Li
With a "light" supervised model trained on a small fully annotated dataset, we explore semi-supervised and weakly supervised learning on a large unannotated dataset and a large weakly annotated dataset, respectively.
no code implementations • ICCV 2015 • Shangxuan Tian, Yifeng Pan, Chang Huang, Shijian Lu, Kai Yu, Chew Lim Tan
With character candidates detected by cascade boosting, the min-cost flow network model integrates the last three sequential steps into a single process which solves the error accumulation problem at both character level and text line level effectively.