no code implementations • 27 Aug 2023 • Xiujun Shu, Wei Wen, Liangsheng Xu, Ruizhi Qiao, Taian Guo, Hanjun Li, Bei Gan, Xiao Wang, Xing Sun
In this paper, we present a unified and dynamic graph (UniDG) framework for temporal character grouping.
1 code implementation • ICCV 2023 • Hanjun Li, Xiujun Shu, Sunan He, Ruizhi Qiao, Wei Wen, Taian Guo, Bei Gan, Xing Sun
Under this setup, we propose a Dynamic Gaussian prior based Grounding framework with Glance annotation (D3G), which consists of a Semantic Alignment Group Contrastive Learning module (SA-GCL) and a Dynamic Gaussian prior Adjustment module (DGA).
Ranked #10 on Temporal Sentence Grounding on Charades-STA
1 code implementation • CVPR 2023 • Bei Gan, Xiujun Shu, Ruizhi Qiao, Haoqian Wu, Keyu Chen, Hanjun Li, Bo Ren
Based on existing efforts, this work makes two observations: (1) Highlight labeling is uncertain and varies across annotators, which leads to inaccurate and time-consuming annotations.
no code implementations • CVPR 2023 • Haoqian Wu, Keyu Chen, Haozhe Liu, Mingchen Zhuge, Bing Li, Ruizhi Qiao, Xiujun Shu, Bei Gan, Liangsheng Xu, Bo Ren, Mengmeng Xu, Wentian Zhang, Raghavendra Ramachandra, Chia-Wen Lin, Bernard Ghanem
Temporal video segmentation is the go-to first step of automatic video analysis: it decomposes a long-form video into smaller components for follow-up understanding tasks.
no code implementations • 19 Aug 2022 • Sunan He, Taian Guo, Tao Dai, Ruizhi Qiao, Chen Wu, Xiujun Shu, Bo Ren
Image and language modeling is of crucial importance for vision-language pre-training (VLP), which aims to learn multi-modal representations from large-scale paired image-text data.
1 code implementation • 18 Aug 2022 • Xiujun Shu, Wei Wen, Haoqian Wu, Keyu Chen, Yiran Song, Ruizhi Qiao, Bo Ren, Xiao Wang
To explore the fine-grained alignment, we further propose two implicit semantic alignment paradigms: multi-level alignment (MLA) and bidirectional mask modeling (BMM).
no code implementations • 12 Aug 2022 • Xiujun Shu, Wei Wen, Taian Guo, Sunan He, Chen Wu, Ruizhi Qiao
This technical report presents the 3rd winning solution for MTVG, a new task introduced in the 4th Person in Context (PIC) Challenge at ACM MM 2022.
no code implementations • 27 Nov 2021 • Xiujun Shu, Yusheng Tao, Ruizhi Qiao, Bo Ke, Wei Wen, Bo Ren
It is by far the largest dataset for person search in media.
1 code implementation • 10 Nov 2021 • Xianghao Zang, Ge Li, Wei Gao, Xiujun Shu
In this way, the complex scenes in the ReID task are effectively disentangled, and the burden of each branch is relieved.
Ranked #2 on Person Re-Identification on P-DukeMTMC-reID
1 code implementation • 9 Nov 2021 • Xianghao Zang, Ge Li, Wei Gao, Xiujun Shu
A local-aware module is employed to explore the potential of local-level features for unsupervised learning.
Ranked #1 on Unsupervised Person Re-Identification on PRID2011
Unsupervised Person Re-Identification • Video-Based Person Re-Identification
no code implementations • 5 Oct 2021 • Xin Zhang, Xiujun Shu, Bingwen Zhang, Jie Ren, Lizhou Zhou, Xin Chen
Deterministic models, such as ray tracing based on the physical laws of wave propagation, are more accurate and site-specific.
1 code implementation • 24 Jul 2021 • Xiujun Shu, Ge Li, Xiao Wang, Weijian Ruan, Qi Tian
The key to this task is to exploit cloth-irrelevant cues.
2 code implementations • 22 Jul 2021 • Xiao Wang, Xiujun Shu, Shiliang Zhang, Bo Jiang, YaoWei Wang, Yonghong Tian, Feng Wu
The visible and thermal filters are used to perform a dynamic convolution on their respective input feature maps.
Ranked #36 on RGB-T Tracking on RGBT234
2 code implementations • 31 May 2021 • Xiujun Shu, Xiao Wang, Xianghao Zang, Shiliang Zhang, Yuanqi Chen, Ge Li, Qi Tian
We also verified that models pre-trained on LaST generalize well to existing datasets with short-term and cloth-changing scenarios.
2 code implementations • CVPR 2021 • Xiao Wang, Xiujun Shu, Zhipeng Zhang, Bo Jiang, YaoWei Wang, Yonghong Tian, Feng Wu
We believe this benchmark will greatly boost related research on natural language guided tracking.
Ranked #6 on Visual Object Tracking on TNL2K (precision metric)