no code implementations • 12 Jan 2024 • Bowen Shi, Peisen Zhao, Zichen Wang, Yuhang Zhang, Yaoming Wang, Jin Li, Wenrui Dai, Junni Zou, Hongkai Xiong, Qi Tian, Xiaopeng Zhang
Vision-language foundation models, represented by Contrastive Language-Image Pre-training (CLIP), have gained increasing attention for jointly understanding visual and textual tasks.
1 code implementation • ICCV 2023 • Shuangrui Ding, Peisen Zhao, Xiaopeng Zhang, Rui Qian, Hongkai Xiong, Qi Tian
Based on the STA score, we are able to progressively prune the tokens without introducing any additional parameters or requiring further re-training.
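A minimal sketch of parameter-free, score-based progressive pruning, assuming a per-token importance score is already available. How the STA score itself is computed is not described here, so `scores` is a placeholder input, and the names `prune_tokens`, `progressive_prune`, and `keep_ratios` are hypothetical. For simplicity the sketch reuses the same scores at every stage; an actual model might recompute them between stages.

```python
def prune_tokens(tokens, scores, keep_ratio):
    """Keep the highest-scoring tokens, preserving their original order."""
    if len(tokens) != len(scores):
        raise ValueError("tokens and scores must have the same length")
    n_keep = max(1, int(len(tokens) * keep_ratio))
    # Rank indices by score, highest first, then restore sequence order.
    ranked = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)
    kept = sorted(ranked[:n_keep])
    return [tokens[i] for i in kept], [scores[i] for i in kept]


def progressive_prune(tokens, scores, keep_ratios):
    """Apply one pruning step per stage; no learned parameters are involved."""
    for ratio in keep_ratios:
        tokens, scores = prune_tokens(tokens, scores, ratio)
    return tokens


# Placeholder tokens and scores (assumed values, for illustration only).
tokens = ["a", "b", "c", "d", "e", "f", "g", "h"]
scores = [0.1, 0.9, 0.3, 0.8, 0.2, 0.7, 0.4, 0.6]
remaining = progressive_prune(tokens, scores, keep_ratios=[0.5, 0.5])
```

Because selection is a pure ranking over existing scores, a step like this adds no weights and needs no re-training, which is the property the sentence above describes.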
no code implementations • 21 Mar 2023 • Chen Ju, Zeqian Li, Peisen Zhao, Ya Zhang, Xiaopeng Zhang, Qi Tian, Yanfeng Wang, Weidi Xie
In this paper, we consider the problem of temporal action localization under the low-shot (zero-shot and few-shot) scenario, with the goal of detecting and classifying action instances from arbitrary categories within untrimmed videos, including categories not seen at training time.
no code implementations • 20 Feb 2023 • Chen Ju, Haicheng Wang, Jinxiang Liu, Chaofan Ma, Ya Zhang, Peisen Zhao, Jianlong Chang, Qi Tian
Temporal sentence grounding aims to detect the event timestamps described by the natural language query from given untrimmed videos.
no code implementations • CVPR 2023 • Chen Ju, Kunhao Zheng, Jinxiang Liu, Peisen Zhao, Ya Zhang, Jianlong Chang, Yanfeng Wang, Qi Tian
As a result, the complementarity of the two branches is effectively fused to promote a strong alliance.
Weakly-supervised Temporal Action Localization • Weakly Supervised Temporal Action Localization
no code implementations • 6 Apr 2021 • Chen Ju, Peisen Zhao, Siheng Chen, Ya Zhang, Xiaoyun Zhang, Qi Tian
To solve this issue, we introduce an adaptive mutual supervision framework (AMS) with two branches, where the base branch adopts CAS to localize the most discriminative action regions, while the supplementary branch localizes the less discriminative action regions through a novel adaptive sampler.
Ranked #7 on Weakly Supervised Action Localization on THUMOS14
Weakly Supervised Action Localization • Weakly-supervised Temporal Action Localization
no code implementations • ICCV 2021 • Chen Ju, Peisen Zhao, Siheng Chen, Ya Zhang, Yanfeng Wang, Qi Tian
Single-frame temporal action localization (STAL) aims to localize actions in untrimmed videos with only one timestamp annotation for each action instance.
no code implementations • 15 Dec 2020 • Chen Ju, Peisen Zhao, Ya Zhang, Yanfeng Wang, Qi Tian
Point-level temporal action localization (PTAL) aims to localize actions in untrimmed videos with only one timestamp annotation for each action instance.
Ranked #3 on Weakly Supervised Action Localization on BEOID
no code implementations • 9 Dec 2020 • Chaoqin Huang, Fei Ye, Peisen Zhao, Ya Zhang, Yan-Feng Wang, Qi Tian
This paper explores semi-supervised anomaly detection, a more practical setting for anomaly detection in which a small additional set of labeled samples is provided.
Ranked #25 on Anomaly Detection on One-class CIFAR-10 (using extra training data)
no code implementations • 18 Nov 2020 • Peisen Zhao, Lingxi Xie, Ya Zhang, Yanfeng Wang, Qi Tian
Knowledge distillation is employed to transfer the privileged information from the offline teacher to the online student.
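A minimal sketch of a teacher-to-student distillation loss, assuming the standard temperature-softened KL-divergence formulation. The exact loss used in the paper is not stated here, so this is only an illustration of the general technique; the function names and the default temperature are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)  # teacher targets (held fixed)
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Assumed example logits: the student matches the teacher in the first call only.
loss_same = distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
loss_diff = distillation_loss([3.0, 2.0, 1.0], [1.0, 2.0, 3.0])
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, so minimizing it pulls the online student toward the offline teacher's predictions.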
Ranked #11 on Online Action Detection on TVSeries
no code implementations • 13 Jul 2020 • Peisen Zhao, Lingxi Xie, Ya Zhang, Qi Tian
The U2S framework is composed of three subnetworks: a universal network, a category-specific network, and a mask network.
1 code implementation • ECCV 2020 • Peisen Zhao, Lingxi Xie, Chen Ju, Ya Zhang, Yan-Feng Wang, Qi Tian
To alleviate this problem, we introduce two regularization terms to mutually regularize the learning procedure: the Intra-phase Consistency (IntraC) regularization is proposed to make the predictions verified inside each phase; and the Inter-phase Consistency (InterC) regularization is proposed to keep consistency between these phases.