1 code implementation • 21 Mar 2023 • Tao Yang, Chuang Liu, Xiaofeng Ma, Weijia Lu, Ning Wu, Bingyang Li, Zhifei Yang, Peng Liu, Lin Sun, Xiaodong Zhang, Can Zhang
Besides, in our proposed neural network framework, the output of the neural network is defined as probability events, and the inference model for the classification task is deduced from a statistical analysis of these events.
1 code implementation • 21 Jul 2022 • Meng Cao, Tianyu Yang, Junwu Weng, Can Zhang, Jue Wang, Yuexian Zou
To further enhance the temporal reasoning ability of the learned feature, we propose a context projection head and a temporal aware contrastive loss to perceive the contextual relationships.
no code implementations • 26 May 2022 • Can Zhang, Gim Hee Lee
However, source domain bias that deteriorates the pseudo-labels can still exist, since a network shared by the source and target domains is typically used for pseudo-label selection.
no code implementations • 31 Mar 2022 • Liyu Wu, Can Zhang, Yuexian Zou
Inspired by the recent attention mechanism, we propose a multi-grain contextual focus module, termed MCF, to capture action-associated relation information from the body joints and parts.
1 code implementation • CVPR 2022 • Can Zhang, Tianyu Yang, Junwu Weng, Meng Cao, Jue Wang, Yuexian Zou
These pre-trained models can be sub-optimal for temporal localization tasks due to the inherent discrepancy between video-level classification and clip-level localization.
no code implementations • 30 Nov 2021 • Wei Guo, Can Zhang, ZhiCheng He, Jiarui Qin, Huifeng Guo, Bo Chen, Ruiming Tang, Xiuqiang He, Rui Zhang
With the help of two novel CNN-based multi-interest extractors, self-supervision signals are discovered with full consideration of different interest representations (point-wise and union-wise), interest dependencies (short-range and long-range), and interest correlations (inter-item and intra-item).
no code implementations • EMNLP 2021 • Meng Cao, Long Chen, Mike Zheng Shou, Can Zhang, Yuexian Zou
Almost all existing video grounding methods fall into two frameworks: 1) Top-down model: It predefines a set of segment candidates and then conducts segment classification and regression.
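The top-down recipe described above (predefine segment candidates, then classify and regress them) can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the sliding-window scheme, and the `stride_ratio` parameter are all assumptions chosen for illustration, and the sketch covers only candidate generation and the temporal IoU commonly used to label candidates, not the classification or regression heads.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start, end) in seconds


def sliding_window_candidates(
    duration: float,
    window_sizes: List[float],
    stride_ratio: float = 0.5,  # hypothetical default: 50% overlap between windows
) -> List[Segment]:
    """Enumerate predefined segment candidates over a video of given duration."""
    candidates: List[Segment] = []
    for w in window_sizes:
        if w > duration:
            continue  # a window longer than the video cannot fit
        stride = w * stride_ratio
        start = 0.0
        # Small tolerance so a window ending exactly at `duration` is kept.
        while start + w <= duration + 1e-9:
            candidates.append((start, start + w))
            start += stride
    return candidates


def temporal_iou(a: Segment, b: Segment) -> float:
    """Intersection-over-union of two 1-D temporal segments."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0
```

In a top-down pipeline of this kind, each candidate would then be scored against the query, with the IoU against the annotated segment serving as a typical training target.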
no code implementations • 12 Aug 2021 • Meng Cao, Can Zhang, Long Chen, Mike Zheng Shou, Yuexian Zou
In this paper, we show that the motion cues behind the optical flow features carry complementary information.
Optical Flow Estimation, Weakly-supervised Temporal Action Localization
no code implementations • 30 Jun 2021 • Liyu Wu, Yuexian Zou, Can Zhang
Efficient long-short temporal modeling is key to enhancing the performance of the action recognition task.
no code implementations • 29 Jun 2021 • Ranyu Ning, Can Zhang, Yuexian Zou
Current mainstream one-stage TAD approaches localize and classify action proposals relying on pre-defined anchors, where the location and scale for action instances are set by designers.
no code implementations • 24 Jun 2021 • Meng Cao, Can Zhang, Dongming Yang, Yuexian Zou
Compared to the traditional single-stage segmentation network, our NASK conducts the detection in a coarse-to-fine manner, with the first-stage segmentation spotting the rectangular text proposals and the second one retrieving compact representations.
no code implementations • 30 Apr 2021 • Dongming Yang, Yuexian Zou, Can Zhang, Meng Cao, Jie Chen
Upon the frame, an Interaction Intensifier Module and a Correlation Parsing Module are carefully designed, where: a) interactive semantics from humans can be exploited and passed to objects to intensify interactions, b) interactive correlations among humans, objects and interactions are integrated to promote predictions.
1 code implementation • CVPR 2021 • Can Zhang, Meng Cao, Dongming Yang, Jie Chen, Yuexian Zou
In this paper, we argue that learning by comparing helps identify these hard snippets and we propose to utilize snippet Contrastive learning to Localize Actions, CoLA for short.
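As a rough illustration of "learning by comparing", the sketch below implements a generic InfoNCE-style contrastive loss for one query snippet embedding, one positive, and several negatives. It is an assumption-laden example rather than the paper's formulation: the function names, the use of cosine similarity, and the `temperature` default are hypothetical, and how hard snippets and their positives or negatives are mined is not shown.

```python
import math
from typing import List, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce_loss(
    query: Sequence[float],
    positive: Sequence[float],
    negatives: List[Sequence[float]],
    temperature: float = 0.07,  # hypothetical default, not from the source
) -> float:
    """Generic InfoNCE loss: -log softmax probability of the positive."""
    pos_logit = cosine_similarity(query, positive) / temperature
    logits = [pos_logit] + [
        cosine_similarity(query, n) / temperature for n in negatives
    ]
    # Numerically stable log-sum-exp over all logits.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - pos_logit
```

The loss is small when the query is closer to its positive than to every negative, and grows when a negative is more similar, which is the comparison signal a contrastive objective relies on.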
no code implementations • 12 Dec 2020 • Can Zhang, Hong Liu, Wei Guo, Mang Ye
RGB-Infrared person re-identification (RGB-IR Re-ID) aims to match persons across heterogeneous images captured by visible and thermal cameras, which is of great significance for surveillance systems under poor lighting conditions.
2 code implementations • 8 Aug 2020 • Can Zhang, Yuexian Zou, Guang Chen, Lei Gan
In contrast to optical flow, our PA focuses more on distilling the motion information at boundaries.
Ranked #2 on Action Recognition on Jester (Gesture Recognition)
1 code implementation • 27 Nov 2019 • Bang Yang, Yuexian Zou, Fenglin Liu, Can Zhang
However, mainstream video captioning methods suffer from slow inference speed due to the sequential manner of autoregressive decoding, and tend to generate generic descriptions due to the insufficient training of visual words (e.g., nouns and verbs) and an inadequate decoding paradigm.
1 code implementation • 9 May 2019 • Mohammadreza Zolfaghari, Özgün Çiçek, Syed Mohsin Ali, Farzaneh Mahdisoltani, Can Zhang, Thomas Brox
Foreseeing the future is one of the key factors of intelligence.