Search Results for author: Can Zhang

Found 24 papers, 10 papers with code

MLP-AMDC: An MLP Architecture for Adaptive-Mask-based Dual-Camera snapshot hyperspectral imaging

1 code implementation 12 Oct 2023 Zeyu Cai, Can Zhang, Xunhao Chen, Shanghuan Liu, Chengqian Jin, Feipeng Da

To improve the inference speed of the reconstruction network, this paper proposes an MLP architecture for Adaptive-Mask-based Dual-Camera (MLP-AMDC) to replace the transformer structure of the network.

GeT: Generative Target Structure Debiasing for Domain Adaptation

no code implementations ICCV 2023 Can Zhang, Gim Hee Lee

Despite the competitive performance, these pseudo labeling methods rely heavily on the source domain to generate pseudo labels for the target domain and therefore still suffer considerably from source data bias.

Domain Adaptation, Pseudo Label

Improving Scene Graph Generation with Superpixel-Based Interaction Learning

no code implementations 4 Aug 2023 Jingyi Wang, Can Zhang, Jinfa Huang, Botao Ren, Zhidong Deng

(ii) We explore intra-entity and cross-entity interactions among the superpixels to enrich fine-grained interactions between entities at an earlier stage.

Graph Generation, Scene Graph Generation, +1

3D-IDS: Doubly Disentangled Dynamic Intrusion Detection

no code implementations 2 Jul 2023 Chenyang Qiu, Yingsheng Geng, Junrui Lu, Kaida Chen, Shitong Zhu, Ya Su, Guoshun Nan, Can Zhang, Junsong Fu, Qimei Cui, Xiaofeng Tao

This motivates us to propose 3D-IDS, a novel method that aims to tackle the above issues through two-step feature disentanglements and a dynamic graph diffusion scheme.

Intrusion Detection

Cross-Modality Time-Variant Relation Learning for Generating Dynamic Scene Graphs

1 code implementation 15 May 2023 Jingyi Wang, Jinfa Huang, Can Zhang, Zhidong Deng

In this paper, we propose a Time-variant Relation-aware TRansformer (TR$^2$), which aims to model the temporal change of relations in dynamic scene graphs.

Relation, Scene Graph Generation, +1

Indeterminate Probability Neural Network

1 code implementation 21 Mar 2023 Tao Yang, Chuang Liu, Xiaofeng Ma, Weijia Lu, Ning Wu, Bingyang Li, Zhifei Yang, Peng Liu, Lin Sun, Xiaodong Zhang, Can Zhang

In addition, in our proposed neural network framework, the output of the neural network is defined as probability events, and the inference model for the classification task is derived from a statistical analysis of these events.

Classification

Iterative Proposal Refinement for Weakly-Supervised Video Grounding

no code implementations CVPR 2023 Meng Cao, Fangyun Wei, Can Xu, Xiubo Geng, Long Chen, Can Zhang, Yuexian Zou, Tao Shen, Daxin Jiang

Weakly-Supervised Video Grounding (WSVG) aims to localize events of interest in untrimmed videos with only video-level annotations.

Sentence, Video Grounding

Unsupervised Feature Representation Learning for Domain-generalized Cross-domain Image Retrieval

1 code implementation ICCV 2023 Conghui Hu, Can Zhang, Gim Hee Lee

This limitation motivates us to present the first attempt at domain-generalized unsupervised cross-domain image retrieval (DG-UCDIR) aiming at facilitating image retrieval between any two unseen domains in an unsupervised way.

Contrastive Learning, Image Retrieval, +2

LocVTP: Video-Text Pre-training for Temporal Localization

1 code implementation 21 Jul 2022 Meng Cao, Tianyu Yang, Junwu Weng, Can Zhang, Jue Wang, Yuexian Zou

To further enhance the temporal reasoning ability of the learned features, we propose a context projection head and a temporal-aware contrastive loss to perceive the contextual relationships.

Retrieval, Temporal Localization, +1

CA-UDA: Class-Aware Unsupervised Domain Adaptation with Optimal Assignment and Pseudo-Label Refinement

no code implementations 26 May 2022 Can Zhang, Gim Hee Lee

However, source domain bias that deteriorates the pseudo-labels can still exist, since a network shared by the source and target domains is typically used for pseudo-label selection.

Image Classification, Missing Labels, +2

SpatioTemporal Focus for Skeleton-based Action Recognition

no code implementations 31 Mar 2022 Liyu Wu, Can Zhang, Yuexian Zou

Inspired by recent attention mechanisms, we propose a multi-grain contextual focus module, termed MCF, to capture action-associated relation information from the body joints and parts.

Action Recognition, Skeleton Based Action Recognition

Unsupervised Pre-training for Temporal Action Localization Tasks

1 code implementation CVPR 2022 Can Zhang, Tianyu Yang, Junwu Weng, Meng Cao, Jue Wang, Yuexian Zou

These pre-trained models can be sub-optimal for temporal localization tasks due to the inherent discrepancy between video-level classification and clip-level localization.

Contrastive Learning, Representation Learning, +4

MISS: Multi-Interest Self-Supervised Learning Framework for Click-Through Rate Prediction

no code implementations 30 Nov 2021 Wei Guo, Can Zhang, ZhiCheng He, Jiarui Qin, Huifeng Guo, Bo Chen, Ruiming Tang, Xiuqiang He, Rui Zhang

With the help of two novel CNN-based multi-interest extractors, self-supervision signals are discovered with full consideration of different interest representations (point-wise and union-wise), interest dependencies (short-range and long-range), and interest correlations (inter-item and intra-item).

Click-Through Rate Prediction, Contrastive Learning, +3

On Pursuit of Designing Multi-modal Transformer for Video Grounding

no code implementations EMNLP 2021 Meng Cao, Long Chen, Mike Zheng Shou, Can Zhang, Yuexian Zou

Almost all existing video grounding methods fall into two frameworks: 1) Top-down model: It predefines a set of segment candidates and then conducts segment classification and regression.

Sentence, Video Grounding

Long-Short Temporal Modeling for Efficient Action Recognition

no code implementations 30 Jun 2021 Liyu Wu, Yuexian Zou, Can Zhang

Efficient long-short temporal modeling is key to enhancing the performance of action recognition.

Action Recognition

SRF-Net: Selective Receptive Field Network for Anchor-Free Temporal Action Detection

no code implementations 29 Jun 2021 Ranyu Ning, Can Zhang, Yuexian Zou

Current mainstream one-stage TAD approaches localize and classify action proposals relying on pre-defined anchors, where the location and scale for action instances are set by designers.

Action Detection

All You Need is a Second Look: Towards Arbitrary-Shaped Text Detection

no code implementations 24 Jun 2021 Meng Cao, Can Zhang, Dongming Yang, Yuexian Zou

Compared to the traditional single-stage segmentation network, our NASK conducts detection in a coarse-to-fine manner, with the first-stage segmentation spotting the rectangular text proposals and the second one retrieving compact representations.

Instance Segmentation, Segmentation, +2

RR-Net: Injecting Interactive Semantics in Human-Object Interaction Detection

no code implementations 30 Apr 2021 Dongming Yang, Yuexian Zou, Can Zhang, Meng Cao, Jie Chen

Upon the frame, an Interaction Intensifier Module and a Correlation Parsing Module are carefully designed, where: a) interactive semantics from humans can be exploited and passed to objects to intensify interactions, b) interactive correlations among humans, objects and interactions are integrated to promote predictions.

Human-Object Interaction Detection, Relation

CoLA: Weakly-Supervised Temporal Action Localization with Snippet Contrastive Learning

1 code implementation CVPR 2021 Can Zhang, Meng Cao, Dongming Yang, Jie Chen, Yuexian Zou

In this paper, we argue that learning by comparing helps identify these hard snippets and we propose to utilize snippet Contrastive learning to Localize Actions, CoLA for short.

CoLA, Contrastive Learning, +3

Multi-Scale Cascading Network with Compact Feature Learning for RGB-Infrared Person Re-Identification

no code implementations 12 Dec 2020 Can Zhang, Hong Liu, Wei Guo, Mang Ye

RGB-Infrared person re-identification (RGB-IR Re-ID) aims to match persons across heterogeneous images captured by visible and thermal cameras, which is of great significance for surveillance systems under poor lighting conditions.

Person Re-Identification

Non-Autoregressive Coarse-to-Fine Video Captioning

1 code implementation 27 Nov 2019 Bang Yang, Yuexian Zou, Fenglin Liu, Can Zhang

However, mainstream video captioning methods suffer from slow inference speed due to the sequential manner of autoregressive decoding, and tend to generate generic descriptions due to the insufficient training of visual words (e.g., nouns and verbs) and an inadequate decoding paradigm.

Sentence, Video Captioning
