Search Results for author: Jiaming Zhou

Found 8 papers, 2 papers with code

Rethinking CLIP-based Video Learners in Cross-Domain Open-Vocabulary Action Recognition

1 code implementation · 3 Mar 2024 · Kun-Yu Lin, Henghui Ding, Jiaming Zhou, Yi-Xing Peng, Zhilin Zhao, Chen Change Loy, Wei-Shi Zheng

To answer this, we establish a CROSS-domain Open-Vocabulary Action recognition benchmark named XOV-Action, and conduct a comprehensive evaluation of five state-of-the-art CLIP-based video learners under various types of domain gaps.

Open Vocabulary Action Recognition

ActionHub: A Large-scale Action Video Description Dataset for Zero-shot Action Recognition

no code implementations · 22 Jan 2024 · Jiaming Zhou, Junwei Liang, Kun-Yu Lin, Jinrui Yang, Wei-Shi Zheng

With the proposed ActionHub dataset, we further propose a novel Cross-modality and Cross-action Modeling (CoCo) framework for ZSAR, which consists of a Dual Cross-modality Alignment module and a Cross-action Invariance Mining module.

Action Recognition, Video Description +1

GeoDeformer: Geometric Deformable Transformer for Action Recognition

no code implementations · 29 Nov 2023 · Jinhui Ye, Jiaming Zhou, Hui Xiong, Junwei Liang

Specifically, at the core of GeoDeformer is the Geometric Deformation Predictor, a module designed to identify and quantify potential spatial and temporal geometric deformations within the given video.

Action Recognition

PostRainBench: A comprehensive benchmark and a new model for precipitation forecasting

1 code implementation · 4 Oct 2023 · Yujin Tang, Jiaming Zhou, Xiang Pan, Zeying Gong, Junwei Liang

To address these limitations, we introduce the PostRainBench, a comprehensive multi-variable NWP post-processing benchmark consisting of three datasets for NWP post-processing-based precipitation forecasting.

NWP Post-processing, Precipitation Forecasting

CIF-T: A Novel CIF-based Transducer Architecture for Automatic Speech Recognition

no code implementations · 26 Jul 2023 · Tian-Hao Zhang, Dinghao Zhou, Guiping Zhong, Jiaming Zhou, Baoxiang Li

RNN-T models, which rely on the RNN-T loss to achieve length alignment between the input audio and the target sequence, are widely used in ASR.

Automatic Speech Recognition, speech-recognition +1

MADI: Inter-domain Matching and Intra-domain Discrimination for Cross-domain Speech Recognition

no code implementations · 22 Feb 2023 · Jiaming Zhou, Shiwan Zhao, Ning Jiang, Guoqing Zhao, Yong Qin

Unsupervised domain adaptation (UDA) aims to improve the performance on the unlabeled target domain by transferring knowledge from the source to the target domain.

Automatic Speech Recognition, Automatic Speech Recognition (ASR) +3

Graph-Based High-Order Relation Modeling for Long-Term Action Recognition

no code implementations · CVPR 2021 · Jiaming Zhou, Kun-Yu Lin, Haoxin Li, Wei-Shi Zheng

In this paper, we propose a Graph-based High-order Relation Modeling (GHRM) module to exploit the high-order relations in the long-term actions for long-term action recognition.

Action Recognition, Long-video Activity Recognition +2
