Search Results for author: Zheng Shou

Found 17 papers, 8 papers with code

An Efficient COarse-to-fiNE Alignment Framework @ Ego4D Natural Language Queries Challenge 2022

no code implementations • 16 Nov 2022 • Zhijian Hou, Wanjun Zhong, Lei Ji, Difei Gao, Kun Yan, Wing-Kwong Chan, Chong-Wah Ngo, Zheng Shou, Nan Duan

This technical report describes the CONE approach for the Ego4D Natural Language Queries (NLQ) Challenge at ECCV 2022.

Contrastive Learning • Natural Language Queries

CONE: An Efficient COarse-to-fiNE Alignment Framework for Long Video Temporal Grounding

1 code implementation • 22 Sep 2022 • Zhijian Hou, Wanjun Zhong, Lei Ji, Difei Gao, Kun Yan, Wing-Kwong Chan, Chong-Wah Ngo, Zheng Shou, Nan Duan

This paper tackles an emerging and challenging problem of long video temporal grounding (VTG) that localizes video moments related to a natural language (NL) query.

Contrastive Learning • Video Grounding

Searching for Two-Stream Models in Multivariate Space for Video Recognition

no code implementations • ICCV 2021 • Xinyu Gong, Heng Wang, Zheng Shou, Matt Feiszli, Zhangyang Wang, Zhicheng Yan

We design a multivariate search space, including 6 search variables to capture a wide variety of choices in designing two-stream models.

Neural Architecture Search • Video Recognition +1

SF-Net: Single-Frame Supervision for Temporal Action Localization

1 code implementation • ECCV 2020 • Fan Ma, Linchao Zhu, Yi Yang, Shengxin Zha, Gourab Kundu, Matt Feiszli, Zheng Shou

To obtain the single-frame supervision, the annotators are asked to identify only a single frame within the temporal window of an action.

Ranked #5 on Weakly Supervised Action Localization on BEOID

Weakly Supervised Action Localization

Towards Train-Test Consistency for Semi-supervised Temporal Action Localization

no code implementations • 24 Oct 2019 • Xudong Lin, Zheng Shou, Shih-Fu Chang

The inconsistent strategy makes it hard to explicitly supervise the action localization model with temporal boundary annotations at training time.

Multiple Instance Learning • Video Classification +2

CDSA: Cross-Dimensional Self-Attention for Multivariate, Geo-tagged Time Series Imputation

2 code implementations • 23 May 2019 • Jiawei Ma, Zheng Shou, Alireza Zareian, Hassan Mansour, Anthony Vetro, Shih-Fu Chang

In order to jointly capture the self-attention across multiple dimensions, including time, location and the sensor measurements, while maintaining low computational complexity, we propose a novel approach called Cross-Dimensional Self-Attention (CDSA) to process each dimension sequentially, yet in an order-independent manner.

Imputation • Machine Translation +2

DMC-Net: Generating Discriminative Motion Cues for Fast Compressed Video Action Recognition

no code implementations • CVPR 2019 • Zheng Shou, Xudong Lin, Yannis Kalantidis, Laura Sevilla-Lara, Marcus Rohrbach, Shih-Fu Chang, Zhicheng Yan

Motion has been shown to be useful for video understanding, where motion is typically represented by optical flow.

Ranked #1 on Action Recognition on UCF-101

Action Classification • Action Recognition In Videos +3

Low-shot Learning via Covariance-Preserving Adversarial Augmentation Networks

no code implementations • NeurIPS 2018 • Hang Gao, Zheng Shou, Alireza Zareian, Hanwang Zhang, Shih-Fu Chang

Deep neural networks suffer from over-fitting and catastrophic forgetting when trained with small data.

Data Augmentation • Generative Adversarial Network +1

AutoLoc: Weakly-supervised Temporal Action Localization in Untrimmed Videos

no code implementations • ECCV 2018 • Zheng Shou, Hang Gao, Lei Zhang, Kazuyuki Miyazawa, Shih-Fu Chang

In this paper, we first develop a novel weakly-supervised TAL framework called AutoLoc to directly predict the temporal boundary of each action instance.

Ranked #16 on Weakly Supervised Action Localization on ActivityNet-1.2 (mAP@0.5 metric)

Weakly Supervised Action Localization • Weakly-supervised Temporal Action Localization +1

AutoLoc: Weakly-supervised Temporal Action Localization

1 code implementation • 22 Jul 2018 • Zheng Shou, Hang Gao, Lei Zhang, Kazuyuki Miyazawa, Shih-Fu Chang

In this paper, we first develop a novel weakly-supervised TAL framework called AutoLoc to directly predict the temporal boundary of each action instance.

Weakly-supervised Temporal Action Localization • Weakly Supervised Temporal Action Localization

Online Detection of Action Start in Untrimmed, Streaming Videos

no code implementations • ECCV 2018 • Zheng Shou, Junting Pan, Jonathan Chan, Kazuyuki Miyazawa, Hassan Mansour, Anthony Vetro, Xavier Giro-i-Nieto, Shih-Fu Chang

We aim to tackle a novel task in action detection - Online Detection of Action Start (ODAS) in untrimmed, streaming videos.

Action Detection • Generative Adversarial Network

Single Shot Temporal Action Detection

2 code implementations • 17 Oct 2017 • Tianwei Lin, Xu Zhao, Zheng Shou

The main drawback of this framework is that the boundaries of action instance proposals have been fixed during the classification step.

Action Detection • General Classification

ConvNet Architecture Search for Spatiotemporal Feature Learning

1 code implementation • 16 Aug 2017 • Du Tran, Jamie Ray, Zheng Shou, Shih-Fu Chang, Manohar Paluri

Learning image representations with ConvNets by pre-training on ImageNet has proven useful across many visual understanding tasks including object detection, semantic segmentation, and image captioning.

Ranked #71 on Action Recognition on HMDB-51

Action Classification • Action Recognition +5

Temporal Convolution Based Action Proposal: Submission to ActivityNet 2017

no code implementations • 21 Jul 2017 • Tianwei Lin, Xu Zhao, Zheng Shou

Our approach achieves state-of-the-art performance on both the temporal action proposal task and the temporal action localization task.

Ranked #11 on Temporal Action Proposal Generation on ActivityNet-1.3

Action Classification • General Classification +1

CDC: Convolutional-De-Convolutional Networks for Precise Temporal Action Localization in Untrimmed Videos

1 code implementation • CVPR 2017 • Zheng Shou, Jonathan Chan, Alireza Zareian, Kazuyuki Miyazawa, Shih-Fu Chang

Temporal action localization is an important yet challenging problem.

Ranked #27 on Temporal Action Localization on THUMOS’14 (mAP IOU@0.6 metric)

Temporal Action Localization

EventNet Version 1.1 Technical Report

no code implementations • 24 May 2016 • Dongang Wang, Zheng Shou, Hongyi Liu, Shih-Fu Chang

Finally, EventNet version 1.1 contains 67,641 videos, 500 events, and 5,028 event-specific concepts.

Temporal Action Localization in Untrimmed Videos via Multi-stage CNNs

1 code implementation • CVPR 2016 • Zheng Shou, Dongang Wang, Shih-Fu Chang

To address this challenging issue, we exploit the effectiveness of deep networks in temporal action localization via three segment-based 3D ConvNets: (1) a proposal network identifies candidate segments in a long video that may contain actions; (2) a classification network learns a one-vs-all action classification model to serve as initialization for the localization network; and (3) a localization network fine-tunes on the learned classification network to localize each action instance.

Ranked #1 on Temporal Action Localization on MEXaction2

Action Classification • Classification +3
