1 code implementation • 25 Jul 2024 • Shuming Liu, Lin Sui, Chen-Lin Zhang, Fangzhou Mu, Chen Zhao, Bernard Ghanem
As a fundamental task in long-form video understanding, temporal action detection (TAD) aims to capture inherent temporal relations in untrimmed videos and identify candidate actions with precise boundaries.
1 code implementation • CVPR 2024 • Tianhao Zhou, Haipeng Li, Ziyi Wang, Ao Luo, Chen-Lin Zhang, Jiajun Li, Bing Zeng, Shuaicheng Liu
Image stitching from different captures often results in non-rectangular boundaries, which are often considered unappealing.
2 code implementations • CVPR 2024 • Shuming Liu, Chen-Lin Zhang, Chen Zhao, Bernard Ghanem
In this paper, we reduce the memory consumption for end-to-end training, and manage to scale up the TAD backbone to 1 billion parameters and the input video to 1,536 frames, leading to significant detection performance.
Ranked #1 on Temporal Action Localization on EPIC-KITCHENS-100
1 code implementation • 7 Jun 2022 • Lin Sui, Chen-Lin Zhang, Lixin Gu, Feng Han
Some existing methods build one-stage pipelines, but a large performance drop exists with the vanilla one-stage pipeline, and extra classification modules are needed to achieve comparable performance.
no code implementations • 3 Aug 2021 • Chen-Lin Zhang, Yin Li, Jianxin Wu
Modern deep learning models require large amounts of accurately annotated data, which is often difficult to satisfy.
no code implementations • CVPR 2022 • Lin Sui, Chen-Lin Zhang, Jianxin Wu
However, the lack of bounding-box supervision makes its accuracy much lower than fully supervised object detection (FSOD), and currently modern FSOD techniques cannot be applied to WSOD.
1 code implementation • 21 Oct 2020 • Ran Xu, Chen-Lin Zhang, Pengcheng Wang, Jayoung Lee, Subrata Mitra, Somali Chaterji, Yin Li, Saurabh Bagchi
In this paper we introduce ApproxDet, an adaptive video object detection framework for mobile devices to meet accuracy-latency requirements in the face of changing content and resource contention scenarios.
1 code implementation • CVPR 2020 • Chen-Lin Zhang, Yun-Hao Cao, Jianxin Wu
Weakly supervised object localization (WSOL) aims to localize objects with only image-level labels.
Ranked #2 on Weakly-Supervised Object Localization on CUB-200-2011 (Top-1 Localization Accuracy metric)
no code implementations • 17 Jun 2019 • Chen-Lin Zhang, Xin-Xin Liu, Jianxin Wu
We show that pre-trained weights on ImageNet improve the accuracy under the real-time action recognition setting.
no code implementations • 11 Dec 2018 • Xiu-Shen Wei, Chen-Lin Zhang, Lingqiao Liu, Chunhua Shen, Jianxin Wu
Inspired by the coarse-to-fine hierarchical process, we propose an end-to-end RNN-based Hierarchical Attention (RNN-HA) classification model for vehicle re-identification.
no code implementations • 20 Jul 2017 • Xiu-Shen Wei, Chen-Lin Zhang, Jianxin Wu, Chunhua Shen, Zhi-Hua Zhou
Reusable model design becomes desirable with the rapid expansion of computer vision and machine learning applications.
Ranked #11 on Single-object discovery on COCO_20k
no code implementations • 8 May 2017 • Xiu-Shen Wei, Chen-Lin Zhang, Yao Li, Chen-Wei Xie, Jianxin Wu, Chunhua Shen, Zhi-Hua Zhou
Reusable model design becomes desirable with the rapid expansion of machine learning applications.
no code implementations • 31 Mar 2016 • Guo-Bing Zhou, Jianxin Wu, Chen-Lin Zhang, Zhi-Hua Zhou
Recently, recurrent neural networks (RNNs) have been very successful in handling sequence data.