1 code implementation • ICCV 2019 • Ziteng Gao, Li-Min Wang, Gangshan Wu
Spatial downsampling layers are favored in convolutional neural networks (CNNs) to downscale feature maps for larger receptive fields and less memory consumption.
Ranked #147 on Object Detection on COCO test-dev (using extra training data)
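A minimal PyTorch sketch of the conventional spatial downsampling this entry refers to (strided pooling or strided convolution); the module name, layer choice, and sizes here are illustrative, not the paper's proposed method.

```python
import torch
import torch.nn as nn

class Downsample(nn.Module):
    """Conventional spatial downsampling: halves the feature map resolution,
    enlarging the receptive field and reducing memory for later layers."""
    def __init__(self, channels, mode="pool"):
        super().__init__()
        if mode == "pool":
            self.op = nn.AvgPool2d(kernel_size=2, stride=2)
        else:  # strided convolution as the alternative downsampling operator
            self.op = nn.Conv2d(channels, channels, kernel_size=3, stride=2, padding=1)

    def forward(self, x):
        return self.op(x)

x = torch.randn(1, 64, 56, 56)        # (batch, channels, H, W)
print(Downsample(64)(x).shape)        # torch.Size([1, 64, 28, 28])
```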
no code implementations • ICCV 2021 • Ziteng Gao, LiMin Wang, Gangshan Wu
In this paper, we break the convention of using the same training samples for both heads in dense detectors and explore a novel supervisory paradigm, termed Mutual Supervision (MuSu), which separately yet mutually assigns training samples to the classification and regression heads to ensure consistency between them.
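A rough, hypothetical sketch of the mutual-assignment idea described above: samples for one head are selected based on the other head's quality. The scoring rules, thresholds, and top-k choice below are placeholders, not the MuSu paper's actual assignment.

```python
import torch

def mutual_assign(cls_scores, ious, topk=9):
    """Hypothetical mutual sample assignment for one ground-truth box:
    the classification head is supervised on anchors the regression head
    localizes well (high IoU), and the regression head is supervised on
    anchors the classification head scores highly."""
    cls_pos = torch.topk(ious, topk).indices        # positives for the classification head
    reg_pos = torch.topk(cls_scores, topk).indices  # positives for the regression head
    return cls_pos, reg_pos

cls_scores = torch.rand(100)   # per-anchor classification confidence for one GT box
ious = torch.rand(100)         # per-anchor IoU with that GT box
cls_pos, reg_pos = mutual_assign(cls_scores, ious)
```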
2 code implementations • CVPR 2022 • Ziteng Gao, LiMin Wang, Bing Han, Sheng Guo
Recent query-based object detectors break this convention by decoding image features with a set of learnable queries.
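A generic, DETR-style sketch of the query-based decoding convention mentioned above: a fixed set of learnable query vectors cross-attends to flattened image features and is read out as classes and boxes. Dimensions and head layout are illustrative and not this paper's specific decoder.

```python
import torch
import torch.nn as nn

class QueryDecoder(nn.Module):
    """Learnable object queries decode flattened image features into detections."""
    def __init__(self, num_queries=100, dim=256, num_classes=80):
        super().__init__()
        self.queries = nn.Embedding(num_queries, dim)            # learnable object queries
        layer = nn.TransformerDecoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=6)
        self.cls_head = nn.Linear(dim, num_classes + 1)           # +1 for "no object"
        self.box_head = nn.Linear(dim, 4)

    def forward(self, feats):                # feats: (B, H*W, dim) flattened image features
        q = self.queries.weight.unsqueeze(0).expand(feats.size(0), -1, -1)
        q = self.decoder(q, feats)           # queries cross-attend to image features
        return self.cls_head(q), self.box_head(q).sigmoid()

feats = torch.randn(2, 49, 256)              # e.g. a 7x7 feature map, flattened
logits, boxes = QueryDecoder()(feats)
```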
no code implementations • CVPR 2023 • Tao Wu, Mengqi Cao, Ziteng Gao, Gangshan Wu, LiMin Wang
STMixer is based on two core designs.
1 code implementation • 7 Apr 2023 • Ziteng Gao, Zhan Tong, LiMin Wang, Mike Zheng Shou
In this paper, we challenge this dense paradigm and present a new method, coined SparseFormer, to imitate humans' sparse visual recognition in an end-to-end manner.
Sparse Representation-based Classification • Video Classification
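A loose sketch of the sparse-recognition idea: a handful of latent tokens predict where to look, sample image features only at those locations, and are pooled for classification, instead of processing every location densely. The names, sampling scheme, and sizes are assumptions for illustration, not SparseFormer's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseRecognizer(nn.Module):
    """Illustrative sparse recognition: a few latent tokens sample the feature
    map at predicted locations and are averaged for classification."""
    def __init__(self, num_tokens=16, dim=256, num_classes=1000):
        super().__init__()
        self.tokens = nn.Parameter(torch.randn(num_tokens, dim))  # latent tokens
        self.where = nn.Linear(dim, 2)      # each token predicts an (x, y) to sample
        self.mix = nn.Linear(dim, dim)
        self.cls = nn.Linear(dim, num_classes)

    def forward(self, feat):                # feat: (B, C, H, W) image feature map
        B = feat.size(0)
        tok = self.tokens.unsqueeze(0).expand(B, -1, -1)
        xy = self.where(tok).tanh()         # sampling locations in [-1, 1]
        sampled = F.grid_sample(feat, xy.unsqueeze(2), align_corners=False)  # (B, C, N, 1)
        tok = tok + self.mix(sampled.squeeze(-1).transpose(1, 2))
        return self.cls(tok.mean(dim=1))

logits = SparseRecognizer()(torch.randn(2, 256, 14, 14))
```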
1 code implementation • 4 Dec 2023 • Ziteng Gao, Zhan Tong, Kevin Qinghong Lin, Joya Chen, Mike Zheng Shou
In this paper, we propose to bootstrap SparseFormers from ViT-based vision foundation models in a simple and efficient way.
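One plausible reading of such bootstrapping, sketched below as an assumption: align the sparse model's global embedding with a frozen ViT foundation model's embedding on the same images, so the sparse model inherits the foundation model's representation space. The loss choice, placeholder modules, and training step are not the paper's recipe.

```python
import torch
import torch.nn.functional as F

def bootstrap_step(sparse_model, frozen_vit, images, optimizer):
    """Hypothetical alignment step: match the student's embedding to the
    frozen foundation model's embedding via a cosine objective."""
    with torch.no_grad():
        target = frozen_vit(images)          # e.g. ViT pooled/[CLS] embedding
    pred = sparse_model(images)              # sparse model's embedding
    loss = 1 - F.cosine_similarity(pred, target, dim=-1).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# usage with stand-in modules (real SparseFormer/ViT weights would go here)
student = torch.nn.Linear(512, 768)
teacher = torch.nn.Linear(512, 768).eval()
opt = torch.optim.SGD(student.parameters(), lr=0.1)
bootstrap_step(student, teacher, torch.randn(4, 512), opt)
```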
no code implementations • 15 Apr 2024 • Tao Wu, Mengqi Cao, Ziteng Gao, Gangshan Wu, LiMin Wang
First, we present a query-based adaptive feature sampling module, which endows the detector with the flexibility of mining a group of discriminative features from the entire spatio-temporal domain.
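An illustrative sketch of what query-based adaptive feature sampling over a spatio-temporal volume can look like: each query predicts a few (t, y, x) locations and gathers features there by interpolation. Shapes, the number of points, and the readout are placeholders, not the detector's actual module.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveFeatureSampler(nn.Module):
    """Each query predicts sampling locations over a (T, H, W) feature volume
    and gathers features there via trilinear interpolation."""
    def __init__(self, dim=256, num_points=4):
        super().__init__()
        self.num_points = num_points
        self.offsets = nn.Linear(dim, num_points * 3)   # (x, y, t) per point, in [-1, 1]

    def forward(self, queries, feat):     # queries: (B, N, dim), feat: (B, C, T, H, W)
        B, N, _ = queries.shape
        loc = self.offsets(queries).tanh().view(B, N * self.num_points, 1, 1, 3)
        sampled = F.grid_sample(feat, loc, align_corners=False)   # (B, C, N*P, 1, 1)
        return sampled.view(B, -1, N, self.num_points).permute(0, 2, 3, 1)  # (B, N, P, C)

q = torch.randn(2, 10, 256)               # 10 queries
v = torch.randn(2, 256, 8, 14, 14)        # (B, C, T, H, W) video feature volume
out = AdaptiveFeatureSampler()(q, v)
```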
1 code implementation • ECCV 2020 • Zhenzhi Wang, Ziteng Gao, Li-Min Wang, Zhifeng Li, Gangshan Wu
To address these problems, we present a new boundary-aware cascade network by introducing two novel components.
Ranked #14 on Action Segmentation on GTEA