1 code implementation • 27 Nov 2023 • Bin Xie, Jiale Cao, Jin Xie, Fahad Shahbaz Khan, Yanwei Pang
In this paper, we propose a simple encoder-decoder, named SED, for open-vocabulary semantic segmentation, which comprises a hierarchical encoder-based cost map generation and a gradual fusion decoder with category early rejection.
no code implementations • 22 Sep 2023 • Xiaoheng Jiang, Kaiyi Guo, Yang Lu, Feng Yan, Hao Liu, Jiale Cao, Mingliang Xu, DaCheng Tao
To address these issues, we propose a transformer network with multi-stage CNN (Convolutional Neural Network) feature injection for surface defect segmentation, which is a UNet-like structure named CINFormer.
no code implementations • 22 Sep 2023 • Feng Yan, Xiaoheng Jiang, Yang Lu, Lisha Cui, Shupan Li, Jiale Cao, Mingliang Xu, DaCheng Tao
To this end, we develop a Global Context Aggregation Network (GCANet), built on an encoder-decoder structure, for lightweight saliency detection of surface defects.
1 code implementation • 9 Sep 2023 • Chao Qin, Jiale Cao, Huazhu Fu, Rao Muhammad Anwer, Fahad Shahbaz Khan
Existing video-based breast lesion detection approaches typically perform temporal feature aggregation of deep backbone features based on the self-attention operation.
1 code implementation • 6 Jun 2023 • Hefeng Wang, Jiale Cao, Rao Muhammad Anwer, Jin Xie, Fahad Shahbaz Khan, Yanwei Pang
Our DFormer outperforms the recent diffusion-based panoptic segmentation method Pix2Seq-D with a gain of 3.6% on the MS COCO val2017 set.
no code implementations • 24 Apr 2023 • Hanqing Sun, Yanwei Pang, Jiale Cao, Jin Xie, Xuelong Li
In this paper, we explore the model design of vision Transformers in stereo 3D object detection, focusing particularly on extracting and encoding the task-specific image correspondence information.
no code implementations • 21 Mar 2023 • Zhiqiang Dong, Jiale Cao, Rao Muhammad Anwer, Jin Xie, Fahad Khan, Yanwei Pang
Given a set of sparse and learnable proposals, LEAPS employs a dynamic person search head to directly perform person detection and corresponding re-id feature generation without non-maximum suppression post-processing.
1 code implementation • 9 Feb 2023 • Jiabei Wang, Yanwei Pang, Jiale Cao, Hanqing Sun, Zhuang Shao, Xuelong Li
We hope that our simple intra-image contrastive learning can provide more paradigms on weakly supervised person search.
1 code implementation • 8 Aug 2022 • Jean Lahoud, Jiale Cao, Fahad Shahbaz Khan, Hisham Cholakkal, Rao Muhammad Anwer, Salman Khan, Ming-Hsuan Yang
The success of the transformer architecture in natural language processing has recently triggered attention in the computer vision field.
1 code implementation • CVPR 2022 • Jiale Cao, Yanwei Pang, Rao Muhammad Anwer, Hisham Cholakkal, Jin Xie, Mubarak Shah, Fahad Shahbaz Khan
We propose a novel one-step transformer-based person search framework, PSTR, that jointly performs person detection and re-identification (re-id) in a single architecture.
1 code implementation • 24 Mar 2022 • Omkar Thawakar, Sanath Narayan, Jiale Cao, Hisham Cholakkal, Rao Muhammad Anwer, Muhammad Haris Khan, Salman Khan, Michael Felsberg, Fahad Shahbaz Khan
When using the ResNet50 backbone, our MS-STS achieves a mask AP of 50.1%, outperforming the best reported results in the literature by 2.7% and by 4.8% at the higher overlap threshold of AP_75, while being comparable in model size and speed on YouTube-VIS 2019 val.
no code implementations • 28 Nov 2021 • Aqi Gao, Yanwei Pang, Jing Nie, Jiale Cao, Yishun Guo
The key in our ESGN is an efficient geometry-aware feature generation (EGFG) module.
no code implementations • 18 Jun 2021 • Aqi Gao, Jiale Cao, Yanwei Pang
Compared with the baseline RTS3D, our proposed method achieves a 2.57% improvement in AP3d with almost no extra network parameters.
1 code implementation • CVPR 2021 • Jialian Wu, Jiale Cao, Liangchen Song, Yu Wang, Ming Yang, Junsong Yuan
Most online multi-object trackers perform object detection stand-alone in a neural net without any input from tracking.
Ranked #1 on Instance Segmentation on nuScenes
1 code implementation • 3 Dec 2020 • Tiancai Wang, Tong Yang, Jiale Cao, Xiangyu Zhang
Object detectors usually achieve promising results with the supervision of complete instance annotations.
1 code implementation • 18 Nov 2020 • Yanwei Pang, Jiale Cao, Yazhao Li, Jin Xie, Hanqing Sun, Jinfeng Gong
In addition, a new diverse pedestrian dataset is further built.
2 code implementations • 1 Oct 2020 • Jiale Cao, Yanwei Pang, Jin Xie, Fahad Shahbaz Khan, Ling Shao
In addition to single-spectral pedestrian detection, we also review multi-spectral pedestrian detection, which provides features that are more robust to illumination variance.
1 code implementation • ECCV 2020 • Jiale Cao, Rao Muhammad Anwer, Hisham Cholakkal, Fahad Shahbaz Khan, Yanwei Pang, Ling Shao
In terms of real-time capabilities, SipMask outperforms YOLACT with an absolute gain of 3.0% (mask AP) under similar settings, while operating at comparable speed on a Titan Xp.
Ranked #12 on Real-time Instance Segmentation on MSCOCO
1 code implementation • CVPR 2020 • Jiale Cao, Hisham Cholakkal, Rao Muhammad Anwer, Fahad Shahbaz Khan, Yanwei Pang, Ling Shao
For precise localization, we introduce a dense local regression that predicts multiple dense box offsets for an object proposal.
Ranked #66 on Instance Segmentation on COCO test-dev
no code implementations • CVPR 2020 • Yazhao Li, Yanwei Pang, Jianbing Shen, Jiale Cao, Ling Shao
With this observation, we propose a new Neighbor Erasing and Transferring (NET) mechanism to reconfigure the pyramid features and explore scale-aware features.
1 code implementation • ICCV 2019 • Jiale Cao, Yanwei Pang, Jungong Han, Xuelong Li
To further solve the second problem, a hierarchical shot detector (HSD) is proposed, which stacks two ROC modules and one feature enhanced module.
Ranked #3 on Object Detection on PASCAL VOC 2007
no code implementations • CVPR 2019 • Jiale Cao, Yanwei Pang, Xuelong Li
Experimental results on the VOC2007 and VOC2012 datasets demonstrate that the proposed TripleNet is able to improve both the detection and segmentation accuracies without adding extra computational costs.
Ranked #18 on Semantic Segmentation on PASCAL VOC 2012 test
no code implementations • 3 Apr 2018 • Jiale Cao, Yanwei Pang, Xuelong Li
In this paper, we propose a multi-branch and high-level semantic network by gradually splitting a base network into multiple different branches.
no code implementations • 1 Mar 2016 • Jiale Cao, Yanwei Pang, Xuelong Li
For example, a CNN classifies these proposals using the fully connected layer features, while the proposal scores and the features in the inner layers of the CNN are ignored.
Ranked #25 on Pedestrian Detection on Caltech
no code implementations • CVPR 2016 • Jiale Cao, Yanwei Pang, Xuelong Li
Finally, we propose to combine both non-neighboring and neighboring features for pedestrian detection.
Ranked #28 on Pedestrian Detection on Caltech
no code implementations • 23 Aug 2015 • Yanwei Pang, Jiale Cao, Xuelong Li
Multistage particle windows (MPW), proposed by Gualdi et al., is an algorithm for fast and accurate object detection.
no code implementations • 18 Aug 2015 • Yanwei Pang, Jiale Cao, Xuelong Li
iCascade searches for the optimal number r_i of weak classifiers in each stage i by directly minimizing the computation cost of the cascade.