no code implementations • ECCV 2020 • Jin Xie, Hisham Cholakkal, Rao Muhammad Anwer, Fahad Shahbaz Khan, Yanwei Pang, Ling Shao, Mubarak Shah
We further introduce a count-and-similarity branch within the two-stage detection framework, which predicts pedestrian count as well as proposal similarity.
1 code implementation • 27 Nov 2023 • Bin Xie, Jiale Cao, Jin Xie, Fahad Shahbaz Khan, Yanwei Pang
In this paper, we propose a simple encoder-decoder, named SED, for open-vocabulary semantic segmentation, which comprises a hierarchical encoder-based cost map generation and a gradual fusion decoder with category early rejection.
1 code implementation • 26 Jun 2023 • Zhong Ji, Zhihao LI, Yan Zhang, Haoran Wang, Yanwei Pang, Xuelong Li
Afterwards, the VR module is developed to excavate the potential semantic correlations among multiple region-query pairs, which further explores the high-level reasoning similarity.
1 code implementation • 6 Jun 2023 • Hefeng Wang, Jiale Cao, Rao Muhammad Anwer, Jin Xie, Fahad Shahbaz Khan, Yanwei Pang
Our DFormer outperforms the recent diffusion-based panoptic segmentation method Pix2Seq-D with a gain of 3.6% on the MS COCO val2017 set.
no code implementations • 5 Jun 2023 • Xiaohan Liu, Yanwei Pang, Xuebin Sun, Yiming Liu, Yonghong Hou, ZhenChang Wang, Xuelong Li
To address this problem, we propose the following: (1) a novel convolutional operator called Faster Fourier Convolution (FasterFC) to replace the two consecutive convolution operations typically used in convolutional neural networks (e.g., U-Net, ResNet).
no code implementations • 24 Apr 2023 • Hanqing Sun, Yanwei Pang, Jiale Cao, Jin Xie, Xuelong Li
In this paper, we explore the model design of vision Transformers in stereo 3D object detection, focusing particularly on extracting and encoding the task-specific image correspondence information.
no code implementations • 21 Mar 2023 • Zhiqiang Dong, Jiale Cao, Rao Muhammad Anwer, Jin Xie, Fahad Khan, Yanwei Pang
Given a set of sparse and learnable proposals, LEAPS employs a dynamic person search head to directly perform person detection and corresponding re-id feature generation without non-maximum suppression post-processing.
1 code implementation • 9 Feb 2023 • Jiabei Wang, Yanwei Pang, Jiale Cao, Hanqing Sun, Zhuang Shao, Xuelong Li
We hope that our simple intra-image contrastive learning can provide more paradigms on weakly supervised person search.
1 code implementation • 17 Jan 2023 • Yan Zhang, Zhong Ji, Di Wang, Yanwei Pang, Xuelong Li
(2) It limits the scale of negative sample pairs by employing the mini-batch based end-to-end training mechanism.
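To make the limitation above concrete, here is a minimal sketch of a generic in-batch contrastive (InfoNCE-style) loss. It is not the loss of any paper listed here; the function names and the `temperature` value are assumptions chosen for illustration. It shows that, with mini-batch training, each anchor can only be contrasted against the other items in the same batch.

```python
import numpy as np

def in_batch_contrastive_loss(img, txt, temperature=0.1):
    """Generic InfoNCE-style loss; row i of `img` is the positive for row i of `txt`."""
    # L2-normalise so the dot product is a cosine similarity.
    img = img / np.linalg.norm(img, axis=1, keepdims=True)
    txt = txt / np.linalg.norm(txt, axis=1, keepdims=True)
    logits = img @ txt.T / temperature  # (B, B): diagonal = positives, rest = negatives
    # Numerically stable row-wise log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

def negatives_per_anchor(batch_size):
    """Each anchor sees every other item in the batch as a negative, and nothing else."""
    return batch_size - 1
```

With a batch of 128 pairs, every anchor is compared with only 127 negatives, regardless of how large the training set is.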
no code implementations • 11 Aug 2022 • Zhong Ji, Zhishen Hou, Xiyao Liu, Yanwei Pang, Xuelong Li
Few-shot Class-Incremental Learning (FSCIL) aims at learning new concepts continually with only a few samples, which makes it prone to catastrophic forgetting and overfitting.
no code implementations • 10 Aug 2022 • Xiaoheng Jiang, Xinyi Wu, Hisham Cholakkal, Rao Muhammad Anwer, Jiale Cao, Mingliang Xu, Bing Zhou, Yanwei Pang, Fahad Shahbaz Khan
The SkipAgg module directly propagates features with small receptive fields to features with much larger receptive fields.
1 code implementation • CVPR 2022 • Jiale Cao, Yanwei Pang, Rao Muhammad Anwer, Hisham Cholakkal, Jin Xie, Mubarak Shah, Fahad Shahbaz Khan
We propose a novel one-step transformer-based person search framework, PSTR, that jointly performs person detection and re-identification (re-id) in a single architecture.
no code implementations • 11 Mar 2022 • Yiming Liu, Yanwei Pang, Ruiqi Jin, ZhenChang Wang
This paper aims to reduce the scan time by actively and sequentially selecting partial phases in a short time so that a slice can be accurately reconstructed from the resultant slice-specific incomplete k-space matrix.
no code implementations • 11 Mar 2022 • Xiaohan Liu, Yanwei Pang, Ruiqi Jin, Yu Liu, ZhenChang Wang
Purpose: To introduce a dual-domain reconstruction network with V-Net and K-Net for accurate MR image reconstruction from undersampled k-space data.
no code implementations • 28 Nov 2021 • Aqi Gao, Yanwei Pang, Jing Nie, Jiale Cao, Yishun Guo
The key in our ESGN is an efficient geometry-aware feature generation (EGFG) module.
no code implementations • 3 Sep 2021 • Zhong Ji, Zhishen Hou, Xiyao Liu, Yanwei Pang, Jungong Han
Semantic information provides intra-class consistency and inter-class discriminability beyond visual concepts, which has been employed in Few-Shot Learning (FSL) to achieve further gains.
no code implementations • 3 Sep 2021 • Xiyao Liu, Zhong Ji, Yanwei Pang, Zhongfei Zhang
However, the target domain is completely unknown during training on the source domain, which results in a lack of directed guidance for target tasks.
cross-domain few-shot learning
Weakly-Supervised Object Localization
no code implementations • 18 Jun 2021 • Aqi Gao, Jiale Cao, Yanwei Pang
Compared with the baseline RTS3D, our proposed method achieves a 2.57% improvement in AP3d with almost no extra network parameters.
1 code implementation • 18 Nov 2020 • Yanwei Pang, Jiale Cao, Yazhao Li, Jin Xie, Hanqing Sun, Jinfeng Gong
In addition, a new diverse pedestrian dataset is further built.
2 code implementations • 1 Oct 2020 • Jiale Cao, Yanwei Pang, Jin Xie, Fahad Shahbaz Khan, Ling Shao
In addition to single-spectral pedestrian detection, we also review multi-spectral pedestrian detection, which provides more robust features for illumination variance.
1 code implementation • ECCV 2020 • Jiale Cao, Rao Muhammad Anwer, Hisham Cholakkal, Fahad Shahbaz Khan, Yanwei Pang, Ling Shao
In terms of real-time capabilities, SipMask outperforms YOLACT with an absolute gain of 3.0% (mask AP) under similar settings, while operating at comparable speed on a Titan Xp.
Ranked #12 on Real-time Instance Segmentation on MSCOCO
1 code implementation • ECCV 2020 • Haoran Wang, Ying Zhang, Zhong Ji, Yanwei Pang, Lin Ma
In this paper, we propose a Consensus-aware Visual-Semantic Embedding (CVSE) model to incorporate the consensus information, namely the commonsense knowledge shared between both modalities, into image-text matching.
1 code implementation • CVPR 2020 • Yanwei Pang, Jing Nie, Jin Xie, Jungong Han, Xuelong Li
On the assumption that dehazed binocular images are superior to the hazy ones for stereo vision tasks such as 3D object detection and according to the fact that image haze is a function of depth, this paper proposes a Binocular image dehazing Network (BidNet) aiming at dehazing both the left and right images of binocular images within the deep learning framework.
1 code implementation • CVPR 2020 • Wenguan Wang, Hailong Zhu, Jifeng Dai, Yanwei Pang, Jianbing Shen, Ling Shao
As human bodies are inherently hierarchically structured, how to model human structures is the central theme in this task.
no code implementations • 25 Jan 2020 • Jin Xie, Yanwei Pang, Hisham Cholakkal, Rao Muhammad Anwer, Fahad Shahbaz Khan, Ling Shao
On the heavily occluded (HO) set of the CityPersons test set, our PSC-Net obtains an absolute gain of 4.0% in terms of log-average miss rate over the state-of-the-art with the same backbone and input scale, and without using additional VBB supervision.
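For readers unfamiliar with the metric, the log-average miss rate used in pedestrian detection benchmarks is conventionally the geometric mean of the miss rate sampled at nine false-positives-per-image (FPPI) values evenly spaced in log space between 10^-2 and 10^0. The sketch below follows that convention. The function name is hypothetical, and using a miss rate of 1.0 when the curve has no point at or below a reference FPPI is a simplifying assumption; official evaluation toolkits may handle that edge case differently.

```python
import numpy as np

def log_average_miss_rate(fppi, miss_rate):
    """Geometric mean of miss rate at 9 log-spaced FPPI reference points.

    `fppi` must be sorted in ascending order; `miss_rate[i]` is the miss
    rate of the detector at operating point `fppi[i]`.
    """
    fppi = np.asarray(fppi, dtype=float)
    miss_rate = np.asarray(miss_rate, dtype=float)
    refs = np.logspace(-2.0, 0.0, num=9)  # 1e-2 ... 1e0
    sampled = []
    for ref in refs:
        idx = np.where(fppi <= ref)[0]
        # Assumption: if the curve never reaches this FPPI, count it as a total miss.
        sampled.append(miss_rate[idx[-1]] if idx.size else 1.0)
    sampled = np.maximum(np.array(sampled), 1e-10)  # guard against log(0)
    return float(np.exp(np.mean(np.log(sampled))))
```

Lower values are better, so an "absolute gain" in this metric refers to a reduction in the reported percentage.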
1 code implementation • ICCV 2019 • Wenguan Wang, Zhijie Zhang, Siyuan Qi, Jianbing Shen, Yanwei Pang, Ling Shao
The bottom-up and top-down inferences explicitly model the compositional and decompositional relations in human bodies, respectively.
no code implementations • CVPR 2020 • Yazhao Li, Yanwei Pang, Jianbing Shen, Jiale Cao, Ling Shao
With this observation, we propose a new Neighbor Erasing and Transferring (NET) mechanism to reconfigure the pyramid features and explore scale-aware features.
no code implementations • ICCV 2019 • Tiancai Wang, Rao Muhammad Anwer, Muhammad Haris Khan, Fahad Shahbaz Khan, Yanwei Pang, Ling Shao, Jorma Laaksonen
Our approach outperforms the state-of-the-art on all datasets.
1 code implementation • ICCV 2019 • Yanwei Pang, Jin Xie, Muhammad Haris Khan, Rao Muhammad Anwer, Fahad Shahbaz Khan, Ling Shao
Our approach obtains an absolute gain of 9.5% in log-average miss rate, compared to the best reported results on the heavily occluded (HO) pedestrian set of the CityPersons test set.
no code implementations • ICCV 2019 • Yanwei Pang, Yazhao Li, Jianbing Shen, Ling Shao
By embedding these two strategies, we construct a parallel feature pyramid towards improving multi-level feature fusion.
no code implementations • 26 Aug 2019 • Zhong Ji, Xuejie Yu, Yunlong Yu, Yanwei Pang, Zhongfei Zhang
Towards alleviating the class imbalance issue in ZSC, we propose a sample-balanced training process to encourage all training classes to contribute equally to the learned model.
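One common way to let all training classes contribute equally is class-balanced batch sampling. The sketch below illustrates that general idea only; it is not necessarily the procedure proposed in the paper, and the function name and parameters are hypothetical.

```python
import random
from collections import defaultdict

def balanced_batches(labels, classes_per_batch, samples_per_class, num_batches, seed=0):
    """Yield batches of sample indices with the same number of samples per class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    classes = sorted(by_class)
    batches = []
    for _ in range(num_batches):
        chosen = rng.sample(classes, classes_per_batch)  # distinct classes
        batch = []
        for c in chosen:
            # Sample with replacement so rare classes can still fill their quota.
            batch.extend(rng.choices(by_class[c], k=samples_per_class))
        batches.append(batch)
    return batches
```

Even with a 90/8/2 split across three classes, each batch produced this way contains the same number of samples from every selected class.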
1 code implementation • 25 Jul 2019 • Zhijie Zhang, Huazhu Fu, Hang Dai, Jianbing Shen, Yanwei Pang, Ling Shao
Segmentation is a fundamental task in medical image analysis.
Ranked #1 on Optic Disc Segmentation on REFUGE
1 code implementation • CVPR 2019 • Yanwei Pang, Tiancai Wang, Rao Muhammad Anwer, Fahad Shahbaz Khan, Ling Shao
The performance of our detector is validated on two benchmarks: PASCAL VOC and MS COCO.
no code implementations • ICCV 2019 • Zhong Ji, Haoran Wang, Jungong Han, Yanwei Pang
Concretely, the saliency detector provides the visual saliency information as the guidance for the two attention modules.
1 code implementation • NeurIPS 2018 • Yunlong Yu, Zhong Ji, Yanwei Fu, Jichang Guo, Yanwei Pang, Zhongfei (Mark) Zhang
Zero-Shot Learning (ZSL) is generally achieved via aligning the semantic relationships between the visual features and the corresponding class semantic descriptions.
no code implementations • 20 Nov 2018 • Yunlong Yu, Zhong Ji, Yanwei Pang, Jichang Guo, Zhongfei Zhang, Fei Wu
Existing generative Zero-Shot Learning (ZSL) methods only consider the unidirectional alignment from the class semantics to the visual features while ignoring the alignment from the visual features to the class semantics, which fails to construct the visual-semantic interactions well.
no code implementations • CVPR 2019 • Jiale Cao, Yanwei Pang, Xuelong Li
Experimental results on the VOC2007 and VOC2012 datasets demonstrate that the proposed TripleNet is able to improve both the detection and segmentation accuracies without adding extra computational costs.
Ranked #18 on Semantic Segmentation on PASCAL VOC 2012 test
no code implementations • 21 May 2018 • Yunlong Yu, Zhong Ji, Yanwei Fu, Jichang Guo, Yanwei Pang, Zhongfei Zhang
To this end, we propose a novel stacked semantics-guided attention (S2GA) model to obtain semantic relevant features by using individual class semantic features to progressively guide the visual features to generate an attention map for weighting the importance of different local regions.
no code implementations • 3 Apr 2018 • Jiale Cao, Yanwei Pang, Xuelong Li
In this paper, we propose a multi-branch and high-level semantic network by gradually splitting a base network into multiple different branches.
no code implementations • 21 Mar 2018 • Chongyi Li, Jichang Guo, Fatih Porikli, Huazhu Fu, Yanwei Pang
Different from previous learning-based methods, we propose a flexible cascaded CNN for single hazy image restoration, which considers the medium transmission and global atmospheric light jointly by two task-driven subnetworks.
no code implementations • 6 Feb 2018 • Zhong Ji, Yuxin Sun, Yunlong Yu, Yanwei Pang, Jungong Han
To address the Cross-Modal Zero-Shot Hashing (CMZSH) retrieval task, we propose a novel Attribute-Guided Network (AgNet), which can perform not only IBIR, but also Text-Based Image Retrieval (TBIR).
no code implementations • 9 Sep 2017 • Yanwei Pang, Bo Zhou, Feiping Nie
It is interesting that the optimal regularization parameter is adaptive to the neighbors in low-dimensional space and has intuitive meaning.
no code implementations • 31 Aug 2017 • Zhong Ji, Kailin Xiong, Yanwei Pang, Xuelong Li
This paper addresses the problem of supervised video summarization by formulating it as a sequence-to-sequence learning problem, where the input is a sequence of original video frames and the output is a keyshot sequence.
Ranked #4 on Video Summarization on TvSum
no code implementations • 13 Jul 2017 • Zhong Ji, Yaru Ma, Yanwei Pang, Xuelong Li
Given the explosive growth of online videos, it is becoming increasingly important to relieve the tedious work of browsing and managing the video content of interest.
no code implementations • 22 May 2017 • Zhong Ji, Yunxin Sun, Yulong Yu, Jichang Guo, Yanwei Pang
However, the visual features and the class semantic descriptors are located in different structural spaces, so a linear or bilinear model cannot capture the semantic interactions between different modalities well.
no code implementations • 27 Mar 2017 • Yunlong Yu, Zhong Ji, Jichang Guo, Yanwei Pang
Two fundamental challenges in it are visual-semantic embedding and domain adaptation in cross-modality learning and unseen class prediction steps, respectively.
no code implementations • 30 Jun 2016 • Zhong Ji, Yuzhong Xie, Yanwei Pang, Lei Chen, Zhongfei Zhang
Zero-shot learning (ZSL) extends the conventional image classification technique to a more challenging situation where the test image categories are not seen in the training samples.
no code implementations • 22 Mar 2016 • Yanwei Pang, Manli Sun, Xiaoheng Jiang, Xuelong Li
In this paper, we propose to replace dense shallow MLP with sparse shallow MLP.
no code implementations • 1 Mar 2016 • Jiale Cao, Yanwei Pang, Xuelong Li
For example, CNN classifies these proposals by the fully connected layer features while proposal scores and the features in the inner layers of CNN are ignored.
Ranked #25 on Pedestrian Detection on Caltech
no code implementations • 1 Mar 2016 • Xiaoheng Jiang, Yanwei Pang, Manli Sun, Xuelong Li
The first one is a linear filter of spatial size $h \times w$ and is aimed at extracting features from the spatial domain.
no code implementations • CVPR 2016 • Jiale Cao, Yanwei Pang, Xuelong Li
Finally, we propose to combine both non-neighboring and neighboring features for pedestrian detection.
Ranked #28 on Pedestrian Detection on Caltech
no code implementations • 30 Sep 2015 • Yanwei Pang, Li Ye, Xuelong Li, Jing Pan
So there are undesirable false alarms and missed alarms in many algorithms of moving object detection.
no code implementations • 23 Aug 2015 • Yanwei Pang, Jiale Cao, Xuelong Li
Multistage particle windows (MPW), proposed by Gualdi et al., is a fast and accurate object detection algorithm.
no code implementations • 18 Aug 2015 • Yanwei Pang, Jiale Cao, Xuelong Li
iCascade searches for the optimal number $r_i$ of weak classifiers in each stage $i$ by directly minimizing the computation cost of the cascade.
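As a rough illustration of the kind of quantity being minimized, the expected per-window cost of an $S$-stage cascade can be written as below. This assumes (not taken from the source) that every weak classifier has unit cost and that $p_j$ denotes the fraction of windows that pass stage $j$; the paper's exact cost model may differ.

$$
C(r_1, \dots, r_S) = \sum_{i=1}^{S} r_i \prod_{j=1}^{i-1} p_j
$$

A window pays the $r_i$ evaluations of stage $i$ only if it survived all earlier stages, so adding weak classifiers to an early stage raises its own term while potentially lowering every later one through a smaller pass rate.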
no code implementations • CVPR 2013 • Qiang Hao, Rui Cai, Zhiwei Li, Lei Zhang, Yanwei Pang, Feng Wu, Yong Rui
3D model-based object recognition has been a noticeable research trend in recent years.