Search Results for author: Yanwei Pang

Found 67 papers, 23 papers with code

Count- and Similarity-aware R-CNN for Pedestrian Detection

no code implementations ECCV 2020 Jin Xie, Hisham Cholakkal, Rao Muhammad Anwer, Fahad Shahbaz Khan, Yanwei Pang, Ling Shao, Mubarak Shah

We further introduce a count-and-similarity branch within the two-stage detection framework, which predicts pedestrian count as well as proposal similarity.

Human Instance Segmentation Pedestrian Detection +1

Multi-Stage Knowledge Integration of Vision-Language Models for Continual Learning

no code implementations 11 Nov 2024 Hongsheng Zhang, Zhong Ji, Jingren Liu, Yanwei Pang, Jungong Han

One is that the popularly adopted single-teacher paradigm fails to impart comprehensive knowledge. The other is that existing methods inadequately leverage the multimodal information in the original training dataset and instead rely on additional data for distillation, which increases computational and storage overhead.

Continual Learning

A Fresh Look at Generalized Category Discovery through Non-negative Matrix Factorization

no code implementations 29 Oct 2024 Zhong Ji, Shuo Yang, Jingren Liu, Yanwei Pang, Jungong Han

Generalized Category Discovery (GCD) aims to classify both base and novel images using labeled base data.

Contrastive Learning

iSeg: An Iterative Refinement-based Framework for Training-free Segmentation

1 code implementation 5 Sep 2024 Lin Sun, Jiale Cao, Jin Xie, Fahad Shahbaz Khan, Yanwei Pang

Leveraging the entropy-reduced self-attention module, our iSeg stably improves refined cross-attention map with iterative refinement.

Image Generation Segmentation +1

Parameter-Efficient Fine-Tuning for Continual Learning: A Neural Tangent Kernel Perspective

no code implementations 24 Jul 2024 Jingren Liu, Zhong Ji, Yunlong Yu, Jiale Cao, Yanwei Pang, Jungong Han, Xuelong Li

This work provides a theoretical foundation for understanding and improving PEFT-CL models, offering insights into the interplay between feature representation, task orthogonality, and generalization, contributing to the development of more efficient continual learning systems.

Continual Learning parameter-efficient fine-tuning

Raformer: Redundancy-Aware Transformer for Video Wire Inpainting

1 code implementation 24 Apr 2024 Zhong Ji, Yimu Su, Yan Zhang, Jiacheng Hou, Yanwei Pang, Jungong Han

Video Wire Inpainting (VWI) is a prominent application in video inpainting, aimed at flawlessly removing wires in films or TV series, offering significant time and labor savings compared to manual frame-by-frame removal.

Video Inpainting

VFMM3D: Releasing the Potential of Image by Vision Foundation Model for Monocular 3D Object Detection

no code implementations 15 Apr 2024 Bonan Ding, Jin Xie, Jing Nie, Jiale Cao, Xuelong Li, Yanwei Pang

Therefore, an effective solution involves transforming monocular images into LiDAR-like representations and employing a LiDAR-based 3D object detector to predict the 3D coordinates of objects.

Autonomous Driving Monocular 3D Object Detection +2

Implicit and Explicit Language Guidance for Diffusion-based Visual Perception

no code implementations 11 Apr 2024 Hefeng Wang, Jiale Cao, Jin Xie, Aiping Yang, Yanwei Pang

The explicit branch utilizes the ground-truth labels of corresponding images as text prompts to condition feature extraction of diffusion model.

Depth Estimation Image Generation +1

NTK-Guided Few-Shot Class Incremental Learning

1 code implementation 19 Mar 2024 Jingren Liu, Zhong Ji, Yanwei Pang, Yunlong Yu

Through the combined effects of these measures, our network acquires robust NTK properties, ensuring optimal convergence and stability of the NTK matrix and minimizing the NTK-related generalization loss, significantly enhancing its theoretical generalization.

class-incremental learning Few-Shot Class-Incremental Learning +2

CLIP-VIS: Adapting CLIP for Open-Vocabulary Video Instance Segmentation

1 code implementation 19 Mar 2024 Wenqi Zhu, Jiale Cao, Jin Xie, Shuangming Yang, Yanwei Pang

The experiments are performed on various video instance segmentation datasets, which demonstrate the effectiveness of our proposed method, especially for novel categories.

Decoder Instance Segmentation +5

Joint Attention-Guided Feature Fusion Network for Saliency Detection of Surface Defects

no code implementations 5 Feb 2024 Xiaoheng Jiang, Feng Yan, Yang Lu, Ke Wang, Shuai Guo, Tianzhu Zhang, Yanwei Pang, Jianwei Niu, Mingliang Xu

To address these issues, we propose a joint attention-guided feature fusion network (JAFFNet) for saliency detection of surface defects based on the encoder-decoder network.

Defect Detection Saliency Detection

SED: A Simple Encoder-Decoder for Open-Vocabulary Semantic Segmentation

1 code implementation CVPR 2024 Bin Xie, Jiale Cao, Jin Xie, Fahad Shahbaz Khan, Yanwei Pang

In this paper, we propose a simple encoder-decoder, named SED, for open-vocabulary semantic segmentation, which comprises a hierarchical encoder-based cost map generation and a gradual fusion decoder with category early rejection.

Decoder Open Vocabulary Semantic Segmentation +2

Hierarchical Matching and Reasoning for Multi-Query Image Retrieval

1 code implementation 26 Jun 2023 Zhong Ji, Zhihao Li, Yan Zhang, Haoran Wang, Yanwei Pang, Xuelong Li

Afterwards, the VR module is developed to excavate the potential semantic correlations among multiple region-query pairs, which further explores the high-level reasoning similarity.

Image Retrieval Retrieval

DFormer: Diffusion-guided Transformer for Universal Image Segmentation

1 code implementation 6 Jun 2023 Hefeng Wang, Jiale Cao, Rao Muhammad Anwer, Jin Xie, Fahad Shahbaz Khan, Yanwei Pang

Our DFormer outperforms the recent diffusion-based panoptic segmentation method Pix2Seq-D with a gain of 3.6% on the MS COCO val2017 set.

Decoder Denoising +4

Image Reconstruction for Accelerated MR Scan with Faster Fourier Convolutional Neural Networks

no code implementations 5 Jun 2023 Xiaohan Liu, Yanwei Pang, Xuebin Sun, Yiming Liu, Yonghong Hou, ZhenChang Wang, Xuelong Li

To address this problem, we propose the following: (1) a novel convolutional operator called Faster Fourier Convolution (FasterFC) to replace the two consecutive convolution operations typically used in convolutional neural networks (e.g., U-Net, ResNet).

3D Reconstruction Image Reconstruction

Transformer-based stereo-aware 3D object detection from binocular images

no code implementations 24 Apr 2023 Hanqing Sun, Yanwei Pang, Jiale Cao, Jin Xie, Xuelong Li

In this paper, we explore the model design of Transformers in binocular 3D object detection, focusing particularly on extracting and encoding task-specific image correspondence information.

3D Object Detection Object +1

LEAPS: End-to-End One-Step Person Search With Learnable Proposals

no code implementations 21 Mar 2023 Zhiqiang Dong, Jiale Cao, Rao Muhammad Anwer, Jin Xie, Fahad Khan, Yanwei Pang

Given a set of sparse and learnable proposals, LEAPS employs a dynamic person search head to directly perform person detection and corresponding re-id feature generation without non-maximum suppression post-processing.

Human Detection Person Search

USER: Unified Semantic Enhancement with Momentum Contrast for Image-Text Retrieval

1 code implementation 17 Jan 2023 Yan Zhang, Zhong Ji, Di Wang, Yanwei Pang, Xuelong Li

(2) It limits the scale of negative sample pairs by employing the mini-batch based end-to-end training mechanism.

Contrastive Learning Image-text Retrieval +3

Memorizing Complementation Network for Few-Shot Class-Incremental Learning

no code implementations 11 Aug 2022 Zhong Ji, Zhishen Hou, Xiyao Liu, Yanwei Pang, Xuelong Li

Few-shot Class-Incremental Learning (FSCIL) aims at learning new concepts continually with only a few samples, which is prone to suffer the catastrophic forgetting and overfitting problems.

class-incremental learning Few-Shot Class-Incremental Learning +3

Multi-scale Feature Aggregation for Crowd Counting

no code implementations 10 Aug 2022 Xiaoheng Jiang, Xinyi Wu, Hisham Cholakkal, Rao Muhammad Anwer, Jiale Cao, Mingliang Xu, Bing Zhou, Yanwei Pang, Fahad Shahbaz Khan

The SkipAgg module directly propagates features with small receptive fields to features with much larger receptive fields.

Crowd Counting

PSTR: End-to-End One-Step Person Search With Transformers

1 code implementation CVPR 2022 Jiale Cao, Yanwei Pang, Rao Muhammad Anwer, Hisham Cholakkal, Jin Xie, Mubarak Shah, Fahad Shahbaz Khan

We propose a novel one-step transformer-based person search framework, PSTR, that jointly performs person detection and re-identification (re-id) in a single architecture.

Decoder Human Detection +1

Dual-Domain Reconstruction Networks with V-Net and K-Net for fast MRI

no code implementations 11 Mar 2022 Xiaohan Liu, Yanwei Pang, Ruiqi Jin, Yu Liu, ZhenChang Wang

Purpose: To introduce a dual-domain reconstruction network with V-Net and K-Net for accurate MR image reconstruction from undersampled k-space data.

Decoder Image Reconstruction

Active Phase-Encode Selection for Slice-Specific Fast MR Scanning Using a Transformer-Based Deep Reinforcement Learning Framework

no code implementations 11 Mar 2022 Yiming Liu, Yanwei Pang, Ruiqi Jin, ZhenChang Wang

This paper aims to reduce the scan time by actively and sequentially selecting partial phases in a short time so that a slice can be accurately reconstructed from the resultant slice-specific incomplete K-space matrix.

Deep Reinforcement Learning Image Reconstruction +2

Self-Taught Cross-Domain Few-Shot Learning with Weakly Supervised Object Localization and Task-Decomposition

no code implementations 3 Sep 2021 Xiyao Liu, Zhong Ji, Yanwei Pang, Zhongfei Zhang

However, the target domain is absolutely unknown during the training on the source domain, which results in lacking directed guidance for target tasks.

cross-domain few-shot learning Weakly-Supervised Object Localization

Information Symmetry Matters: A Modal-Alternating Propagation Network for Few-Shot Learning

no code implementations 3 Sep 2021 Zhong Ji, Zhishen Hou, Xiyao Liu, Yanwei Pang, Jungong Han

Semantic information provides intra-class consistency and inter-class discriminability beyond visual concepts, which has been employed in Few-Shot Learning (FSL) to achieve further gains.

Attribute Few-Shot Learning

Shape Prior Non-Uniform Sampling Guided Real-time Stereo 3D Object Detection

no code implementations 18 Jun 2021 Aqi Gao, Jiale Cao, Yanwei Pang

Compared with the baseline RTS3D, our proposed method achieves a 2.57% improvement in AP3d with almost no extra network parameters.

3D Object Detection Object +1

From Handcrafted to Deep Features for Pedestrian Detection: A Survey

2 code implementations 1 Oct 2020 Jiale Cao, Yanwei Pang, Jin Xie, Fahad Shahbaz Khan, Ling Shao

In addition to single-spectral pedestrian detection, we also review multi-spectral pedestrian detection, which provides more robust features for illumination variance.

Pedestrian Detection Survey

SipMask: Spatial Information Preservation for Fast Image and Video Instance Segmentation

1 code implementation ECCV 2020 Jiale Cao, Rao Muhammad Anwer, Hisham Cholakkal, Fahad Shahbaz Khan, Yanwei Pang, Ling Shao

In terms of real-time capabilities, SipMask outperforms YOLACT with an absolute gain of 3.0% (mask AP) under similar settings, while operating at comparable speed on a Titan Xp.

object-detection Object Detection +4

Consensus-Aware Visual-Semantic Embedding for Image-Text Matching

1 code implementation ECCV 2020 Haoran Wang, Ying Zhang, Zhong Ji, Yanwei Pang, Lin Ma

In this paper, we propose a Consensus-aware Visual-Semantic Embedding (CVSE) model to incorporate the consensus information, namely the commonsense knowledge shared between both modalities, into image-text matching.

Image Captioning Image-text matching +2

BidNet: Binocular Image Dehazing Without Explicit Disparity Estimation

1 code implementation CVPR 2020 Yanwei Pang, Jing Nie, Jin Xie, Jungong Han, Xuelong Li

On the assumption that dehazed binocular images are superior to the hazy ones for stereo vision tasks such as 3D object detection and according to the fact that image haze is a function of depth, this paper proposes a Binocular image dehazing Network (BidNet) aiming at dehazing both the left and right images of binocular images within the deep learning framework.

3D Object Detection Disparity Estimation +2

Hierarchical Human Parsing with Typed Part-Relation Reasoning

1 code implementation CVPR 2020 Wenguan Wang, Hailong Zhu, Jifeng Dai, Yanwei Pang, Jianbing Shen, Ling Shao

As human bodies are inherently hierarchically structured, how to model human structures is the central theme in this task.

Human Parsing Relation

PSC-Net: Learning Part Spatial Co-occurrence for Occluded Pedestrian Detection

no code implementations 25 Jan 2020 Jin Xie, Yanwei Pang, Hisham Cholakkal, Rao Muhammad Anwer, Fahad Shahbaz Khan, Ling Shao

On the heavily occluded (HO) set of the CityPersons test set, our PSC-Net obtains an absolute gain of 4.0% in log-average miss rate over the state of the art with the same backbone and input scale, and without using additional VBB supervision.

Pedestrian Detection

Learning Compositional Neural Information Fusion for Human Parsing

1 code implementation ICCV 2019 Wenguan Wang, Zhijie Zhang, Siyuan Qi, Jianbing Shen, Yanwei Pang, Ling Shao

The bottom-up and top-down inferences explicitly model the compositional and decompositional relations in human bodies, respectively.

Human Parsing

NETNet: Neighbor Erasing and Transferring Network for Better Single Shot Object Detection

no code implementations CVPR 2020 Yazhao Li, Yanwei Pang, Jianbing Shen, Jiale Cao, Ling Shao

With this observation, we propose a new Neighbor Erasing and Transferring (NET) mechanism to reconfigure the pyramid features and explore scale-aware features.

Object object-detection +1

Mask-Guided Attention Network for Occluded Pedestrian Detection

1 code implementation ICCV 2019 Yanwei Pang, Jin Xie, Muhammad Haris Khan, Rao Muhammad Anwer, Fahad Shahbaz Khan, Ling Shao

Our approach obtains an absolute gain of 9.5% in log-average miss rate, compared to the best reported results on the heavily occluded (HO) pedestrian set of the CityPersons test set.

Pedestrian Detection

Towards Bridging Semantic Gap to Improve Semantic Segmentation

no code implementations ICCV 2019 Yanwei Pang, Yazhao Li, Jianbing Shen, Ling Shao

By embedding these two strategies, we construct a parallel feature pyramid towards improving multi-level feature fusion.

Segmentation Semantic Segmentation

A Semantics-Guided Class Imbalance Learning Model for Zero-Shot Classification

no code implementations 26 Aug 2019 Zhong Ji, Xuejie Yu, Yunlong Yu, Yanwei Pang, Zhongfei Zhang

Towards alleviating the class imbalance issue in ZSC, we propose a sample-balanced training process to encourage all training classes to contribute equally to the learned model.

General Classification Image Classification +2

Saliency-Guided Attention Network for Image-Sentence Matching

no code implementations ICCV 2019 Zhong Ji, Haoran Wang, Jungong Han, Yanwei Pang

Concretely, the saliency detector provides the visual saliency information as the guidance for the two attention modules.


Stacked Semantics-Guided Attention Model for Fine-Grained Zero-Shot Learning

1 code implementation NeurIPS 2018 Yunlong Yu, Zhong Ji, Yanwei Fu, Jichang Guo, Yanwei Pang, Zhongfei (Mark) Zhang

Zero-Shot Learning (ZSL) is generally achieved via aligning the semantic relationships between the visual features and the corresponding class semantic descriptions.

General Classification Multi-class Classification +2

Bi-Adversarial Auto-Encoder for Zero-Shot Learning

no code implementations 20 Nov 2018 Yunlong Yu, Zhong Ji, Yanwei Pang, Jichang Guo, Zhongfei Zhang, Fei Wu

Existing generative Zero-Shot Learning (ZSL) methods only consider the unidirectional alignment from the class semantics to the visual features while ignoring the alignment from the visual features to the class semantics, which fails to construct the visual-semantic interactions well.

Decoder Zero-Shot Learning

Triply Supervised Decoder Networks for Joint Detection and Segmentation

no code implementations CVPR 2019 Jiale Cao, Yanwei Pang, Xuelong Li

Experimental results on the VOC2007 and VOC2012 datasets demonstrate that the proposed TripleNet is able to improve both the detection and segmentation accuracies without adding extra computational costs.

Decoder object-detection +4

Stacked Semantic-Guided Attention Model for Fine-Grained Zero-Shot Learning

no code implementations 21 May 2018 Yunlong Yu, Zhong Ji, Yanwei Fu, Jichang Guo, Yanwei Pang, Zhongfei Zhang

To this end, we propose a novel stacked semantics-guided attention (S2GA) model to obtain semantic relevant features by using individual class semantic features to progressively guide the visual features to generate an attention map for weighting the importance of different local regions.

General Classification Multi-class Classification +2

Exploring Multi-Branch and High-Level Semantic Networks for Improving Pedestrian Detection

no code implementations 3 Apr 2018 Jiale Cao, Yanwei Pang, Xuelong Li

In this paper, we propose a multi-branch and high-level semantic network by gradually splitting a base network into multiple different branches.

object-detection Object Detection +1

A Cascaded Convolutional Neural Network for Single Image Dehazing

no code implementations 21 Mar 2018 Chongyi Li, Jichang Guo, Fatih Porikli, Huazhu Fu, Yanwei Pang

Different from previous learning-based methods, we propose a flexible cascaded CNN for single hazy image restoration, which considers the medium transmission and global atmospheric light jointly by two task-driven subnetworks.

Image Dehazing Image Restoration +1

Attribute-Guided Network for Cross-Modal Zero-Shot Hashing

no code implementations 6 Feb 2018 Zhong Ji, Yuxin Sun, Yunlong Yu, Yanwei Pang, Jungong Han

To address the Cross-Modal Zero-Shot Hashing (CMZSH) retrieval task, we propose a novel Attribute-Guided Network (AgNet), which can perform not only IBIR, but also Text-Based Image Retrieval (TBIR).

Attribute Cross-Modal Retrieval +3

Simultaneously Learning Neighborship and Projection Matrix for Supervised Dimensionality Reduction

no code implementations 9 Sep 2017 Yanwei Pang, Bo Zhou, Feiping Nie

It is interesting that the optimal regularization parameter is adaptive to the neighbors in low-dimensional space and has intuitive meaning.

Supervised dimensionality reduction

Video Summarization with Attention-Based Encoder-Decoder Networks

no code implementations 31 Aug 2017 Zhong Ji, Kailin Xiong, Yanwei Pang, Xuelong Li

This paper addresses the problem of supervised video summarization by formulating it as a sequence-to-sequence learning problem, where the input is a sequence of original video frames and the output is a keyshot sequence.

Ranked #4 on Video Summarization on TvSum (using extra training data)

Decoder Supervised Video Summarization

Query-Aware Sparse Coding for Multi-Video Summarization

no code implementations 13 Jul 2017 Zhong Ji, Yaru Ma, Yanwei Pang, Xuelong Li

Given the explosive growth of online videos, it is becoming increasingly important to relieve the tedious work of browsing and managing the video content of interest.

Video Summarization

Semantic Softmax Loss for Zero-Shot Learning

no code implementations 22 May 2017 Zhong Ji, Yunxin Sun, Yulong Yu, Jichang Guo, Yanwei Pang

However, because the visual features and the class semantic descriptors lie in different structural spaces, a linear or bilinear model cannot capture the semantic interactions between different modalities well.

Classification General Classification +3

Transductive Zero-Shot Learning with Adaptive Structural Embedding

no code implementations 27 Mar 2017 Yunlong Yu, Zhong Ji, Jichang Guo, Yanwei Pang

Two fundamental challenges in it are visual-semantic embedding and domain adaptation in cross-modality learning and unseen class prediction steps, respectively.

Domain Adaptation Zero-Shot Learning

Zero-Shot Learning with Multi-Battery Factor Analysis

no code implementations 30 Jun 2016 Zhong Ji, Yuzhong Xie, Yanwei Pang, Lei Chen, Zhongfei Zhang

Zero-shot learning (ZSL) extends the conventional image classification technique to a more challenging situation where the test image categories are not seen in the training samples.

Image Classification Zero-Shot Learning

Convolution in Convolution for Network in Network

no code implementations 22 Mar 2016 Yanwei Pang, Manli Sun, Xiaoheng Jiang, Xuelong Li

In this paper, we propose to replace dense shallow MLP with sparse shallow MLP.

Learning Multilayer Channel Features for Pedestrian Detection

no code implementations 1 Mar 2016 Jiale Cao, Yanwei Pang, Xuelong Li

For example, CNN classifies these proposals by the full-connected layer features while proposal scores and the features in the inner-layers of CNN are ignored.

Pedestrian Detection

Cascaded Subpatch Networks for Effective CNNs

no code implementations 1 Mar 2016 Xiaoheng Jiang, Yanwei Pang, Manli Sun, Xuelong Li

The first one is a linear filter of spatial size $ h\times w $ and is aimed at extracting features from spatial domain.

Moving Object Detection in Video Using Saliency Map and Subspace Learning

no code implementations 30 Sep 2015 Yanwei Pang, Li Ye, Xuelong Li, Jing Pan

So there are undesirable false alarms and missed alarms in many algorithms of moving object detection.

Moving Object Detection object-detection

Learning Sampling Distributions for Efficient Object Detection

no code implementations 23 Aug 2015 Yanwei Pang, Jiale Cao, Xuelong Li

Multistage particle windows (MPW), proposed by Gualdi et al., is an algorithm for fast and accurate object detection.

Face Detection Object +2

Cascade Learning by Optimally Partitioning

no code implementations 18 Aug 2015 Yanwei Pang, Jiale Cao, Xuelong Li

iCascade searches for the optimal number $r_i$ of weak classifiers of each stage $i$ by directly minimizing the computation cost of the cascade.

Face Detection object-detection +1
