no code implementations • 13 Apr 2025 • Xiang Hu, Pingping Zhang, Yuhao Wang, Bin Yan, Huchuan Lu
Furthermore, we propose the View-Refine Decoder (VRD) to obtain additional controllable conditions to generate missing cross-view features.
no code implementations • 31 Mar 2025 • Xiang Hu, Yuhao Wang, Pingping Zhang, Huchuan Lu
Then, with these features, we propose a Prompted Attribute Classifier Group (PACG) to generate person attribute predictions and obtain the encoded representations of predicted attributes.
no code implementations • 13 Mar 2025 • Yuhao Wang, Yongfeng Lv, Pingping Zhang, Huchuan Lu
Extensive experiments on three multi-modal object ReID benchmarks demonstrate the effectiveness of our proposed method.
1 code implementation • 15 Jan 2025 • Sitong Gong, Yunzhi Zhuge, Lu Zhang, Zongxin Yang, Pingping Zhang, Huchuan Lu
Existing methods for Video Reasoning Segmentation rely heavily on a single special token to represent the object in the keyframe or the entire video, inadequately capturing spatial complexity and inter-frame motion.
Ranked #1 on Referring Video Object Segmentation on ReVOS
1 code implementation • 14 Jan 2025 • Sitong Gong, Yunzhi Zhuge, Lu Zhang, Yifan Wang, Pingping Zhang, Lijun Wang, Huchuan Lu
To perform multi-modal fusion, we propose the Modality Aggregation Decoder, leveraging the Vision-to-Audio Fusion Block to integrate visual features into audio features across both frame and temporal levels.
1 code implementation • 27 Dec 2024 • Chengyang Ye, Yunzhi Zhuge, Pingping Zhang
In this work, we introduce Open-Vocabulary Remote Sensing Image Semantic Segmentation (OVRSISS), which aims to segment arbitrary semantic classes in remote sensing images.
1 code implementation • 23 Dec 2024 • Yuhao Wang, Pingping Zhang, Xuehu Liu, Zhengzheng Tu, Huchuan Lu
We propose a novel fusion framework called FusionReID to unify the strengths of CNNs and Transformers for image-based person ReID.
1 code implementation • 14 Dec 2024 • Yuhao Wang, Xuehu Liu, Tianyu Yan, Yang Liu, Aihua Zheng, Pingping Zhang, Huchuan Lu
Furthermore, current multi-modal aggregation methods have obvious limitations in dealing with long sequences from different modalities.
1 code implementation • 14 Dec 2024 • Yuhao Wang, Yang Liu, Aihua Zheng, Pingping Zhang
To address these issues, we propose a novel feature learning framework called DeMo for multi-modal object ReID, which adaptively balances decoupled features using a mixture of experts.
no code implementations • 19 Nov 2024 • Kecheng Chen, Pingping Zhang, Hui Liu, Jie Liu, Yibing Liu, Jiaxin Huang, Shiqi Wang, Hong Yan, Haoliang Li
We have recently witnessed that "Intelligence" and "Compression" are two sides of the same coin, where the large language model (LLM) with unprecedented intelligence is a general-purpose lossless compressor for various data modalities.
no code implementations • 16 Oct 2024 • Kecheng Chen, Pingping Zhang, Tiexin Qin, Shiqi Wang, Hong Yan, Haoliang Li
Current test- or compression-time adaptation image compression (TTA-IC) approaches, which leverage both latent and decoder refinements as a two-step adaptation scheme, have potentially enhanced the rate-distortion (R-D) performance of learned image compression models on cross-domain compression tasks, e.g., from natural to screen content images.
no code implementations • 15 Aug 2024 • Pingping Zhang, Jinlong Li, Kecheng Chen, Meng Wang, Long Xu, Haoliang Li, Nicu Sebe, Sam Kwong, Shiqi Wang
Existing codecs are designed to eliminate intrinsic redundancies to create a compact representation for compression.
1 code implementation • 8 Aug 2024 • Shixuan Gao, Pingping Zhang, Tianyu Yan, Huchuan Lu
Finally, we propose a Detail Enhancement Module (DEM) to incorporate SAM with fine-grained details.
1 code implementation • 31 Jul 2024 • Kuo Wang, Lechao Cheng, Weikai Chen, Pingping Zhang, Liang Lin, Fan Zhou, Guanbin Li
Learning from pseudo-labels generated with Vision Language Models (VLMs) has been shown in recent studies to be a promising solution for assisting open vocabulary detection (OVD).
1 code implementation • 9 Jul 2024 • Xiangyu Liao, Tianheng Zheng, Jiayu Zhong, Pingping Zhang, Chao Ren
In recent years, self-supervised denoising methods have gained significant success and become critically important in the field of image restoration.
1 code implementation • 24 Apr 2024 • Tianyu Yan, Zifu Wan, Xinhao Deng, Pingping Zhang, Yang Liu, Huchuan Lu
In underwater scenes, it exhibits substantial performance degradation due to light scattering and absorption.
Ranked #1 on Image Segmentation on RMAS
no code implementations • 23 Apr 2024 • Yingquan Wang, Pingping Zhang, Dong Wang, Huchuan Lu
In this work, we first explore the influence of global and local features of ViT and then further propose a novel Global-Local Transformer (GLTrans) for high-performance object Re-ID.
no code implementations • 15 Apr 2024 • Xiangrui Liu, Xinju Wu, Pingping Zhang, Shiqi Wang, Zhu Li, Sam Kwong
Gaussian splatting, renowned for its exceptional rendering quality and efficiency, has emerged as a prominent technique in 3D scene representation.
1 code implementation • CVPR 2024 • Pingping Zhang, Tianyu Yan, Yang Liu, Huchuan Lu
To this end, we first introduce a dual structure with SAM's paradigm to enhance feature learning of marine images.
1 code implementation • 5 Apr 2024 • Zifu Wan, Pingping Zhang, Yuhao Wang, Silong Yong, Simon Stepputtis, Katia Sycara, Yaqi Xie
Multi-modal semantic segmentation significantly enhances AI agents' perception and scene understanding, especially under adverse conditions like low-light or overexposed environments.
Ranked #4 on Thermal Image Segmentation on PST900
2 code implementations • CVPR 2024 • Pingping Zhang, Yuhao Wang, Yang Liu, Zhengzheng Tu, Huchuan Lu
To address the above issues, we propose a novel learning framework named EDITOR to select diverse tokens from vision Transformers for multi-modal object ReID.
1 code implementation • 15 Dec 2023 • Chenyang Yu, Xuehu Liu, Yingquan Wang, Pingping Zhang, Huchuan Lu
Technically, TMC allows the frame-level memories in a sequence to communicate with each other, and to extract temporal information based on the relations within the sequence.
1 code implementation • 15 Dec 2023 • Yuhao Wang, Xuehu Liu, Pingping Zhang, Hu Lu, Zhengzheng Tu, Huchuan Lu
In addition, most current Transformer-based ReID methods only utilize the global feature of class tokens to achieve holistic retrieval, ignoring the local discriminative ones.
1 code implementation • 15 Dec 2023 • Shang Gao, Chenyang Yu, Pingping Zhang, Huchuan Lu
In addition, existing occluded person ReID benchmarks utilize occluded samples as queries, which will amplify the role of alleviating occlusion interference and underestimate the impact of the feature absence issue.
1 code implementation • 4 Dec 2023 • Bingkun Nian, Fenghe Tang, Jianrui Ding, Pingping Zhang, Jie Yang, S. Kevin Zhou, Wei Liu
In this paper, we present a high-performance deep neural network for weak target image segmentation, including medical image segmentation and infrared image segmentation.
1 code implementation • 22 Oct 2023 • Tianyu Yan, Zifu Wan, Pingping Zhang, Gong Cheng, Huchuan Lu
To relieve these issues, in this work we propose a novel Transformer-based learning framework named TransY-Net for remote sensing image CD, which improves the feature extraction from a global view and combines multi-level visual features in a pyramid manner.
1 code implementation • 7 Aug 2023 • Xinhao Deng, Pingping Zhang, Wei Liu, Huchuan Lu
To address the above issues, in this work, we first propose a new HRS10K dataset, which contains 10,500 high-quality annotated images at 2K-8K resolution.
no code implementations • 7 Aug 2023 • Xuehu Liu, Pingping Zhang, Huchuan Lu
Meanwhile, to extract short-term representations, we propose a Bi-direction Motion Estimator (BME), in which reciprocal motion information is efficiently extracted from consecutive frames.
Representation Learning
Video-Based Person Re-Identification
no code implementations • 2 May 2023 • Xinju Wu, Pingping Zhang, Meng Wang, Peilin Chen, Shiqi Wang, Sam Kwong
The emergence of digital avatars has driven an exponential increase in the demand for human point clouds with realistic and intricate details.
1 code implementation • 27 Apr 2023 • Xuehu Liu, Chenyang Yu, Pingping Zhang, Huchuan Lu
Further, in the spatial domain, we propose a Complementary Content Attention (CCA) to take advantage of the coupled structure and guide independent features for spatial complementary learning.
1 code implementation • 1 Dec 2022 • Hu Lu, Xuezhang Zou, Pingping Zhang
Visible-Infrared Person Re-Identification (VI-ReID) is a challenging retrieval task under complex modality changes.
1 code implementation • 3 Oct 2022 • Tianyu Yan, Zifu Wan, Pingping Zhang
Then, we introduce a pyramid structure to aggregate multi-level visual features from Transformers for feature enhancement.
no code implementations • 20 Feb 2022 • Pingping Zhang, Xu Wang, Linwei Zhu, Yun Zhang, Shiqi Wang, Sam Kwong
In this paper, we propose a distortion-aware loop filtering model to improve the performance of intra coding for 360° videos projected via equirectangular projection (ERP) format.
1 code implementation • CVPR 2022 • Wenhui Wu, Jian Weng, Pingping Zhang, Xu Wang, Wenhan Yang, Jianmin Jiang
Retinex model-based methods have been shown to be effective in layer-wise manipulation with well-designed priors for low-light image enhancement.
no code implementations • 3 Dec 2021 • Lianjie Jia, Chenyang Yu, Xiehao Ye, Tianyu Yan, Yinjie Lei, Pingping Zhang
To generate high-quality pseudo-labels and mitigate the impact of clustering errors, we propose a novel clustering relationship modeling framework for unsupervised person Re-ID.
no code implementations • 5 Aug 2021 • Duo Peng, Yinjie Lei, Lingqiao Liu, Pingping Zhang, Jun Liu
In this work, we propose two simple yet effective texture randomization mechanisms, Global Texture Randomization (GTR) and Local Texture Randomization (LTR), for Domain Generalization based SRSS.
1 code implementation • ICCV 2021 • Duo Peng, Yinjie Lei, Wen Li, Pingping Zhang, Yulan Guo
Domain adaptation is critical for success when confronting the lack of annotations in a new domain.
1 code implementation • 15 Jul 2021 • Wei Liu, Pingping Zhang, Yinjie Lei, Xiaolin Huang, Jie Yang, Michael Ng
The effectiveness and superior performance of our approach are validated through comprehensive experiments in a range of applications.
1 code implementation • 13 Jul 2021 • Guowen Zhang, Pingping Zhang, Jinqing Qi, Huchuan Lu
In this work, we take advantage of both CNNs and Transformers, and propose a novel learning framework named Hierarchical Aggregation Transformer (HAT) for image-based person Re-ID with high performance.
no code implementations • 5 Apr 2021 • Xuehu Liu, Pingping Zhang, Chenyang Yu, Huchuan Lu, Xuesheng Qian, Xiaoyun Yang
To capture richer perceptions and extract more comprehensive video representations, in this paper we propose a novel framework named Trigeminal Transformers (TMT) for video-based person Re-ID.
1 code implementation • CVPR 2021 • Xuehu Liu, Pingping Zhang, Chenyang Yu, Huchuan Lu, Xiaoyun Yang
Specifically, we first propose a Global-guided Correlation Estimation (GCE) to generate feature correlation maps of local features and global features, which help to localize the high- and low-correlation regions for identifying the same person.
1 code implementation • ICCV 2021 • Yingquan Wang, Pingping Zhang, Shang Gao, Xia Geng, Hu Lu, Dong Wang
Video-based person re-identification aims to associate the video clips of the same person across multiple non-overlapping cameras.
Ranked #1 on Person Re-Identification on DukeMTMC-VideoReID
no code implementations • 19 Oct 2020 • Yinjie Lei, Duo Peng, Pingping Zhang, Qiuhong Ke, Haifeng Li
Based on the MPFL strategy, our framework achieves a novel approach to adapt to the scale and location diversities of the scene change regions.
no code implementations • ECCV 2020 • Yan Liu, Lingqiao Liu, Peng Wang, Pingping Zhang, Yinjie Lei
Most existing crowd counting systems rely on the availability of the object location annotation which can be expensive to obtain.
no code implementations • 29 Feb 2020 • Yinjie Lei, Yan Liu, Pingping Zhang, Lingqiao Liu
Most existing crowd counting methods require object location-level annotation, i.e., placing a dot at the center of an object.
1 code implementation • 24 Feb 2020 • Runmin Wu, Kunyao Zhang, Lijun Wang, Yue Wang, Pingping Zhang, Huchuan Lu, Yizhou Yu
Though recent research has achieved remarkable progress in generating realistic images with generative adversarial networks (GANs), the lack of training stability is still a lingering concern of most GANs, especially on high-resolution inputs and complex datasets.
no code implementations • 15 Nov 2019 • Yanjie Gou, Yinjie Lei, Lingqiao Liu, Pingping Zhang, Xi Peng
To account for this style shift, the model should adjust its parameters in accordance with entity types.
no code implementations • 8 Oct 2019 • Pingping Zhang, Wei Liu, Yinjie Lei, Hongyu Wang, Huchuan Lu
The proposed method consists of three modules, i.e., recurrent FCNs, adaptive multiphase level set, and deeply supervised learning.
1 code implementation • ICCV 2019 • Yi Zeng, Pingping Zhang, Jianming Zhang, Zhe Lin, Huchuan Lu
This paper pushes forward high-resolution saliency detection, and contributes a new dataset, named High-Resolution Salient Object Detection (HRSOD).
Ranked #12 on RGB Salient Object Detection on DAVIS-S (using extra training data)
no code implementations • ICCV 2019 • Pingping Zhang, Wei Liu, Yinjie Lei, Huchuan Lu, Xiaoyun Yang
To address these issues, in this work we propose a novel deep learning framework, named Cascaded Context Pyramid Network (CCPNet), to jointly infer the occupancy and semantic labels of a volumetric 3D scene from a single depth image.
Ranked #6 on 3D Semantic Scene Completion on NYUv2 (using extra training data)
1 code implementation • 23 Jul 2019 • Wei Liu, Pingping Zhang, Yinjie Lei, Xiaolin Huang, Jie Yang, Ian Reid
In this paper, a non-convex non-smooth optimization framework is proposed to achieve diverse smoothing natures where even contradictive smoothing behaviors can be achieved.
no code implementations • 1 Mar 2019 • Yinjie Lei, Ziqin Zhou, Pingping Zhang, Yulan Guo, Zijun Ma, Lingqiao Liu
A sketch based 3D shape retrieval
no code implementations • 21 Jan 2019 • Pingping Zhang, Wei Liu, Huchuan Lu, Chunhua Shen
Inspired by the intrinsic reflection of natural images, in this paper we propose a novel feature learning framework for large-scale salient object detection.
no code implementations • 28 Sep 2018 • Yunzhi Zhuge, Pingping Zhang, Huchuan Lu
Fully convolutional networks (FCNs) have significantly improved the performance of many pixel-labeling tasks, such as semantic segmentation and depth estimation.
no code implementations • 4 Aug 2018 • Pingping Zhang, Huchuan Lu, Chunhua Shen
In addition, our work has text overlap with arXiv:1804.06242, arXiv:1705.00938 by other authors.
no code implementations • 14 Apr 2018 • Pingping Zhang, Huchuan Lu, Chunhua Shen
Salient object detection (SOD), which aims to find the most important region of interest and segment the relevant object/item in that area, is an important yet challenging vision task.
no code implementations • 22 Feb 2018 • Pingping Zhang, Wei Liu, Dong Wang, Yinjie Lei, Hongyu Wang, Chunhua Shen, Huchuan Lu
Extensive experiments demonstrate that the proposed algorithm achieves competitive performance in both saliency detection and visual tracking, especially outperforming other related trackers on the non-rigid object tracking datasets.
no code implementations • 22 Feb 2018 • Ju Dai, Pingping Zhang, Huchuan Lu, Hongyu Wang
In this paper, we propose a novel feature learning framework for video person re-identification (re-ID).
no code implementations • 20 Feb 2018 • Pingping Zhang, Luyao Wang, Dong Wang, Huchuan Lu, Chunhua Shen
This paper proposes an Agile Aggregating Multi-Level feaTure framework (Agile Amulet) for salient object detection.
no code implementations • 20 Feb 2018 • Fei Li, Pingping Zhang, Huchuan Lu
Band selection is a direct and effective method to remove redundant information and reduce the spectral dimension for decreasing computational complexity and avoiding the curse of dimensionality.
no code implementations • 19 Feb 2018 • Pingping Zhang, Wei Liu, Huchuan Lu, Chunhua Shen
Inspired by the intrinsic reflection of natural images, in this paper we propose a novel feature learning framework for large-scale salient object detection.
no code implementations • 30 Jan 2018 • Jie Yang, Pingping Zhang, Yan Liu
The numerical results show that there is no significant reduction in the classification ability of the network if the input signals are subject to sinusoidal and Gaussian perturbations.
1 code implementation • ICCV 2017 • Tiantian Wang, Ali Borji, Lihe Zhang, Pingping Zhang, Huchuan Lu
To remedy this problem, here we propose to augment feedforward neural networks with a novel pyramid pooling module and a multi-stage refinement mechanism for saliency detection.
Ranked #15 on RGB Salient Object Detection on DUTS-TE (max F-measure metric)
1 code implementation • ICCV 2017 • Pingping Zhang, Dong Wang, Huchuan Lu, Hongyu Wang, Xiang Ruan
In addition, to achieve accurate boundary inference and semantic enhancement, edge-aware feature maps in low-level layers and the predicted results of low resolution features are recursively embedded into the learning framework.
Ranked #21 on RGB Salient Object Detection on DUTS-TE (max F-measure metric)
1 code implementation • ICCV 2017 • Pingping Zhang, Dong Wang, Huchuan Lu, Hongyu Wang, Bao-Cai Yin
In this paper, we propose a novel deep fully convolutional network model for accurate salient object detection.
Ranked #5 on Saliency Detection on DUT-OMRON