no code implementations • 12 Jun 2025 • Yu Huang, Zelin Peng, Yichen Zhao, Piao Yang, Xiaokang Yang, Wei Shen
In this paper, we introduce medical image reasoning segmentation, a novel task that aims to generate segmentation masks based on complex and implicit medical instructions.
no code implementations • CVPR 2025 • Feilong Tang, Chengzhi Liu, Zhongxing Xu, Ming Hu, Zelin Peng, Zhiwei Yang, Jionglong Su, Minquan Lin, Yifan Peng, Xuelian Cheng, Imran Razzak, ZongYuan Ge
With this goal, we present FarSight, a versatile plug-and-play decoding strategy to reduce attention interference from outlier tokens merely by optimizing the causal mask.
no code implementations • 16 Apr 2025 • Guanchun Wang, Xiangrong Zhang, Yifei Zhang, Zelin Peng, Tianyang Zhang, Xu Tang, Licheng Jiao
Unsupervised anomaly detection in hyperspectral images (HSI), aiming to detect unknown targets from backgrounds, is challenging for earth surface monitoring.
1 code implementation • 23 Mar 2025 • Yiheng Zhong, Zihong Luo, Chengzhi Liu, Feilong Tang, Zelin Peng, Ming Hu, Yingzhen Hu, Jionglong Su, ZongYuan Ge, Imran Razzak
To address this, we propose Prior-Guided SAM (PG-SAM), which employs a fine-grained modality prior aligner to leverage specialized medical knowledge for better modality alignment.
no code implementations • 20 Mar 2025 • Haolin Yang, Feilong Tang, Ming Hu, Yulong Li, Yexin Liu, Zelin Peng, Junjun He, ZongYuan Ge, Imran Razzak
Specifically, we perform one-step denoising to convert initial noises into a clip and subsequently evaluate its long-term value, leveraging a reward model anchored by previously generated content.
no code implementations • CVPR 2025 • Changsong Wen, Zelin Peng, Yu Huang, Xiaokang Yang, Wei Shen
The text prompts guide DG model learning in three aspects: feature suppression, which uses these prompts to identify domain-sensitive features and suppress them; feature consistency, which ensures the model's features are robust to domain variations imitated by the diverse prompts; and feature diversification, which diversifies features based on the prompts to mitigate bias.
no code implementations • CVPR 2025 • Zelin Peng, Zhengqin Xu, Zhilin Zeng, Changsong Wen, Yu Huang, Menglin Yang, Feilong Tang, Wei Shen
In this work, we explain this phenomenon from the perspective of hierarchical alignment, since during fine-tuning, the hierarchy level of image embeddings shifts from image-level to pixel-level.
Open Vocabulary Semantic Segmentation
no code implementations • CVPR 2025 • Zelin Peng, Yu Huang, Zhengqin Xu, Feilong Tang, Ming Hu, Xiaokang Yang, Wei Shen
Contextual modeling is crucial for robust visual representation learning, especially in computer vision.
no code implementations • CVPR 2025 • Yuheng Feng, Changsong Wen, Zelin Peng, Li jiaye, Siyu Zhu
Contrastive language-image pretraining models such as CLIP have demonstrated remarkable performance in various text-image alignment tasks.
no code implementations • 23 Nov 2024 • Ming Hu, Kun Yuan, Yaling Shen, Feilong Tang, Xiaohao Xu, Lin Zhou, Wei Li, Ying Chen, Zhongxing Xu, Zelin Peng, Siyuan Yan, Vinkle Srivastav, Diping Song, Tianbin Li, Danli Shi, Jin Ye, Nicolas Padoy, Nassir Navab, Junjun He, ZongYuan Ge
Surgical practice involves complex visual interpretation, procedural skills, and advanced medical knowledge, making surgical vision-language pretraining (VLP) particularly challenging due to this complexity and the limited availability of annotated data.
no code implementations • CVPR 2025 • Zelin Peng, Zhengqin Xu, Zhilin Zeng, Yaoming Wang, Wei Shen
Since the PEFT strategy is applied symmetrically to the two CLIP modalities, the misalignment between them is mitigated.
Open Vocabulary Semantic Segmentation
1 code implementation • 28 Apr 2024 • Guanchun Wang, Xiangrong Zhang, Zelin Peng, Tianyang Zhang, Licheng Jiao
In S$^2$Mamba, two selective structured state space models operating along different dimensions are designed for feature extraction, one spatial and the other spectral, along with a spatial-spectral mixture gate for optimal fusion.
no code implementations • CVPR 2024 • Zelin Peng, Zhengqin Xu, Zhilin Zeng, Lingxi Xie, Qi Tian, Wei Shen
Parameter-efficient fine-tuning (PEFT) is an effective methodology to unleash the potential of large foundation models in novel scenarios with limited training data.
no code implementations • 28 Aug 2023 • Zelin Peng, Zhengqin Xu, Zhilin Zeng, Xiaokang Yang, Wei Shen
Most existing fine-tuning methods attempt to bridge the gaps among different scenarios by introducing a set of new parameters to modify SAM's original parameter space.
no code implementations • ICCV 2023 • Zelin Peng, Guanchun Wang, Lingxi Xie, Dongsheng Jiang, Wei Shen, Qi Tian
Seed area generation is usually the starting point of weakly supervised semantic segmentation (WSSS).
no code implementations • 4 Jul 2022 • Wei Shen, Zelin Peng, Xuehui Wang, Huayu Wang, Jiazhong Cen, Dongsheng Jiang, Lingxi Xie, Xiaokang Yang, Qi Tian
Next, we summarize the existing label-efficient image segmentation methods from a unified perspective that discusses an important question: how to bridge the gap between weak supervision and dense prediction. The current methods are mostly based on heuristic priors, such as cross-pixel similarity, cross-label constraint, cross-view consistency, and cross-image relation.
no code implementations • 21 Apr 2022 • Guanchun Wang, Xiangrong Zhang, Zelin Peng, Xu Tang, Huiyu Zhou, Licheng Jiao
In the exploiting stage, we utilize the extracted NDI to construct a novel negative contrastive learning mechanism and a negative guided instance selection strategy for dealing with the issues of part domination and missing instances, respectively.
no code implementations • 3 Aug 2021 • Xiangrong Zhang, Zelin Peng, Peng Zhu, Tianyang Zhang, Chen Li, Huiyu Zhou, Licheng Jiao
Semantic segmentation has been continuously investigated over the last ten years, and the majority of the established technologies are based on supervised models.