1 code implementation • 12 Feb 2025 • Mingyu Xing, Lechao Cheng, Shengeng Tang, Yaxiong Wang, Zhun Zhong, Meng Wang
We introduce Knowledge Swapping, a novel task designed to selectively regulate the knowledge of a pretrained model by enabling the forgetting of user-specified information, retaining essential knowledge, and acquiring new knowledge simultaneously.
1 code implementation • 11 Feb 2025 • Fangwen Wu, Lechao Cheng, Shengeng Tang, Xiaofeng Zhu, Chaowei Fang, Dingwen Zhang, Meng Wang
Building on this insight, we propose a novel semantic drift calibration method that incorporates mean shift compensation and covariance calibration.
no code implementations • 5 Feb 2025 • Jiayi He, Shengeng Tang, Ao Liu, Lechao Cheng, Jingjing Wu, Yanyan Wei
This paper presents the HFUT-LMC team's solution to the WWW 2025 challenge on Text-based Person Anomaly Search (TPAS).
1 code implementation • 17 Dec 2024 • Zhenxing Zhang, Yaxiong Wang, Lechao Cheng, Zhun Zhong, Dan Guo, Meng Wang
We present ASAP, a new framework for detecting and grounding multi-modal media manipulation (DGM4). Upon thorough examination, we observe that accurate fine-grained cross-modal semantic alignment between the image and text is vital for accurate manipulation detection and grounding.
no code implementations • 25 Nov 2024 • Shengeng Tang, Jiayi He, Lechao Cheng, Jingjing Wu, Dan Guo, Richang Hong
To address this, we propose a novel framework, Sign-D2C, that employs a conditional diffusion model to synthesize contextually smooth transition frames, enabling the seamless construction of continuous sign language sequences.
no code implementations • 24 Nov 2024 • Yuting Ma, Shengeng Tang, Xiaohua Xu, Lechao Cheng
Federated learning (FL) has emerged as a powerful approach to safeguard data privacy by training models across distributed edge devices without centralizing local data.
no code implementations • 21 Nov 2024 • Lei Jiang, Weizhe Huang, Tongxuan Liu, Yuting Zeng, Jing Li, Lechao Cheng, Xiaohua Xu
Large Vision-Language Models (LVLMs) represent a significant advancement toward achieving superior multimodal capabilities by enabling powerful Large Language Models (LLMs) to understand visual input.
1 code implementation • 18 Nov 2024 • Lechao Cheng, KaiFeng Chen, Jiyang Li, Shengeng Tang, Shufei Zhang, Meng Wang
Learning from noisy data has become essential for adapting deep learning models to real-world applications.
1 code implementation • 16 Nov 2024 • Ying Yang, De Cheng, Chaowei Fang, Yubiao Wang, Changzhe Jiao, Lechao Cheng, Nannan Wang
The innovation of our approach is that we leverage the diffusion model's intrinsic data reconstruction ability to distinguish ID samples from OOD samples in the latent feature space.
Out-of-Distribution Detection
Out of Distribution (OOD) Detection
no code implementations • 23 Oct 2024 • Yaxiong Wang, Lianwei Wu, Lechao Cheng, Zhun Zhong, Meng Wang
Recent advancements in image-text matching have been notable, yet prevailing models predominantly cater to broad queries and struggle with accommodating fine-grained query intention.
1 code implementation • 22 Oct 2024 • Xiaoyi Han, Nan Pu, Zunlei Feng, Yijun Bei, Qifei Zhang, Lechao Cheng, Liang Xue
The current irregularities in existing public Fire and Smoke Detection (FSD) datasets have become a bottleneck in the advancement of FSD technology.
1 code implementation • 22 Oct 2024 • Xiaoyi Han, Yanfei Wu, Nan Pu, Zunlei Feng, Qifei Zhang, Yijun Bei, Lechao Cheng
To address this issue, a new Attentive Fire and Smoke Detection Model (a-FSDM) is proposed.
no code implementations • 16 Oct 2024 • Mingce Guo, Jingxuan He, Shengeng Tang, Zhangye Wang, Lechao Cheng
Text-driven video editing utilizing generative diffusion models has garnered significant attention due to their potential applications.
1 code implementation • 31 Jul 2024 • Kuo Wang, Lechao Cheng, Weikai Chen, Pingping Zhang, Liang Lin, Fan Zhou, Guanbin Li
Recent studies have shown that learning from pseudo-labels generated with Vision Language Models (VLMs) is a promising way to assist open vocabulary detection (OVD).
1 code implementation • 13 Jun 2024 • Yijun Bei, Hengrui Lou, Jinsong Geng, Erteng Liu, Lechao Cheng, Jie Song, Mingli Song, Zunlei Feng
Consequently, various face forgery detection techniques have been proposed to identify such fake facial content.
no code implementations • 27 May 2024 • Yuting Ma, Lechao Cheng, Yaxiong Wang, Zhun Zhong, Xiaohua Xu, Meng Wang
Specifically, we employ a local prompt tuning scheme that leverages a few learnable visual prompts to efficiently fine-tune the frozen pre-trained foundation model for downstream tasks, thereby accelerating training and improving model performance under limited local resources and data heterogeneity.
1 code implementation • 22 May 2024 • Dingwen Zhang, Hao Li, Diqi He, Nian Liu, Lechao Cheng, Jingdong Wang, Junwei Han
Experimental evaluations conducted on MS COCO, Cityscapes, and CTW1500 datasets indicate that the QEIS models' performance can be significantly improved when pre-trained with our method.
1 code implementation • 24 Apr 2024 • Yang Liu, Binglin Chen, Yongsen Zheng, Lechao Cheng, Guanbin Li, Liang Lin
Metro Origin-Destination (OD) prediction is a crucial yet challenging spatial-temporal prediction task in urban computing, which aims to accurately forecast cross-station ridership for optimizing metro scheduling and enhancing overall transport efficiency.
1 code implementation • 13 Apr 2024 • Jiyang Li, Lechao Cheng, Zhangye Wang, Tingting Mu, Jingxuan He
In this paper, inspired by significant progress in the field of novel view synthesis (NVS) achieved by 3D Gaussian Splatting (3D-GS), we propose LoopGaussian to elevate cinemagraph from 2D image space to 3D space using 3D Gaussian modeling.
1 code implementation • 9 Mar 2024 • Rui Yang, Haoran Liu, Edison Marrese-Taylor, Qingcheng Zeng, Yu He Ke, Wanxin Li, Lechao Cheng, Qingyu Chen, James Caverlee, Yutaka Matsuo, Irene Li
In this work, we develop an augmented LLM framework, KG-Rank, which leverages a medical knowledge graph (KG) along with ranking and re-ranking techniques, to improve the factuality of long-form question answering (QA) in the medical domain.
no code implementations • 4 Feb 2024 • Yuzhu Wang, Lechao Cheng, Chaowei Fang, Dingwen Zhang, Manni Duan, Meng Wang
Inspired by the observation that the prompt tokens tend to share high mutual information with patch tokens, we propose initializing prompts with downstream token prototypes.
Ranked #1 on Visual Prompt Tuning on VTAB-1k(Structured<8>)
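The idea of initializing prompts with downstream token prototypes can be illustrated with a minimal sketch. It assumes the prototypes are obtained by running a plain k-means over patch-token embeddings; the listing does not say how the prototypes are built, so the clustering choice, the function name `init_prompts_from_prototypes`, and all parameters below are hypothetical.

```python
import numpy as np


def init_prompts_from_prototypes(patch_tokens, num_prompts, iters=10, seed=0):
    """Return `num_prompts` prototype vectors to use as initial prompt tokens.

    Assumption: prototypes are k-means centroids of the downstream patch
    tokens. This is an illustrative choice, not the authors' stated method.
    """
    tokens = np.asarray(patch_tokens, dtype=float)  # shape (N, D)
    rng = np.random.default_rng(seed)
    # Start from randomly chosen, distinct patch tokens.
    start = rng.choice(len(tokens), size=num_prompts, replace=False)
    centers = tokens[start].copy()
    for _ in range(iters):
        # Squared distance from every token to every centre, shape (N, K).
        dists = ((tokens[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)
        for k in range(num_prompts):
            members = tokens[assign == k]
            if len(members) > 0:  # keep the old centre if a cluster is empty
                centers[k] = members.mean(axis=0)
    return centers  # shape (K, D)


# Toy usage: four 2-D "patch tokens" forming two obvious groups.
toy_tokens = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
prompts = init_prompts_from_prototypes(toy_tokens, num_prompts=2)
```

In a real prompt-tuning setup the returned array would be copied into the learnable prompt parameters before training, in place of a random initialization.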
1 code implementation • 25 Dec 2023 • Wentao Tian, Zheng Wang, Yuqian Fu, Jingjing Chen, Lechao Cheng
A comprehensive understanding of videos is inseparable from describing the action with its contextual action-object interactions.
1 code implementation • 14 Dec 2023 • Jingxuan He, Lechao Cheng, Chaowei Fang, Zunlei Feng, Tingting Mu, Mingli Song
Building upon this, we introduce a complementary self-enhancement method that constrains the semantic consistency between these confident regions and an augmented image with the same class labels.
Weakly supervised Semantic Segmentation
Weakly-Supervised Semantic Segmentation
no code implementations • 13 Dec 2023 • Yang Jiao, Zequn Jie, Shaoxiang Chen, Lechao Cheng, Jingjing Chen, Lin Ma, Yu-Gang Jiang
Camera-based bird-eye-view (BEV) perception paradigm has made significant progress in the autonomous driving field.
no code implementations • CVPR 2024 • Hao Li, Dingwen Zhang, Yalun Dai, Nian Liu, Lechao Cheng, Jingfeng Li, Jingdong Wang, Junwei Han
Applying NeRF to downstream perception tasks for scene understanding and representation is becoming increasingly popular.
no code implementations • 18 Nov 2023 • Haoran Li, Long Ma, Haolin Shi, Yanbin Hao, Yong Liao, Lechao Cheng, Pengyuan Zhou
First, we segment the objects and the background in a multi-object image.
1 code implementation • 4 Oct 2023 • Rui Yang, Edison Marrese-Taylor, Yuhe Ke, Lechao Cheng, Qingyu Chen, Irene Li
Our research demonstrates the effectiveness of using UMLS-augmented LLMs and highlights the potential application value of LLMs in medical question-answering.
1 code implementation • 27 Sep 2023 • Linxin Song, Jieyu Zhang, Lechao Cheng, Pengyuan Zhou, Tianyi Zhou, Irene Li
Recent developments in large language models (LLMs) have shown promise in enhancing the capabilities of natural language processing (NLP).
no code implementations • 15 Jun 2023 • Wei Zhang, Wong Kam-Kwai, Yitian Chen, Ailing Jia, Luwei Wang, Jian-Wei Zhang, Lechao Cheng, Huamin Qu, Wei Chen
The study of cultural artifact provenance, tracing ownership and preservation, holds significant importance in archaeology and art history.
1 code implementation • 26 May 2023 • Yuzhu Wang, Lechao Cheng, Manni Duan, Yongheng Wang, Zunlei Feng, Shu Kong
Finally, we propose a rather simple loss term (dubbed ND loss) to simultaneously (1) encourage the student to produce large-norm features, and (2) align the direction of student features and teacher class-means.
Ranked #1 on Knowledge Distillation on COCO 2017 val
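The two goals of the ND loss (large-norm student features, and directions aligned with the teacher class-means) can be illustrated with a small sketch. The formula below is one plausible formulation chosen for illustration: it penalizes a short or misaligned projection of each student feature onto its teacher class-mean direction. The listing does not give the exact loss, so the function name `nd_loss_sketch` and the specific expression are assumptions.

```python
import numpy as np


def nd_loss_sketch(student_feats, labels, teacher_class_means, eps=1e-12):
    """Hypothetical norm-and-direction loss.

    For each sample, project the student feature onto the unit vector of the
    teacher class-mean for its label, then compare the projection length with
    the class-mean norm. The loss shrinks when the feature points along the
    class-mean direction and when its norm grows.
    """
    feats = np.asarray(student_feats, dtype=float)             # (N, D)
    means = np.asarray(teacher_class_means, dtype=float)       # (C, D)
    centers = means[np.asarray(labels)]                        # (N, D)
    center_norm = np.linalg.norm(centers, axis=1)              # (N,)
    unit = centers / (center_norm[:, None] + eps)              # unit directions
    projection = np.sum(feats * unit, axis=1)                  # signed length
    return float(np.mean(1.0 - projection / (center_norm + eps)))


# Toy usage: two classes with 2-D teacher class-means.
class_means = [[1.0, 0.0], [0.0, 2.0]]
toy_labels = [0, 1]
loss_aligned = nd_loss_sketch([[1.0, 0.0], [0.0, 2.0]], toy_labels, class_means)
loss_larger_norm = nd_loss_sketch([[2.0, 0.0], [0.0, 4.0]], toy_labels, class_means)
loss_orthogonal = nd_loss_sketch([[0.0, 1.0], [1.0, 0.0]], toy_labels, class_means)
```

In this sketch, features that match the class-means give a loss near zero, features with a larger norm along the same direction give a lower loss, and orthogonal features give a loss of one.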
1 code implementation • 15 May 2023 • Fangwen Wu, Jingxuan He, Yufei Yin, Yanbin Hao, Gang Huang, Lechao Cheng
This study introduces an efficacious approach, Masked Collaborative Contrast (MCC), to highlight semantic regions in weakly supervised semantic segmentation.
Contrastive Learning
Weakly supervised Semantic Segmentation
no code implementations • 4 May 2023 • Jingxuan He, Lechao Cheng, Chaowei Fang, Dingwen Zhang, Zhangye Wang, Wei Chen
A surge of interest has emerged in weakly supervised semantic segmentation due to its remarkable efficiency in recent years.
Weakly supervised Semantic Segmentation
Weakly-Supervised Semantic Segmentation
no code implementations • 11 Apr 2023 • Jiawei Chen, Lin Chen, Jiang Yang, Tianqi Shi, Lechao Cheng, Zunlei Feng, Mingli Song
In this study, we tackle the patch slimming problem from a different perspective by proposing a life regression module that determines the lifespan of each image patch in one go.
no code implementations • 10 Apr 2023 • Lin Chen, Zhijie Jia, Tian Qiu, Lechao Cheng, Jie Lei, Zunlei Feng, Mingli Song
In this work, we propose a new paradigm dubbed Decision Stream Calibration that boosts the performance of general Vision Transformers.
1 code implementation • 9 Apr 2023 • Wenxiang Xu, Yongcheng Jing, Linyun Zhou, Wenqi Huang, Lechao Cheng, Zunlei Feng, Mingli Song
This is specifically achieved by devising an elaborate "prophetic" teacher, termed "Propheter", that aims to learn the potential class distributions.
1 code implementation • CVPR 2023 • Tianli Zhang, Mengqi Xue, Jiangtao Zhang, Haofei Zhang, Yu Wang, Lechao Cheng, Jie Song, Mingli Song
Most existing online knowledge distillation (OKD) techniques require sophisticated modules to produce diverse knowledge for improving students' generalization ability.
1 code implementation • 17 Feb 2023 • Zhijie Jia, Lin Chen, Kaiwen Hu, Lechao Cheng, Zunlei Feng, Mingli Song
Despite the remarkable progress in semantic segmentation tasks with the advancement of deep neural networks, typical U-shaped hierarchical segmentation networks still suffer from local misclassification of categories and inaccurate target boundaries.
1 code implementation • 14 Feb 2023 • Tian Qiu, Linyun Zhou, Wenxiang Xu, Lechao Cheng, Zunlei Feng, Mingli Song
Recently proposed DETR variants have made tremendous progress in various scenarios due to their streamlined processes and remarkable performance.
no code implementations • 3 Feb 2023 • Chaowei Fang, Dingwen Zhang, Wen Zheng, Xue Li, Le Yang, Lechao Cheng, Junwei Han
We set up novel evaluation benchmarks based on a series of testing sets with evolving distributions.
Ranked #66 on Long-tail Learning on CIFAR-100-LT (ρ=100)
no code implementations • CVPR 2023 • Hao Li, Dingwen Zhang, Nian Liu, Lechao Cheng, Yalun Dai, Chao Zhang, Xinggang Wang, Junwei Han
Inspired by the recent success of the Prompting technique, we introduce a new pre-training method that boosts QEIS models by giving Saliency Prompt for queries/kernels.
no code implementations • 15 Dec 2022 • Zerun Liu, Fan Zhang, Jingxuan He, Jin Wang, Zhangye Wang, Lechao Cheng
In the realm of multi-modality, text-guided image retouching techniques emerged with the advent of deep learning.
1 code implementation • 5 Dec 2022 • Hui Su, Yue Ye, Wei Hua, Lechao Cheng, Mingli Song
In this work, we propose a simple yet effective sparsely annotated semantic segmentation framework based on SegFormer, dubbed SASFormer, that achieves remarkable performance.
no code implementations • 2 Dec 2022 • Lechao Cheng, Chaowei Fang, Dingwen Zhang, Guanbin Li, Gang Huang
It can model the feature space more comprehensively and reduce the dominance of head classes.
no code implementations • 29 Nov 2022 • Huiyan Qi, Lechao Cheng, Jingjing Chen, Yue Yu, Xue Song, Zunlei Feng, Yu-Gang Jiang
Transfer learning aims to improve the performance of target tasks by transferring knowledge acquired in source tasks.
1 code implementation • 7 Sep 2022 • Haoling Li, Jie Song, Mengqi Xue, Haofei Zhang, Jingwen Ye, Lechao Cheng, Mingli Song
This survey aims to present a comprehensive review of NTs and attempts to identify how they enhance the model interpretability.
no code implementations • 1 Sep 2022 • Chaowei Fang, Lechao Cheng, Huiyan Qi, Dingwen Zhang
Most existing methods that cope with noisy labels assume that the class distributions are well balanced, and thus have insufficient capacity to deal with practical scenarios where training samples have imbalanced distributions.
1 code implementation • 22 Aug 2022 • Mengqi Xue, Qihan Huang, Haofei Zhang, Lechao Cheng, Jie Song, Minghui Wu, Mingli Song
The global prototypes are adopted to provide the global view of objects to guide local prototypes to concentrate on the foreground while eliminating the influence of the background.
no code implementations • 10 Aug 2022 • Yingzi Fan, Longfei Han, Yue Zhang, Lechao Cheng, Chen Xia, Di Hu
The domain discrepancy leads to performance degradation on target testing data for CNN models.
1 code implementation • 3 Aug 2022 • Hui Su, Yue Ye, Zhiwei Chen, Mingli Song, Lechao Cheng
Weakly supervised object localization is a challenging task which aims to localize objects with coarse annotations such as image categories.
1 code implementation • 12 Jul 2022 • Hao Zhang, Lechao Cheng, Yanbin Hao, Chong-Wah Ngo
By replacing a vanilla 2D attention with the LAPS, we could adapt a static transformer into a video one, with zero extra parameters and negligible computation overhead (~2.6%).
1 code implementation • 21 Jun 2022 • Xuanhan Wang, Jingkuan Song, Xiaojia Chen, Lechao Cheng, Lianli Gao, Heng Tao Shen
In this article, we propose a Knowledge Embedded RCNN (KE-RCNN) to identify attributes by leveraging rich knowledge, including implicit knowledge (e.g., the attribute "above-the-hip" for a shirt requires visual/geometry relations of shirt-hip) and explicit knowledge (e.g., the part of "shorts" cannot have the attribute of "hoodie" or "lining").
no code implementations • 29 Mar 2022 • Chaowei Fang, Dingwen Zhang, Liang Wang, Yulun Zhang, Lechao Cheng, Junwei Han
Improving the resolution of magnetic resonance (MR) image data is critical to computer-aided diagnosis and brain function analysis.
no code implementations • 17 Dec 2021 • Dingwen Zhang, Wenyuan Zeng, Guangyu Guo, Chaowei Fang, Lechao Cheng, Ming-Ming Cheng, Junwei Han
Current weakly supervised semantic segmentation (WSSS) frameworks usually contain the separated mask-refinement model and the main semantic region mining model.
Knowledge Distillation
Weakly supervised Semantic Segmentation
1 code implementation • 1 Aug 2021 • Zunlei Feng, Lechao Cheng, Xinchao Wang, Xiang Wang, Yajie Liu, Xiangtong Du, Mingli Song
To this end, we propose a Translation Segmentation Network (Trans-Net), which comprises a segmentation network and two boundary discriminators.
1 code implementation • 1 Aug 2021 • Zunlei Feng, Zhonghua Wang, Xinchao Wang, Xiuming Zhang, Lechao Cheng, Jie Lei, Yuexuan Wang, Mingli Song
Diagnosing MVI requires finding the vessels that contain hepatocellular carcinoma cells and counting the cells in each vessel, a process that depends heavily on the doctor's experience and is largely subjective and time-consuming.
no code implementations • 1 Aug 2021 • Lechao Cheng, Zunlei Feng, Xinchao Wang, Ya Jie Liu, Jie Lei, Mingli Song
In this paper, we introduce a novel Reference semantic segmentation Network (Ref-Net) to conduct visual boundary knowledge translation.
no code implementations • CVPR 2018 • Lechao Cheng, Chengyi Zhang, Zicheng Liao
We introduce a new network structure for decomposing an image into its intrinsic albedo and shading.