1 code implementation • 27 Apr 2025 • Huiling Zheng, Xian Zhong, Bin Liu, Yi Xiao, Bihan Wen, Xiaofeng Li
The fusion of Synthetic Aperture Radar (SAR) and RGB imagery for land cover classification remains challenging due to modality heterogeneity and the underutilization of spectral complementarity.
no code implementations • 4 Apr 2025 • Quanxing Xu, Ling Zhou, Xian Zhong, Feifei Zhang, Rubing Huang, Chia-Wen Lin
Furthermore, to validate our concept of reducing output errors through filtering unrelated question-image inputs, we propose a specialized metric to evaluate the performance of the ISI module.
no code implementations • 31 Mar 2025 • Chenyang Li, Wenxuan Liu, Guoqiang Gong, Xiaobo Ding, Xian Zhong
Underwater object detection is critical for oceanic research and industrial safety inspections.
no code implementations • 26 Mar 2025 • Hanwen Liang, Xian Zhong, Wenxuan Liu, Yajing Zheng, Wenxin Huang, Zhaofei Yu, Tiejun Huang
Restoring clear frames from rainy videos presents a significant challenge due to the rapid motion of rain streaks.
no code implementations • 23 Mar 2025 • Fei Li, Wenxuan Liu, Jingjing Chen, Ruixu Zhang, Yuran Wang, Xian Zhong, Zheng Wang
Open Vocabulary Video Anomaly Detection (OVVAD) seeks to detect and classify both base and novel anomalies.
no code implementations • 4 Mar 2025 • Tianqing Zhang, Kairong Yu, Xian Zhong, Hongwei Wang, Qi Xu, Qiang Zhang
The framework demonstrates exceptional performance across diverse datasets and exhibits strong generalization capabilities.
1 code implementation • 25 Feb 2025 • Tianmi Ma, Jiawei Du, Wenxin Huang, Wenjie Wang, Liang Xie, Xian Zhong, Joey Tianyi Zhou
Recent advancements in large language models (LLMs) have significantly improved performance in natural language processing tasks.
no code implementations • 15 Feb 2025 • Huilin Zhu, Jingling Yuan, Zhengwei Yang, Yu Guo, Xian Zhong, Shengfeng He
In class-agnostic object counting, the goal is to estimate the total number of object instances in an image without distinguishing between specific categories.
no code implementations • 2 Dec 2024 • Xiyu Han, Xian Zhong, Wenxin Huang, Xuemei Jia, Wenxuan Liu, Xiaohan Yu, Alex ChiChung Kot
In this paper, we propose a novel prompt learning framework, Semantic Contextual Integration (SCI), for CC-ReID, which leverages the visual-text representation capabilities of CLIP to minimize the impact of clothing changes and enhance ID-relevant features.
1 code implementation • 24 Nov 2024 • Guanyu Zhou, Wenxuan Liu, Wenxin Huang, Xuemei Jia, Xian Zhong, Chia-Wen Lin
We anticipate that the challenges posed by OccludeNet will stimulate further exploration of causal relations in occlusion scenarios and encourage a reevaluation of class correlations, ultimately promoting sustainable performance improvements.
1 code implementation • 19 Sep 2024 • Xian Zhong, Shengwang Hu, Wenxuan Liu, Wenxin Huang, Jianhao Ding, Zhaofei Yu, Tiejun Huang
In this paper, we propose the Hybrid Step-wise Distillation (HSD) method, tailored for neuromorphic datasets, to mitigate the notable decline in performance at lower time steps.
no code implementations • 7 Aug 2024 • Xian Zhong, Zohaib Salahuddin, Yi Chen, Henry C Woodruff, Haiyi Long, Jianyun Peng, Nuwan Udawatte, Roberto Casale, Ayoub Mokhtari, Xiaoer Zhang, Jiayao Huang, Qingyu Wu, Li Tan, Lili Chen, Dongming Li, Xiaoyan Xie, Manxia Lin, Philippe Lambin
This framework includes qualitative and quantitative assessments of explanations against recognized biomarkers, usability evaluations, and an in silico clinical trial.
no code implementations • 24 Jul 2024 • Yi Lei, Huilin Zhu, Jingling Yuan, Guangli Xiang, Xian Zhong, Shengfeng He
Drone-based crowd tracking faces difficulties in accurately identifying and monitoring objects from an aerial perspective, largely due to their small size and close proximity to each other, which complicates both localization and tracking.
1 code implementation • 6 Jul 2024 • Huilin Zhu, Jingling Yuan, Zhengwei Yang, Yu Guo, Zheng Wang, Xian Zhong, Shengfeng He
Zero-shot object counting (ZOC) aims to enumerate objects in images using only the names of object classes during testing, without the need for manual annotations.
1 code implementation • 10 Aug 2023 • Huilin Zhu, Jingling Yuan, Xian Zhong, Zhengwei Yang, Zheng Wang, Shengfeng He
Domain adaptation is commonly employed in crowd counting to bridge the domain gaps between different datasets.
1 code implementation • CVPR 2023 • Zhengwei Yang, Meng Lin, Xian Zhong, Yu Wu, Zheng Wang
Entangled representations of clothing and identity (ID)-intrinsic clues are potentially concomitant in conventional person Re-IDentification (ReID).
1 code implementation • 28 Nov 2022 • Xian Zhong, Zipeng Li, Shuqin Chen, Kui Jiang, Chen Chen, Mang Ye
In this paper, we introduce a novel Refined Semantic enhancement method towards Frequency Diffusion (RSFD), a captioning model that constantly perceives the linguistic representation of the infrequent tokens.
no code implementations • 16 Oct 2021 • Zhixin Sun, Xian Zhong, Shuqin Chen, Lin Li, Luo Zhong
Video captioning is a challenging task: it must capture different visual parts and describe them in sentences, which requires visual and linguistic coherence.
1 code implementation • 21 Jul 2020 • Xian Zhong, Cheng Gu, Wenxin Huang, Lin Li, Shuqin Chen, Chia-Wen Lin
As a result, a meta-learner cannot be trained well in a high-dimensional parameter space to generalize to new tasks.
Ranked #18 on Few-Shot Image Classification on FC100 5-way (5-shot)
no code implementations • 4 Oct 2018 • Zhongwei Xie, Lin Li, Xian Zhong, Luo Zhong
In this paper, we propose an end-to-end neural network framework for image-to-video person re-identification by leveraging cross-modal embeddings learned from extra information. Concretely, cross-modal embeddings from image captioning and video captioning models are reused to help project the learned features into a coordinated space, where similarity can be directly computed.
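The idea of comparing features in a coordinated space can be illustrated with a minimal sketch. It is not the authors' architecture: the function names (`project`, `cosine`, `rank_gallery`) and the projection matrices `w_img` and `w_vid` are hypothetical, and the matrices are fixed placeholders rather than learned weights. The sketch only shows that once image and video features are mapped into the same space, a gallery of videos can be ranked against a query image by cosine similarity.

```python
import math

def project(matrix, vec):
    """Apply a linear projection (one output value per row of `matrix`)."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank_gallery(query_img_feat, gallery_vid_feats, w_img, w_vid):
    """Rank gallery videos by similarity to the query image.

    Both modalities are projected into the same (hypothetical) shared space
    before comparison. Returns (gallery_index, score) pairs, best match first.
    """
    q = project(w_img, query_img_feat)
    scores = [
        (i, cosine(q, project(w_vid, v)))
        for i, v in enumerate(gallery_vid_feats)
    ]
    return sorted(scores, key=lambda s: s[1], reverse=True)

# Toy data: 3-d image features and 2-d video features mapped to a 2-d space.
w_img = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]   # placeholder, not learned
w_vid = [[1.0, 0.0], [0.0, 1.0]]             # placeholder, not learned
ranking = rank_gallery([1.0, 0.0, 5.0], [[0.0, 1.0], [2.0, 0.0]], w_img, w_vid)
```

In a trained model the two projections would be learned jointly, so that features of the same person land close together regardless of modality.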