no code implementations • 30 Dec 2024 • Yuhe Ding, Bo Jiang, Aihua Zheng, Qin Xu, Jian Liang
VEGA is motivated by the pretraining paradigm of VLMs, which aligns features with the same semantics from the visual and textual modalities, thereby mapping both modalities into a shared representation space.
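The alignment idea described above can be illustrated with a minimal PyTorch sketch of CLIP-style symmetric contrastive alignment between paired image and text features; this is a generic stand-in rather than the VEGA implementation, and the temperature value is an arbitrary assumption.

```python
import torch
import torch.nn.functional as F

def contrastive_alignment_loss(img_feats, txt_feats, temperature=0.07):
    """Symmetric InfoNCE loss that pulls paired image/text features together
    in a shared embedding space (generic VLM-style alignment, not VEGA itself)."""
    img = F.normalize(img_feats, dim=-1)
    txt = F.normalize(txt_feats, dim=-1)
    logits = img @ txt.t() / temperature              # (B, B) cosine similarities
    targets = torch.arange(img.size(0), device=img.device)
    loss_i2t = F.cross_entropy(logits, targets)       # image -> text direction
    loss_t2i = F.cross_entropy(logits.t(), targets)   # text -> image direction
    return 0.5 * (loss_i2t + loss_t2i)
```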
1 code implementation • 14 Dec 2024 • Yuhao Wang, Xuehu Liu, Tianyu Yan, Yang Liu, Aihua Zheng, Pingping Zhang, Huchuan Lu
Furthermore, current multi-modal aggregation methods have obvious limitations in dealing with long sequences from different modalities.
1 code implementation • 14 Dec 2024 • Yuhao Wang, Yang Liu, Aihua Zheng, Pingping Zhang
To address these issues, we propose a novel feature learning framework called DeMo for multi-modal object ReID, which adaptively balances decoupled features using a mixture of experts.
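A minimal PyTorch sketch of the mixture-of-experts idea named above: one expert per modality plus a gating network that adaptively weights the decoupled modality features. The module layout, dimensions, and linear experts are illustrative assumptions, not the DeMo architecture.

```python
import torch
import torch.nn as nn

class ModalityMoE(nn.Module):
    """Toy mixture-of-experts fusion: one expert per modality, with a gate
    that adaptively weights the decoupled modality features (illustrative only)."""
    def __init__(self, dim, num_modalities=3):
        super().__init__()
        self.experts = nn.ModuleList(
            [nn.Linear(dim, dim) for _ in range(num_modalities)])
        self.gate = nn.Linear(dim * num_modalities, num_modalities)

    def forward(self, feats):                          # feats: list of (B, dim) tensors
        expert_out = torch.stack(
            [e(f) for e, f in zip(self.experts, feats)], dim=1)       # (B, M, dim)
        weights = torch.softmax(self.gate(torch.cat(feats, dim=-1)), dim=-1)  # (B, M)
        return (weights.unsqueeze(-1) * expert_out).sum(dim=1)        # fused (B, dim)
```

In this toy version the gate sees all modality features concatenated; attention-based or token-level gating would be a natural extension.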
no code implementations • 18 Sep 2024 • Shuo Lu, YingSheng Wang, Lijun Sheng, Aihua Zheng, Lingxiao He, Jian Liang
Out-of-distribution (OOD) detection aims to detect test samples outside the training category space, which is an essential component in building reliable machine learning systems.
Out-of-Distribution Detection
no code implementations • 23 Feb 2024 • Yuhe Ding, Bo Jiang, Aijing Yu, Aihua Zheng, Jian Liang
In this survey, we present the first review of existing advances in this area and categorize them into two separate realms: source-free model transferability estimation and source-dependent model transferability estimation.
1 code implementation • CVPR 2024 • Hongchao Li, Jingong Chen, Aihua Zheng, Yong Wu, Yonglong Luo
In this work, we address the Day-Night cross-domain ReID (DN-ReID) problem and provide a new cross-domain dataset named DN-Wild, which includes day and night images of 2,286 identities, with 85,945 daytime images and 54,952 nighttime images in total.
no code implementations • 9 Oct 2023 • Yuhe Ding, Bo Jiang, Lijun Sheng, Aihua Zheng, Jian Liang
Transferability estimation aims to provide heuristics for quantifying how suitable pre-trained models are for a specific downstream task, without fine-tuning them all.
no code implementations • 25 May 2023 • Aihua Zheng, Chaobin Zhang, Weijun Zhang, Chenglong Li, Jin Tang, Chang Tan, Ruoran Jia
Existing vehicle re-identification methods mainly rely on a single query, which provides limited information for vehicle representation and thus significantly hinders vehicle Re-ID performance in complicated surveillance networks.
no code implementations • 25 May 2023 • Aihua Zheng, Ziling He, Zi Wang, Chenglong Li, Jin Tang
Many existing multi-modality studies are based on the assumption of modality integrity.
1 code implementation • 23 May 2023 • Aihua Zheng, Zhiqi Ma, Zi Wang, Chenglong Li
Finally, to evaluate the proposed FACENet in handling intense flare, we introduce a new multi-spectral vehicle re-ID dataset, called WMVEID863, with additional challenges such as motion blur, significant background changes, and particularly intense flare degradation.
1 code implementation • 17 Mar 2023 • Yuhe Ding, Jian Liang, Jie Cao, Aihua Zheng, Ran He
Briefly, MODIFY first trains a generative model in the target domain and then translates a source input to the target domain via the provided style model.
1 code implementation • 9 Feb 2023 • Yuhe Ding, Jian Liang, Bo Jiang, Aihua Zheng, Ran He
Existing cross-domain keypoint detection methods always require accessing the source data during adaptation, which may violate the data privacy law and pose serious security concerns.
1 code implementation • 11 Oct 2022 • Zi Wang, Huaibo Huang, Aihua Zheng, Chenglong Li, Ran He
To alleviate these two issues, we propose a simple yet effective method with Parallel Augmentation and Dual Enhancement (PADE), which is robust on both occluded and non-occluded data and does not require any auxiliary clues.
1 code implementation • 1 Aug 2022 • Aihua Zheng, Xianpeng Zhu, Zhiqi Ma, Chenglong Li, Jin Tang, Jixin Ma
In particular, we design a new cross-directional center loss that pulls the modality centers of each identity close together to mitigate the cross-modality discrepancy, and the sample centers of each identity close together to alleviate the sample discrepancy.
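A minimal sketch of one plausible reading of such a center loss, assuming two modalities coded 0 and 1; the squared-distance form and averaging over identities are assumptions, not the paper's exact formulation.

```python
import torch

def cross_modality_center_loss(feats, labels, modalities):
    """Pulls the two modality centers of each identity together.
    `modalities` is assumed to code the two modalities as 0 and 1."""
    loss, count = feats.new_zeros(()), 0
    for pid in labels.unique():
        mask_a = (labels == pid) & (modalities == 0)
        mask_b = (labels == pid) & (modalities == 1)
        if mask_a.any() and mask_b.any():
            center_a = feats[mask_a].mean(dim=0)       # identity center in modality 0
            center_b = feats[mask_b].mean(dim=0)       # identity center in modality 1
            loss = loss + (center_a - center_b).pow(2).sum()
            count += 1
    return loss / max(count, 1)
```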
no code implementations • 2 Jun 2022 • Chenglong Li, Xiaobin Yang, Guohao Wang, Aihua Zheng, Chang Tan, Ruoran Jia, Jin Tang
License plate recognition plays a critical role in many practical applications, but license plates of large vehicles are difficult to recognize due to factors such as low resolution, contamination, low illumination, and occlusion.
2 code implementations • 29 May 2022 • Yuhe Ding, Lijun Sheng, Jian Liang, Aihua Zheng, Ran He
First, to avoid introducing additional parameters and to exploit the information in the source model, ProxyMix defines the classifier weights as class prototypes and then constructs a class-balanced proxy source domain from the nearest neighbors of the prototypes, bridging the unseen source domain and the target domain.
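A short sketch of the proxy-source construction step described above, under the assumption that the frozen classifier weights serve as class prototypes and cosine similarity selects the nearest target samples per class; `per_class` and the normalization are illustrative choices.

```python
import torch
import torch.nn.functional as F

def build_proxy_source(classifier_weight, target_feats, per_class=5):
    """Treat classifier weights as class prototypes and pick, for each class,
    the target samples nearest to its prototype, giving a class-balanced,
    pseudo-labeled proxy source set (hyper-parameters are assumptions)."""
    protos = F.normalize(classifier_weight, dim=1)         # (C, dim) class prototypes
    feats = F.normalize(target_feats, dim=1)               # (N, dim) target features
    sims = feats @ protos.t()                              # (N, C) cosine similarities
    idx_list, label_list = [], []
    for c in range(protos.size(0)):
        topk = sims[:, c].topk(per_class).indices          # nearest targets to prototype c
        idx_list.append(topk)
        label_list.append(torch.full((per_class,), c, dtype=torch.long))
    return torch.cat(idx_list), torch.cat(label_list)      # sample indices, pseudo labels
```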
1 code implementation • IEEE Transactions on Multimedia 2021 • Aihua Zheng, Menglan Hu, Bo Jiang, Yan Huang, Yan Yan, Bin Luo
AML aims to generate a modality-independent representation for each person in each modality via adversarial learning, while simultaneously learning a robust similarity measure for cross-modality matching via metric learning.
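A hedged sketch of how an adversarial term (a modality discriminator behind a gradient-reversal layer) can be combined with a triplet metric term; the margin, equal weighting, and `modality_clf` head are assumptions, not the authors' AML.

```python
import torch
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Gradient reversal layer: identity in the forward pass, negated gradient backward."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

def adversarial_metric_losses(anchor, positive, negative, feats,
                              modality_clf, modality_labels, lam=1.0, margin=0.3):
    """Adversarial modality-confusion term plus triplet metric term (illustrative)."""
    adv = F.cross_entropy(modality_clf(GradReverse.apply(feats, lam)), modality_labels)
    metric = F.triplet_margin_loss(anchor, positive, negative, margin=margin)
    return adv + metric
```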
no code implementations • 18 Nov 2020 • Aihua Zheng, Xia Sun, Chenglong Li, Jin Tang
Comprehensive experiments against state-of-the-art methods on two multi-viewpoint benchmark datasets, VeRi and VeRi-Wild, validate the promising performance of the proposed method on unsupervised vehicle Re-ID, both with and without domain adaptation.
no code implementations • 10 Nov 2020 • Yuhe Ding, Xin Ma, Mandi Luo, Aihua Zheng, Ran He
Considering the intuitive artifacts in existing methods, we propose a contrastive style loss for style rendering that enforces similarity between the style of the rendered image and that of the caricature, while simultaneously enlarging its discrepancy from the photos.
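One way to make a style loss contrastive is a margin form over Gram-matrix style statistics, as sketched below; the Gram-matrix representation and the margin form are assumptions rather than the paper's formulation.

```python
import torch

def gram(feat):
    """Gram matrix of a (B, C, H, W) feature map, a common style statistic."""
    b, c, h, w = feat.shape
    f = feat.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def contrastive_style_loss(rendered_feat, caricature_feat, photo_feat, margin=1.0):
    """Pull the rendered image's style toward the caricature's and push it away
    from the photo's (illustrative margin-based form)."""
    d_pos = (gram(rendered_feat) - gram(caricature_feat)).pow(2).mean()
    d_neg = (gram(rendered_feat) - gram(photo_feat)).pow(2).mean()
    return torch.relu(d_pos - d_neg + margin)
```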
no code implementations • 5 Nov 2020 • Hao Zhu, Yi Li, Feixia Zhu, Aihua Zheng, Ran He
We propose a new task named Audio-driven Performance Video Generation (APVG), which aims to synthesize the video of a person playing a certain instrument guided by a given music audio clip.
1 code implementation • 7 Jul 2020 • Bo Jiang, Sheng Wang, Xiao Wang, Aihua Zheng
Specifically, STADB first obtains an attention map by channel-wise pooling and returns a drop mask by thresholding the attention map.
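A minimal sketch of that attention-based dropping step: channel-wise pooling gives a spatial attention map, and thresholding its strongest responses yields a binary drop mask. The top-k threshold rule and drop ratio are assumptions, not the paper's exact settings.

```python
import torch

def attention_drop_mask(feat_map, drop_ratio=0.2):
    """Drop the most salient spatial cells of a (B, C, H, W) feature map,
    where saliency comes from channel-wise average pooling (illustrative)."""
    attn = feat_map.mean(dim=1)                        # (B, H, W) channel-wise pooling
    b, h, w = attn.shape
    flat = attn.view(b, -1)
    k = max(1, int(drop_ratio * h * w))
    thresh = flat.topk(k, dim=1).values[:, -1:]        # per-sample k-th largest response
    mask = (flat < thresh).float().view(b, 1, h, w)    # zero out the top responses
    return feat_map * mask
```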
no code implementations • 14 Jan 2020 • Hao Zhu, Mandi Luo, Rui Wang, Aihua Zheng, Ran He
Audio-visual learning, aimed at exploiting the relationship between the audio and visual modalities, has drawn considerable attention since the successful adoption of deep learning.
no code implementations • 17 Jul 2019 • Chenglong Li, Andong Lu, Aihua Zheng, Zhengzheng Tu, Jin Tang
Specifically, the generality adapter extracts shared object representations, the modality adapter encodes modality-specific information to exploit their complementary advantages, and the instance adapter models the appearance properties and temporal variations of a certain object.
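A toy PyTorch sketch of the three-adapter decomposition, with the adapter outputs summed onto a shared feature; the layer types, modality names, and fusion by addition are illustrative assumptions, not the authors' tracker.

```python
import torch
import torch.nn as nn

class ThreeAdapterBlock(nn.Module):
    """Toy generality / modality / instance adapters added onto one backbone feature."""
    def __init__(self, channels):
        super().__init__()
        self.generality = nn.Conv2d(channels, channels, 3, padding=1)  # shared representation
        self.modality = nn.ModuleDict({
            'rgb': nn.Conv2d(channels, channels, 1),                   # modality-specific cues
            'thermal': nn.Conv2d(channels, channels, 1),
        })
        self.instance = nn.Conv2d(channels, channels, 1)               # tuned online per target

    def forward(self, x, modality='rgb'):
        return self.generality(x) + self.modality[modality](x) + self.instance(x)
```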
no code implementations • 22 May 2019 • Hongchao Li, Xianmin Lin, Aihua Zheng, Chenglong Li, Bin Luo, Ran He, Amir Hussain
In particular, our network is trained end-to-end and contains three subnetworks of deep features embedded by the corresponding attributes (i.e., camera view, vehicle type, and vehicle color).
1 code implementation • 22 Jan 2019 • Xiao Wang, Shaofei Zheng, Rui Yang, Aihua Zheng, Zhe Chen, Jin Tang, Bin Luo
We also review some popular network architectures which have been widely applied in the deep learning community.
no code implementations • 17 Dec 2018 • Hao Zhu, Huaibo Huang, Yi Li, Aihua Zheng, Ran He
Talking face generation aims to synthesize a face video with precise lip synchronization as well as a smooth transition of facial motion over the entire video via the given speech clip and facial image.
1 code implementation • 11 Jan 2017 • Chenglong Li, Guizhao Wang, Yunpeng Ma, Aihua Zheng, Bin Luo, Jin Tang
In particular, we introduce a weight for each modality to describe the reliability, and integrate them into the graph-based manifold ranking algorithm to achieve adaptive fusion of different source data.
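A small NumPy sketch of reliability-weighted manifold ranking: per-modality affinity matrices are fused with modality weights and fed into the standard closed-form ranking f = (I - alpha*S)^(-1) y. Fusion by weighted sum is an assumption about how the weights enter the algorithm.

```python
import numpy as np

def weighted_manifold_ranking(affinities, weights, seed_vector, alpha=0.99):
    """Fuse per-modality affinity matrices with reliability weights, then run
    graph-based manifold ranking in closed form (illustrative sketch)."""
    W = sum(w * A for w, A in zip(weights, affinities))     # fused (N, N) affinity
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    S = D_inv_sqrt @ W @ D_inv_sqrt                          # symmetric normalization
    n = W.shape[0]
    return np.linalg.solve(np.eye(n) - alpha * S, seed_vector)  # ranking scores
```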