1 code implementation • 12 Mar 2025 • Xiuwen Fang, Mang Ye, Bo Du
This paper studies a challenging robust federated learning task with model-heterogeneous and data-corrupted clients, where the clients have different local model structures.
no code implementations • 11 Mar 2025 • Zitong Shi, Guancheng Wan, Wenke Huang, Guibin Zhang, Jiawei Shao, Mang Ye, Carl Yang
LLM-based Multi-Agent Systems (MAS) have proven highly effective in solving complex problems by integrating multiple agents, each performing different roles.
no code implementations • 6 Mar 2025 • Bin Wu, Wuxuan Shi, Jinqiao Wang, Mang Ye
Pre-trained Vision-Language Models (VLMs) require Continual Learning (CL) to efficiently update their knowledge and adapt to various downstream tasks without retraining from scratch.
1 code implementation • 14 Feb 2025 • Mang Ye, Xuankun Rong, Wenke Huang, Bo Du, Nenghai Yu, DaCheng Tao
With the rapid advancement of Large Vision-Language Models (LVLMs), ensuring their safety has emerged as a crucial area of research.
2 code implementations • 17 Nov 2024 • Wenke Huang, Jian Liang, Zekun Shi, Didi Zhu, Guancheng Wan, He Li, Bo Du, DaCheng Tao, Mang Ye
To balance the trade-off between generalization and specialization, we propose measuring the parameter importance for both pre-trained and fine-tuning distributions, based on frozen pre-trained weight magnitude and accumulated fine-tuning gradient values.
1 code implementation • 26 Oct 2024 • Zihan Tan, Guancheng Wan, Wenke Huang, Mang Ye
Personalized Federated Graph Learning (pFGL) facilitates the decentralized training of Graph Neural Networks (GNNs) without compromising privacy while accommodating personalized requirements for non-IID participants.
no code implementations • 9 Oct 2024 • Chenyue Li, Shuoyi Chen, Mang Ye
Wildlife ReID involves utilizing visual technology to identify specific individuals of wild animals in different scenarios, holding significant importance for wildlife conservation, ecological research, and environmental monitoring.
1 code implementation • 27 Jun 2024 • Wenke Huang, Guancheng Wan, Mang Ye, Bo Du
First, for node-level semantics, we find that contrasting nodes from distinct classes is beneficial for providing well-performing discrimination.
1 code implementation • 24 Jun 2024 • Qu Yang, Mang Ye, Bo Du
Experimental results demonstrate that EmoLLM significantly elevates multimodal emotional understanding performance, with an average improvement of 12.1% across multiple foundation models on EmoBench.
no code implementations • 9 Jun 2024 • Tangfei Liao, Xiaoqin Zhang, Guobao Xiao, Min Li, Tao Wang, Mang Ye
To tackle these challenges, we propose a pre-training method to acquire a generic inliers-consistent representation by reconstructing masked correspondences, providing a strong initial representation for downstream tasks.
1 code implementation • CVPR 2024 • Yuhang Chen, Wenke Huang, Mang Ye
It leads to a biased model convergence objective and distinct performance among domains.
1 code implementation • 25 May 2024 • Mang Ye, Wei Shen, Bo Du, Eduard Snezhko, Vassili Kovalev, Pong C. Yuen
Vertical Federated Learning (VFL) is a privacy-preserving distributed learning paradigm where different parties collaboratively learn models using partitioned features of shared samples, without leaking private data.
no code implementations • CVPR 2024 • He Li, Mang Ye, Ming Zhang, Bo Du
In Re-identification (ReID), recent advancements yield noteworthy progress in both unimodal and cross-modal retrieval tasks.
no code implementations • 28 Apr 2024 • Daming Gao, Yang Bai, Min Cao, Hao Dou, Mang Ye, Min Zhang
Text-based person search (TBPS) aims to retrieve images of a specific person from a large image gallery based on a natural language description.
1 code implementation • 13 Jan 2024 • Mang Ye, Shuoyi Chen, Chenyue Li, Wei-Shi Zheng, David Crandall, Bo Du
Object Re-identification (Re-ID) aims to identify specific objects across different times and scenes, which is a widely researched task in computer vision.
no code implementations • CVPR 2024 • Kaili Sun, Zhiwen Xie, Mang Ye, Huyin Zhang
Multimodal intent recognition (MIR) aims to perceive the human intent polarity via language, visual, and acoustic modalities.
2 code implementations • CVPR 2024 • Xiyuan Yang, Wenke Huang, Mang Ye
In PFL, clients update their shared parameters to communicate and learn from others while keeping personalized parts unchanged, leading to poor coordination between these two components.
1 code implementation • CVPR 2024 • Bin Yang, Jun Chen, Mang Ye
Unsupervised visible-infrared person re-identification (US-VI-ReID) centers on learning a cross-modality retrieval model without labels, reducing the reliance on expensive cross-modality manual annotation.
2 code implementations • 10 Dec 2023 • Xu Zhang, Hao Li, Mang Ye
Since clean samples are more easily distinguished by GMM with increasing noise, the memory bank can still maintain high quality at a high noise ratio.
Cross-modal retrieval with noisy correspondence
Image-text matching
1 code implementation • 12 Nov 2023 • Wenke Huang, Mang Ye, Zekun Shi, Guancheng Wan, He Li, Bo Du, Qiang Yang
In this survey, we provide a systematic overview of the important and recent developments of research on federated learning.
3 code implementations • ACM Multimedia 2022 • Shuoyi Chen, Mang Ye, Bo Du
Existing methods are usually designed for city cameras, incapable of handling the rotation issue in UAV scenarios.
2 code implementations • 28 Sep 2023 • Wenke Huang, Mang Ye, Zekun Shi, Bo Du
Federated learning is an important privacy-preserving multi-party learning paradigm, involving collaborative learning with others and local updating on private data.
no code implementations • 20 Aug 2023 • Yunlu Yan, Chun-Mei Feng, Mang Ye, WangMeng Zuo, Ping Li, Rick Siow Mong Goh, Lei Zhu, C. L. Philip Chen
Concretely, FedCSD introduces a class prototype similarity distillation to align the local logits with the refined global logits that are weighted by the similarity between local logits and the global prototype.
1 code implementation • 19 Aug 2023 • Min Cao, Yang Bai, Ziyin Zeng, Mang Ye, Min Zhang
TBPS, as a fine-grained cross-modal retrieval task, is also facing the rise of research on CLIP-based TBPS.
Ranked #6 on Text based Person Retrieval on RSTPReid
2 code implementations • 20 Jul 2023 • Mang Ye, Xiuwen Fang, Bo Du, Pong C. Yuen, DaCheng Tao
Therefore, a systematic survey on this topic about the research challenges and state-of-the-art is essential.
1 code implementation • 1 Jun 2023 • Wuxuan Shi, Mang Ye, Bo Du
(2) For the cross-modality gap, we propose a novel Symmetric Uncertainty scheme to remove parts of RGB information harmful to the recovery of HR depth maps.
1 code implementation • CVPR 2023 • Ding Jiang, Mang Ye
To alleviate these issues, we present IRRA: a cross-modal Implicit Relation Reasoning and Aligning framework that learns relations between local visual-textual tokens and enhances global image-text matching without requiring additional prior supervision.
Ranked #3 on Text-based Person Retrieval with Noisy Correspondence on RSTPReid (using extra training data)
1 code implementation • CVPR 2023 • Cuiqun Chen, Mang Ye, Ding Jiang
Person re-identification (ReID) with descriptive query (text or sketch) provides an important supplement for general image-image paradigms, which is usually studied in a single cross-modality matching manner, e.g., text-to-image or sketch-to-photo.
1 code implementation • ICCV 2023 • Bin Yang, Jun Chen, Mang Ye
The grand unified representation lies in two aspects: 1) GUR adopts a bottom-up domain learning strategy with a cross-memory association embedding module to explore the information of hierarchical domains, i.e., intra-camera, inter-camera, and inter-modality domains, learning a unified and robust representation against hierarchical discrepancy.
no code implementations • ICCV 2023 • Wuxuan Shi, Mang Ye
However, since the model continuously learns new knowledge, the stored prototypical representations cannot correctly model the properties of old classes in the presence of knowledge updates.
2 code implementations • CVPR 2023 • Wenke Huang, Mang Ye, Zekun Shi, He Li, Bo Du
The private model presents degenerative performance on other domains (with domain shift).
1 code implementation • ICCV 2023 • Xiuwen Fang, Mang Ye, Xiyuan Yang
Model heterogeneous federated learning is a realistic and challenging problem.
1 code implementation • CVPR 2023 • Zesen Wu, Mang Ye
In response, we devise a Progressive Graph Matching method to globally mine cross-modality correspondences under cluster imbalance scenarios.
1 code implementation • 28 Nov 2022 • Xian Zhong, Zipeng Li, Shuqin Chen, Kui Jiang, Chen Chen, Mang Ye
In this paper, we introduce a novel Refined Semantic enhancement method towards Frequency Diffusion (RSFD), a captioning model that constantly perceives the linguistic representation of the infrequent tokens.
1 code implementation • ACM MM 2022 • Bin Yang, Mang Ye, Jun Chen, Zesen Wu
Visible infrared person re-identification (VI-ReID) aims at searching out the corresponding infrared (visible) images from a gallery set captured by other spectrum cameras.
2 code implementations • Proceedings of the 30th ACM International Conference on Multimedia 2022 • He Li, Mang Ye, Cong Wang, Bo Du
Robust and discriminative feature extraction is the key component in person re-identification (Re-ID).
4 code implementations • Proceedings of the 30th ACM International Conference on Multimedia 2022 • Wenke Huang, Mang Ye, Bo Du, Xiang Gao
To address these issues, this paper presents a novel framework with two main parts: 1) model agnostic federated learning, which performs public-private communication by unifying the model prediction outputs on the shared public datasets; 2) latent embedding adaptation, which addresses the domain gap with an adversarial learning scheme to discriminate the public and private domains.
1 code implementation • 24 Jul 2022 • Junwu Zhang, Mang Ye, Yao Yang
We further propose a progressive training strategy to improve the performance, which iteratively upgrades the initial anonymization supervision.
1 code implementation • CVPR 2022 • Wenke Huang, Mang Ye, Bo Du
Federated learning has emerged as an important distributed learning paradigm, which normally involves collaborative updating with others and local updating on private data.
5 code implementations • CVPR 2022 • Xiuwen Fang, Mang Ye
Model heterogeneous federated learning is a challenging task since each client independently designs its own model.
8 code implementations • 29 Nov 2021 • Eric Zhongcong Xu, Zeyang Song, Satoshi Tsutsui, Chao Feng, Mang Ye, Mike Zheng Shou
Audio-visual speaker diarization aims at detecting "who spoke when" using both auditory and visual signals.
no code implementations • 18 Aug 2021 • Haoran Peng, He Huang, Li Xu, Tianjiao Li, Jun Liu, Hossein Rahmani, Qiuhong Ke, Zhicheng Guo, Cong Wu, Rongchang Li, Mang Ye, Jiahao Wang, Jiaxu Zhang, Yuanzhong Liu, Tao He, Fuwei Zhang, Xianbin Liu, Tao Lin
In this paper, we introduce the Multi-Modal Video Reasoning and Analyzing Competition (MMVRAC) workshop in conjunction with ICCV 2021.
no code implementations • 5 May 2021 • Yongbiao Chen, Sheng Zhang, Fangxin Liu, Zhigang Chang, Mang Ye, Zhengwei Qi
Until now, the deep hashing for the image retrieval community has been dominated by convolutional neural network architectures, e.g., ResNet.
no code implementations • IEEE Transactions on Information Forensics and Security 2021 • Dongming Wu, Mang Ye, Gaojie Lin, Xin Gao, Jianbing Shen
In addition, we propose a novel multi-head collaborative training scheme to improve the performance, which is collaboratively supervised by multiple heads with the same structure but different parameters.
2 code implementations • ICCV 2021 • Mang Ye, Weijian Ruan, Bo Du, Mike Zheng Shou
This paper introduces a powerful channel augmented joint learning strategy for the visible-infrared recognition problem.
no code implementations • ICCV 2021 • Xin Hao, Sanyuan Zhao, Mang Ye, Jianbing Shen
Cross-modality person re-identification is a challenging task due to large cross-modality discrepancy and intra-modality variations.
no code implementations • 12 Dec 2020 • Can Zhang, Hong Liu, Wei Guo, Mang Ye
RGB-Infrared person re-identification (RGB-IR Re-ID) aims to match persons from heterogeneous images captured by visible and thermal cameras, which is of great significance in the surveillance system under poor light conditions.
5 code implementations • ECCV 2020 • Mang Ye, Jianbing Shen, David J. Crandall, Ling Shao, Jiebo Luo
In this paper, we propose a novel dynamic dual-attentive aggregation (DDAG) learning method by mining both intra-modality part-level and cross-modality graph-level contextual cues for VI-ReID.
1 code implementation • 22 Jun 2020 • Mang Ye, Jianbing Shen
Unsupervised embedding learning aims at extracting low-dimensional visually meaningful representations from large-scale unlabeled images, which can then be directly used for similarity-based search.
no code implementations • CVPR 2020 • Mang Ye, Jianbing Shen
Unsupervised embedding learning aims at extracting low-dimensional visually meaningful representations from large-scale unlabeled images, which can then be directly used for similarity-based search.
Ranked #56 on Image Classification on STL-10
7 code implementations • 13 Jan 2020 • Mang Ye, Jianbing Shen, Gaojie Lin, Tao Xiang, Ling Shao, Steven C. H. Hoi
The widely studied closed-world setting is usually applied under various research-oriented assumptions, and has achieved inspiring success using deep learning techniques on a number of datasets.
Ranked #1 on Cross-Modal Person Re-Identification on RegDB-C
1 code implementation • CVPR 2019 • Mang Ye, Xu Zhang, Pong C. Yuen, Shih-Fu Chang
This paper studies the unsupervised embedding learning problem, which requires an effective similarity measurement between samples in low-dimensional embedding space.
no code implementations • ECCV 2018 • Mang Ye, Xiangyuan Lan, Pong C. Yuen
After that, a robust and efficient top-k counts label prediction strategy is proposed to predict the labels of unlabeled image sequences.
Ranked #11 on Person Re-Identification on PRID2011
Representation Learning
Video-Based Person Re-Identification
no code implementations • ICCV 2017 • Mang Ye, Andy J. Ma, Liang Zheng, Jiawei Li, P C Yuen
Label estimation is an important component in an unsupervised person re-identification (re-ID) system.
Ranked #8 on Person Re-Identification on PRID2011