1 code implementation • 2 Apr 2025 • Chunhui Zhang, Li Liu, Jialin Gao, Xin Sun, Hao Wen, Xi Zhou, Shiming Ge, Yanfeng Wang
In this work, we propose COST, a contrastive one-stage transformer fusion framework for VL tracking, aiming to learn semantically consistent and unified VL representations.
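For intuition, the contrastive alignment at the heart of such a framework can be sketched as a symmetric InfoNCE loss over paired vision and language embeddings. This is a minimal sketch; the function names and temperature below are illustrative assumptions, not COST's actual implementation:

```python
import torch
import torch.nn.functional as F

def contrastive_vl_loss(vis_emb, lang_emb, temperature=0.07):
    """Symmetric InfoNCE loss pulling matched vision/language pairs together.

    vis_emb, lang_emb: (batch, dim) embeddings of paired visual regions and
    language descriptions; matched pairs share the same row index.
    """
    vis = F.normalize(vis_emb, dim=-1)
    lang = F.normalize(lang_emb, dim=-1)
    logits = vis @ lang.t() / temperature          # (batch, batch) similarities
    targets = torch.arange(vis.size(0), device=vis.device)
    # contrast in both directions: vision -> language and language -> vision
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))
```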
no code implementations • 6 Nov 2024 • Pengju Wang, Bochao Liu, Weijia Guo, Yong Li, Shiming Ge
By applying knowledge distillation, we effectively transfer global generalized knowledge and historical personalized knowledge to the local model, thus mitigating catastrophic forgetting and enhancing the general performance of personalized models.
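A minimal sketch of the distillation term such a transfer typically relies on, assuming standard Hinton-style soft-label distillation (the paper's exact losses for global and historical knowledge may differ):

```python
import torch.nn.functional as F

def distill_loss(student_logits, teacher_logits, T=2.0):
    """Soft-label distillation: the student matches the teacher's tempered
    class distribution (KL divergence scaled by T^2, as in Hinton et al.)."""
    soft_targets = F.softmax(teacher_logits / T, dim=-1)
    log_probs = F.log_softmax(student_logits / T, dim=-1)
    return F.kl_div(log_probs, soft_targets, reduction="batchmean") * (T * T)
```

In the setting described above, one could combine two such terms, one against the global model and one against the client's historical personalized model, to transfer both kinds of knowledge.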
no code implementations • 24 Sep 2024 • Pengju Wang, Bochao Liu, Dan Zeng, Chenggang Yan, Shiming Ge
These weights are then aggregated to create a global backbone, which is returned to each client for updating.
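The aggregation step can be sketched as FedAvg-style weighted averaging of client backbone weights (an assumption for illustration; the paper's aggregation rule may differ):

```python
import copy

def aggregate_backbones(client_states, client_sizes):
    """Weighted (FedAvg-style) averaging of client backbone weights.

    client_states: list of state_dicts holding each client's backbone weights.
    client_sizes:  number of local samples per client, used as weights.
    """
    total = float(sum(client_sizes))
    global_state = copy.deepcopy(client_states[0])
    for key in global_state:
        global_state[key] = sum(
            state[key].float() * (n / total)
            for state, n in zip(client_states, client_sizes)
        )
    return global_state  # broadcast back to every client for the next round
```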
1 code implementation • 20 Sep 2024 • Jianghu Lu, Shikun Li, Kexin Bao, Pengju Wang, Zhenxing Qian, Shiming Ge
Inspired by this, we propose a label-masking distillation approach termed FedLMD to facilitate federated learning via perceiving the various label distributions of each client.
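One plausible reading of label-masking distillation, sketched below: mask the teacher's logits on labels the client already sees often, so distillation supervises the student mainly on locally rare or absent labels. The masking rule and names here are illustrative, not FedLMD's exact formulation:

```python
import torch
import torch.nn.functional as F

def label_masking_distill(student_logits, teacher_logits, present_labels, T=2.0):
    """Distill mainly on classes that are rare or absent in local data:
    teacher logits for locally well-represented labels are masked out, so the
    teacher preserves knowledge the client cannot learn from its own data."""
    mask = torch.zeros_like(teacher_logits, dtype=torch.bool)
    mask[:, present_labels] = True
    masked = teacher_logits.masked_fill(mask, float("-inf"))
    soft = F.softmax(masked / T, dim=-1)               # zero on masked labels
    log_p = F.log_softmax(student_logits / T, dim=-1)
    # cross-entropy form; zero teacher mass contributes nothing to the loss
    return -(soft * log_p).sum(dim=-1).mean() * (T * T)
```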
no code implementations • 20 Sep 2024 • Yingying Hua, Shiming Ge, Daichi Zhang
After that, using the labels annotated by the deep network as teacher supervision, a linear student model is trained to approximate these annotations by mapping the synthetic images to classes.
no code implementations • 19 Sep 2024 • Bochao Liu, Jianghu Lu, Pengju Wang, Junjie Zhang, Dan Zeng, Zhenxing Qian, Shiming Ge
The main idea is generating synthetic data to learn a student that can mimic the ability of a teacher well-trained on private data.
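The basic loop of such data-free distillation can be sketched as follows, assuming a generator is available and setting aside the privacy mechanism the paper adds on top:

```python
import torch
import torch.nn.functional as F

def distill_on_synthetic(generator, teacher, student, optimizer,
                         steps=1000, batch=64, z_dim=128, device="cuda"):
    """Train a student to mimic a private-data teacher using only synthetic
    images drawn from a generator (a generic data-free distillation loop;
    the paper's losses and privacy accounting are more involved)."""
    teacher.eval()
    for _ in range(steps):
        z = torch.randn(batch, z_dim, device=device)
        x = generator(z)                       # synthetic images, no real data
        with torch.no_grad():
            t_logits = teacher(x)              # teacher supervision only
        s_logits = student(x)
        loss = F.kl_div(F.log_softmax(s_logits, dim=-1),
                        F.softmax(t_logits, dim=-1),
                        reduction="batchmean")
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```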
no code implementations • 19 Sep 2024 • Chenyu Li, Shiming Ge, Daichi Zhang, Jia Li
Many real-world applications today, such as video surveillance and urban governance, need to address the recognition of masked faces, where content replacement by diverse masks often brings incomplete appearance and ambiguous representation, leading to a sharp drop in accuracy.
no code implementations • 18 Sep 2024 • Shiming Ge, Zhao Luo, Chunhui Zhang, Yingying Hua, DaCheng Tao
However, these networks are too complex to represent a specific moving object, leading to poor generalization as well as high computational and memory costs.
no code implementations • 18 Sep 2024 • Shiming Ge, Shengwei Zhao, Chenyu Li, Yu Zhang, Jia Li
Face recognition in the wild is now advancing towards lightweight models, fast inference speed, and resolution-adapted capability.
no code implementations • 10 Sep 2024 • Junzheng Zhang, Weijia Guo, Bochao Liu, Ruixin Shi, Yong Li, Shiming Ge
After that, the discriminative representation distillation further considers a pretrained face recognizer as the discriminative teacher to supervise the learning of the student head via cross-resolution relational contrastive distillation.
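A relational variant of cross-resolution distillation can be sketched by matching the student's pairwise similarity structure on low-resolution faces to the teacher's on the high-resolution counterparts (a minimal illustration, not the paper's exact loss):

```python
import torch
import torch.nn.functional as F

def relational_contrastive_kd(student_lr_emb, teacher_hr_emb, temperature=0.1):
    """Cross-resolution relational distillation sketch: the student (fed
    low-resolution faces) learns a pairwise similarity structure matching
    that of the teacher (fed the high-resolution versions of the same batch).
    """
    s = F.normalize(student_lr_emb, dim=-1)
    t = F.normalize(teacher_hr_emb, dim=-1)
    s_rel = F.log_softmax(s @ s.t() / temperature, dim=-1)  # student relations
    t_rel = F.softmax(t @ t.t() / temperature, dim=-1)      # teacher relations
    return F.kl_div(s_rel, t_rel, reduction="batchmean")
```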
no code implementations • 9 Sep 2024 • Shiming Ge, Kangkai Zhang, Haolin Liu, Yingying Hua, Shengwei Zhao, Xin Jin, Hao Wen
In spite of the great success achieved by recent deep models in many image recognition tasks, directly applying them to recognize low-resolution images may suffer from low accuracy due to the loss of informative details during resolution degradation.
no code implementations • 4 Sep 2024 • Kangkai Zhang, Shiming Ge, Ruixin Shi, Dan Zeng
Recent studies have shown that knowledge distillation approaches can effectively transfer knowledge from a high-resolution teacher model to a low-resolution student model by aligning cross-resolution representations.
no code implementations • 4 Sep 2024 • Shiming Ge, Bochao Liu, Pengju Wang, Yong Li, Dan Zeng
In this work, we propose a discriminative-generative distillation approach to learn privacy-preserving deep models.
no code implementations • 3 Sep 2024 • Ruixin Shi, Weijia Guo, Shiming Ge
In this manner, the capability of recovering missing details of familiar low-resolution faces can be effectively enhanced, leading to a better knowledge transfer.
no code implementations • 27 Aug 2024 • Bochao Liu, Pengju Wang, Shiming Ge
Specifically, we first train a diffusion model as a teacher and then train a student by distillation, in which we achieve differential privacy by adding noise to the gradients that flow from other models to the student.
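The noisy-gradient mechanism is in the spirit of DP-SGD; a generic sketch of gradient sanitization (clip, average, add calibrated Gaussian noise) illustrates it, though the paper applies noise to gradients passed between models rather than this exact pipeline:

```python
import torch

def private_grad(per_sample_grads, clip_norm=1.0, noise_mult=1.0):
    """DP-SGD-style gradient sanitization: clip each per-example gradient to
    bound sensitivity, average, then add Gaussian noise scaled to the clip.

    per_sample_grads: (batch, dim) flattened per-example gradients.
    """
    norms = per_sample_grads.norm(dim=1, keepdim=True)
    scale = (clip_norm / norms).clamp(max=1.0)
    clipped = per_sample_grads * scale                # sensitivity <= clip_norm
    mean_grad = clipped.mean(dim=0)
    noise = (torch.randn_like(mean_grad)
             * noise_mult * clip_norm / per_sample_grads.size(0))
    return mean_grad + noise
```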
no code implementations • 15 Jul 2024 • Daichi Zhang, Zihao Xiao, Shikun Li, Fanzhao Lin, Jianmin Li, Shiming Ge
To this end, we propose to learn the Natural Consistency representation (NACO) of real face videos in a self-supervised manner, which is inspired by the observation that fake videos struggle to maintain the natural spatiotemporal consistency even under unknown forgery methods and different perturbations.
1 code implementation • 3 Jun 2024 • Hansong Zhang, Shikun Li, Fanzhao Lin, Weiping Wang, Zhenxing Qian, Shiming Ge
Specifically, from the inner-class view, we construct multiple "middle encoders" to perform pseudo long-term distribution alignment, making the condensed set a good proxy of the real one during the whole training process; while from the inter-class view, we use the expert models to perform distribution calibration, ensuring the synthetic data remains in the real class region during condensing.
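The inner-class alignment builds on distribution matching for dataset condensation; a bare-bones version aligns mean embeddings of real and synthetic samples of the same class under an encoder (the paper's "middle encoders" and expert calibration extend this base idea):

```python
import torch

def distribution_matching_loss(encoder, real_images, synth_images):
    """Distribution-matching sketch: pull the mean embedding of the condensed
    (synthetic) set toward that of real data of the same class, so the
    synthetic set stays a good proxy of the real distribution."""
    with torch.no_grad():
        real_mean = encoder(real_images).mean(dim=0)  # target statistics
    synth_mean = encoder(synth_images).mean(dim=0)    # differentiable w.r.t. pixels
    return ((real_mean - synth_mean) ** 2).sum()
```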
no code implementations • 27 May 2024 • Shiming Ge, Weijia Guo, Chenyu Li, Junzheng Zhang, Yong Li, Dan Zeng
First, we leverage a generative encoder pretrained for face inpainting and fine-tune it to represent masked faces as category-aware descriptors.
2 code implementations • 26 Dec 2023 • Hansong Zhang, Shikun Li, Pengju Wang, Dan Zeng, Shiming Ge
Nowadays, optimization-oriented methods have been the primary approach in the field of dataset condensation for achieving SOTA results.
2 code implementations • 12 Dec 2023 • Hansong Zhang, Shikun Li, Dan Zeng, Chenggang Yan, Shiming Ge
Moreover, we cluster the "annotator groups" who share similar expertise so that their confusion matrices can be corrected together.
1 code implementation • 22 Sep 2023 • Shikun Li, Xiaobo Xia, Hansong Zhang, Shiming Ge, Tongliang Liu
However, estimating multi-label noise transition matrices remains a challenging task, as most existing estimators in noisy multi-class learning rely on anchor points and accurate fitting of noisy class posteriors, which is hard to satisfy in noisy multi-label learning.
no code implementations • 9 Sep 2023 • Daichi Zhang, Zihao Xiao, Jianmin Li, Shiming Ge
Specifically, we first model the spatiotemporal patterns of face videos by incorporating a lightweight CNN to extract local spatial features of each frame and then cascading a vision transformer to learn the long-term spatiotemporal representations in latent space, which should contain more clues than the pixel space.
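A minimal sketch of this CNN-then-transformer cascade, with illustrative layer sizes (not the paper's exact architecture):

```python
import torch
import torch.nn as nn

class SpatioTemporalNet(nn.Module):
    """Per-frame CNN features cascaded into a transformer over time, mirroring
    the 'lightweight CNN + vision transformer' design described above."""
    def __init__(self, feat_dim=256, n_heads=4, n_layers=2):
        super().__init__()
        self.cnn = nn.Sequential(                      # local spatial features
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, feat_dim),
        )
        layer = nn.TransformerEncoderLayer(feat_dim, n_heads, batch_first=True)
        self.temporal = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(feat_dim, 2)             # real vs fake

    def forward(self, video):                          # video: (B, T, 3, H, W)
        b, t = video.shape[:2]
        frames = self.cnn(video.flatten(0, 1)).view(b, t, -1)
        z = self.temporal(frames)                      # long-term consistency
        return self.head(z.mean(dim=1))
```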
1 code implementation • 5 Jun 2023 • Shikun Li, Xiaobo Xia, Jiankang Deng, Shiming Ge, Tongliang Liu
In real-world crowd-sourcing scenarios, noise transition matrices are both annotator- and instance-dependent.
no code implementations • 18 May 2023 • Bochao Liu, Pengju Wang, Weijia Guo, Yong Li, Liansheng Zhuang, Weiping Wang, Shiming Ge
In this work, we present a new private generative modeling approach where samples are generated via Hamiltonian dynamics with gradients of the private dataset estimated by a well-trained network.
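The sampling step can be illustrated with a standard leapfrog integrator driven by a network that estimates the energy gradient (a sketch under that assumption; the paper's sampler details may differ):

```python
import torch

def leapfrog_sample(grad_net, x, step_size=0.01, n_steps=20):
    """One Hamiltonian trajectory: momenta are refreshed, then position and
    momentum are advanced by leapfrog integration using `grad_net`, a network
    estimating the energy gradient of the private data distribution."""
    p = torch.randn_like(x)                   # resample momentum
    p = p - 0.5 * step_size * grad_net(x)     # half step for momentum
    for _ in range(n_steps - 1):
        x = x + step_size * p                 # full step for position
        p = p - step_size * grad_net(x)       # full step for momentum
    x = x + step_size * p
    p = p - 0.5 * step_size * grad_net(x)     # final half step
    return x
```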
no code implementations • 14 Jul 2022 • Daichi Zhang, Fanzhao Lin, Yingying Hua, Pengju Wang, Dan Zeng, Shiming Ge
Existing image-level approaches often focus on a single frame and ignore the spatiotemporal cues hidden in deepfake videos, resulting in poor generalization and robustness.
1 code implementation • 12 Jun 2022 • Qichao Ying, Xiaoxiao Hu, Yangming Zhou, Zhenxing Qian, Dan Zeng, Shiming Ge
Representations from each view are separately used to coarsely predict the fidelity of the whole news, and the multimodal representations are able to predict the cross-modal consistency.
1 code implementation • 30 May 2022 • Chaojian Yu, Bo Han, Mingming Gong, Li Shen, Shiming Ge, Bo Du, Tongliang Liu
Based on these observations, we propose a robust perturbation strategy to constrain the extent of weight perturbation.
1 code implementation • 8 Mar 2022 • Shikun Li, Tongliang Liu, Jiyong Tan, Dan Zeng, Shiming Ge
This raises the following important question: how can we effectively use a small amount of trusted data to facilitate robust classifier learning from multiple annotators?
1 code implementation • CVPR 2022 • Shikun Li, Xiaobo Xia, Shiming Ge, Tongliang Liu
In the selection process, by measuring the agreement between learned representations and given labels, we first identify confident examples that are exploited to build confident pairs.
Ranked #11 on Image Classification on mini WebVision 1.0
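One simple way to measure the agreement between learned representations and given labels is k-nearest-neighbor label agreement, sketched below (the kNN rule and thresholds are illustrative assumptions, not necessarily the paper's criterion):

```python
import torch
import torch.nn.functional as F

def select_confident(embeddings, labels, k=10, agree_ratio=0.8):
    """Select examples whose given label agrees with the labels of their
    k nearest neighbors in representation space; these confident examples
    can then be exploited to build confident pairs."""
    z = F.normalize(embeddings, dim=-1)
    sim = z @ z.t()                                   # cosine similarities
    sim.fill_diagonal_(-2.0)                          # exclude self-matches
    knn = sim.topk(k, dim=-1).indices                 # (N, k) neighbor ids
    agree = (labels[knn] == labels.unsqueeze(1)).float().mean(dim=1)
    return torch.nonzero(agree >= agree_ratio).squeeze(1)
```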
1 code implementation • 19 Jan 2022 • Chunhui Zhang, Guanjie Huang, Li Liu, Shan Huang, Yinan Yang, Xiang Wan, Shiming Ge, DaCheng Tao
In this work, we propose WebUAV-3M, the largest public UAV tracking benchmark to date, to facilitate both the development and evaluation of deep UAV trackers.
no code implementations • 23 Aug 2021 • Jian Zhao, Gang Wang, Jianan Li, Lei Jin, Nana Fan, Min Wang, Xiaojuan Wang, Ting Yong, Yafeng Deng, Yandong Guo, Shiming Ge, Guodong Guo
The 2nd Anti-UAV Workshop \& Challenge aims to encourage research in developing novel and accurate methods for multi-scale object tracking.
no code implementations • 21 Jun 2021 • Yingying Hua, Daichi Zhang, Pengju Wang, Shiming Ge
The approach could make the face manipulation detection process transparent by embedding the feature whitening module.
no code implementations • 23 Mar 2021 • Kangkai Zhang, Chunhui Zhang, Shikun Li, Dan Zeng, Shiming Ge
Inspired by that, we propose an evolutionary knowledge distillation approach to improve the transfer effectiveness of teacher knowledge.
no code implementations • 31 Aug 2020 • Guanshuo Wang, Yufeng Yuan, Jiwei Li, Shiming Ge, Xi Zhou
Current stripe-based feature learning approaches have delivered impressive accuracy, but do not make a proper trade-off between diversity, locality, and robustness; they easily suffer from part semantic inconsistency due to the conflict between rigid partition and misalignment.
no code implementations • 9 Mar 2020 • Jialin Gao, Zhixiang Shi, Jiani Li, Guanshuo Wang, Yufeng Yuan, Shiming Ge, Xi Zhou
Accurate temporal action proposals play an important role in detecting actions from untrimmed videos.
no code implementations • 24 Dec 2019 • Jialin Gao, Tong He, Xi Zhou, Shiming Ge
A collection of approaches based on graph convolutional networks have proven success in skeleton-based action recognition by exploring neighborhood information and dense dependencies between intra-frame joints.
Ranked #47 on Skeleton Based Action Recognition on NTU RGB+D
2 code implementations • 11 Jul 2019 • Xin Jin, Le Wu, Geng Zhao, Xiao-Dong Li, Xiaokun Zhang, Shiming Ge, Dongqing Zou, Bin Zhou, Xinghui Zhou
This is a new formulation of image aesthetic assessment, which predicts aesthetic attribute captions together with the aesthetic score of each attribute.
no code implementations • 25 May 2019 • Yangru Huang, Peixi Peng, Yi Jin, Yidong Li, Junliang Xing, Shiming Ge
In this approach, a domain adaptive attention model is learned to separate the feature map into domain-shared part and domain-specific part.
no code implementations • 10 Apr 2019 • Jia Li, Kui Fu, Shengwei Zhao, Shiming Ge
In this approach, five components are involved: two teachers, two students, and the desired spatiotemporal model.
no code implementations • 9 Apr 2019 • Kui Fu, Peipei Shi, Yafei Song, Shiming Ge, Xiangju Lu, Jia Li
To address these issues, we design an extremely light-weight network with ultrafast speed, named UVA-Net.
no code implementations • 25 Nov 2018 • Shiming Ge, Shengwei Zhao, Chenyu Li, Jia Li
In this approach, a two-stream convolutional neural network (CNN) is first initialized to recognize high-resolution faces and resolution-degraded faces with a teacher stream and a student stream, respectively.
no code implementations • 23 Aug 2017 • Xin Jin, Yannan Li, Ningning Liu, Xiao-Dong Li, Xianggang Jiang, Chaoen Xiao, Shiming Ge
We propose a novel outdoor scene relighting method, which needs only a single reference image and is based on material constrained layer decomposition.
2 code implementations • 23 Aug 2017 • Xin Jin, Le Wu, Xiao-Dong Li, Siyu Chen, Siwei Peng, Jingying Chi, Shiming Ge, Chenggen Song, Geng Zhao
Thus, a novel CNN based on the Cumulative distribution with Jensen-Shannon divergence (CJS-CNN) is presented to predict the aesthetic score distribution of human ratings. A new reliability-sensitive learning method, based on the kurtosis of the score distribution, eliminates the requirement for the original full data of human ratings (without normalization).
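A sketch of a cumulative Jensen-Shannon-style loss, comparing the CDFs of the predicted and human score histograms (the paper's exact CJS formulation may differ):

```python
import torch

def cumulative_js_loss(pred_dist, true_dist, eps=1e-8):
    """Cumulative Jensen-Shannon loss sketch: compare the cumulative
    distributions of predicted and ground-truth aesthetic score histograms
    with a JS-style divergence.

    pred_dist, true_dist: (batch, n_bins) probability histograms over scores.
    """
    p = pred_dist.cumsum(dim=-1)              # predicted CDF
    q = true_dist.cumsum(dim=-1)              # ground-truth CDF
    m = 0.5 * (p + q)
    kl_pm = (p * ((p + eps) / (m + eps)).log()).sum(dim=-1)
    kl_qm = (q * ((q + eps) / (m + eps)).log()).sum(dim=-1)
    return (0.5 * (kl_pm + kl_qm)).mean()
```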
no code implementations • 9 Aug 2017 • Xin Jin, Shiming Ge, Chenggen Song
The experimental results reveal that our protocol can successfully retrieve the proper photos from the cloud server and protect the user photos and the face detector.
no code implementations • CVPR 2017 • Shiming Ge, Jia Li, Qiting Ye, Zhao Luo
Detecting masked faces (i.e., faces with occlusions) is a challenging task due to two main reasons: 1) the absence of large datasets of masked faces, and 2) the absence of facial cues from the masked regions.
no code implementations • 27 Feb 2017 • Xin Jin, Peng Yuan, Xiao-Dong Li, Chenggen Song, Shiming Ge, Geng Zhao, Yingya Chen
Only the base images are submitted randomly to the cloud server.
2 code implementations • 7 Oct 2016 • Xin Jin, Le Wu, Xiao-Dong Li, Xiaokun Zhang, Jingying Chi, Siwei Peng, Shiming Ge, Geng Zhao, Shuying Li
Thus, it is easy to use a pre-trained GoogLeNet for the large-scale image classification problem and fine-tune our connected layers on a large-scale database of aesthetics-related images, AVA, i.e., domain adaptation.
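The described fine-tuning recipe looks roughly like the following with torchvision (freezing the backbone is one common choice, not necessarily the paper's exact setup):

```python
import torch.nn as nn
from torchvision import models

# Load an ImageNet-pretrained GoogLeNet and replace the classifier head so
# only the new layers need training on the aesthetics data (AVA).
model = models.googlenet(weights=models.GoogLeNet_Weights.IMAGENET1K_V1)
for param in model.parameters():
    param.requires_grad = False                # keep pretrained features fixed
model.fc = nn.Linear(model.fc.in_features, 2)  # e.g., high vs low aesthetics
```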