no code implementations • ECCV 2020 • Pei-Pei Li, Huaibo Huang, Yibo Hu, Xiang Wu, Ran He, Zhenan Sun
To explore the age effects on facial images, we propose a Disentangled Adversarial Autoencoder (DAAE) to disentangle the facial images into three independent factors: age, identity and extraneous information.
1 code implementation • 15 Apr 2025 • Lijun Sheng, Jian Liang, Zilei Wang, Ran He
However, due to their inherent vulnerability and the common practice of selecting from a limited set of open-source models, VLMs suffer from a higher risk of adversarial attacks than traditional vision models.
no code implementations • 14 Apr 2025 • Yanbo Wang, Jiyang Guan, Jian Liang, Ran He
Multi-modal large language models (MLLMs) have made significant progress, yet their safety alignment remains limited.
no code implementations • 12 Feb 2025 • Qianrui Teng, Xing Cui, Xuannan Liu, Peipei Li, Zekun Li, Huaibo Huang, Ran He
Personalized text-to-image models allow users to generate images of new concepts from several reference photos, thereby leading to critical concerns regarding civil privacy.
no code implementations • 7 Feb 2025 • Yueying Zou, Peipei Li, Zekun Li, Huaibo Huang, Xing Cui, Xuannan Liu, Chenghanyu Zhang, Ran He
Despite significant progress in this field, there remains a gap in the literature regarding a comprehensive survey that examines the transition from domain-specific to general-purpose detection methods.
no code implementations • 26 Jan 2025 • Nan Gao, Jia Li, Huaibo Huang, Ke Shang, Ran He
Blind face restoration (BFR) is a highly challenging problem due to the uncertainty of data degradation patterns.
1 code implementation • 3 Jan 2025 • Chaoyou Fu, Haojia Lin, Xiong Wang, Yi-Fan Zhang, Yunhang Shen, Xiaoyu Liu, Haoyu Cao, Zuwei Long, Heting Gao, Ke Li, Long Ma, Xiawu Zheng, Rongrong Ji, Xing Sun, Caifeng Shan, Ran He
Recent Multimodal Large Language Models (MLLMs) have typically focused on integrating visual and textual modalities, with less emphasis placed on the role of speech in enhancing interaction.
no code implementations • 30 Dec 2024 • Jian Liang, Lijun Sheng, Hongmin Liu, Ran He
Given the potential risk of source data leakage via model inversion attacks, this paper introduces a novel setting called black-box domain adaptation, where the source model is accessible only through an API that provides the predicted label along with the corresponding confidence value for each query.
Source-Free Domain Adaptation
Unsupervised Domain Adaptation
no code implementations • 30 Dec 2024 • Zhengbo Wang, Jian Liang, Lijun Sheng, Ran He, Zilei Wang, Tieniu Tan
So far, efficient fine-tuning has become a popular strategy for enhancing the capabilities of foundation models on downstream tasks by learning plug-and-play modules.
2 code implementations • 30 Dec 2024 • Jiyang Guan, Jian Liang, Yanbo Wang, Ran He
Face recognition has witnessed remarkable advancements in recent years, thanks to the development of deep learning techniques. However, an off-the-shelf face recognition model as a commercial service could be stolen by model stealing attacks, posing great threats to the rights of the model owner. Model fingerprinting, as a model stealing detection method, aims to verify whether a suspect model is stolen from the victim model, gaining more and more attention nowadays. Previous methods always utilize transferable adversarial examples as the model fingerprint, but this method is known to be sensitive to adversarial defense and transfer learning techniques. To address this issue, we consider the pairwise relationship between samples instead and propose a novel yet simple model stealing detection method based on SAmple Correlation (SAC). Specifically, we present SAC-JC that selects JPEG compressed samples as model inputs and calculates the correlation matrix among their model outputs. Extensive results validate that SAC successfully defends against various model stealing attacks in deep face recognition, encompassing face verification and face emotion recognition, exhibiting the highest performance in terms of AUC, p-value and F1 score. Furthermore, we extend our evaluation of SAC-JC to object recognition datasets including Tiny-ImageNet and CIFAR10, which also demonstrates the superior performance of SAC-JC to previous methods. The code will be available at https://github.com/guanjiyang/SAC_JC.
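As a rough illustration of the sample-correlation idea in the abstract above, the sketch below compares two models through the correlation matrices of their outputs on the same inputs. It is a minimal sketch, not the authors' implementation: the function names, the choice of cosine similarity as the correlation measure, and the mean absolute difference as the distance are all assumptions, and the JPEG compression of the inputs is assumed to have been applied before the outputs were collected.

```python
import math

def correlation_matrix(outputs):
    """Pairwise cosine similarity between per-sample model outputs (assumed measure)."""
    norms = [math.sqrt(sum(v * v for v in row)) or 1.0 for row in outputs]
    n = len(outputs)
    return [
        [sum(a * b for a, b in zip(outputs[i], outputs[j])) / (norms[i] * norms[j])
         for j in range(n)]
        for i in range(n)
    ]

def correlation_distance(outputs_a, outputs_b):
    """Mean absolute difference between the two models' correlation matrices."""
    ca = correlation_matrix(outputs_a)
    cb = correlation_matrix(outputs_b)
    n = len(ca)
    return sum(abs(ca[i][j] - cb[i][j]) for i in range(n) for j in range(n)) / (n * n)

# Hypothetical outputs of three models on the same three inputs.
victim = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
rescaled_copy = [[2 * v for v in row] for row in victim]  # same pairwise structure
unrelated = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1]]

small = correlation_distance(victim, rescaled_copy)
large = correlation_distance(victim, unrelated)
```

Under these assumptions, a small distance suggests that the suspect model preserves the victim's pairwise sample structure. Any decision threshold would have to be calibrated separately and is not specified in the abstract.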
1 code implementation • 29 Nov 2024 • Shukang Yin, Chaoyou Fu, Sirui Zhao, Yunhang Shen, Chunjiang Ge, Yan Yang, Zuwei Long, Yuhan Dai, Tong Xu, Xing Sun, Ran He, Caifeng Shan, Enhong Chen
In this work, our study of these approaches harvests an effective data augmentation method.
3 code implementations • 22 Nov 2024 • Chaoyou Fu, Yi-Fan Zhang, Shukang Yin, Bo Li, Xinyu Fang, Sirui Zhao, Haodong Duan, Xing Sun, Ziwei Liu, Liang Wang, Caifeng Shan, Ran He
As a prominent direction of Artificial General Intelligence (AGI), Multimodal Large Language Models (MLLMs) have garnered increased attention from both industry and academia.
1 code implementation • 14 Nov 2024 • Xuannan Liu, Xing Cui, Peipei Li, Zekun Li, Huaibo Huang, Shuhan Xia, Miaoxuan Zhang, Yueying Zou, Ran He
Consequently, understanding the methods of jailbreak attacks and existing defense mechanisms is essential to ensure the safe deployment of multimodal generative models in real-world scenarios, particularly in security-sensitive applications.
1 code implementation • 12 Nov 2024 • Qihang Fan, Huaibo Huang, Ran He
The Softmax attention mechanism in Transformer models is notoriously computationally expensive, particularly due to its quadratic complexity, posing significant challenges in vision applications.
1 code implementation • 4 Nov 2024 • Yanyi Zhang, Binglin Qiu, Qi Jia, Yu Liu, Ran He
Most incremental learners excessively prioritize coarse classes of objects while neglecting various kinds of states (e.g., color and material) attached to the objects.
1 code implementation • 20 Oct 2024 • Yuang Ai, Huaibo Huang, Ran He
In the pre-training stage, we enhance the pre-trained CLIP model by introducing a simple mechanism that scales it to higher resolutions, allowing us to extract robust degradation representations that adaptively guide the IR network.
no code implementations • 19 Sep 2024 • Xiaotian Han, Yiren Jian, Xuefeng Hu, Haogeng Liu, Yiqi Wang, Qihang Fan, Yuang Ai, Huaibo Huang, Ran He, Zhenheng Yang, Quanzeng You
Pre-training on large-scale, high-quality datasets is crucial for enhancing the reasoning capabilities of Large Language Models (LLMs), especially in specialized domains such as mathematics.
1 code implementation • 10 Aug 2024 • Jin Liu, Huaibo Huang, Jie Cao, Ran He
To blend the Consistency Features extracted from both content and style images, we introduce a Style Enhancement Attention Control technique that meticulously merges content and style features within the attention space of the target image.
1 code implementation • 9 Aug 2024 • Chaoyou Fu, Haojia Lin, Zuwei Long, Yunhang Shen, Meng Zhao, Yifan Zhang, Shaoqi Dong, Xiong Wang, Di Yin, Long Ma, Xiawu Zheng, Ran He, Rongrong Ji, Yunsheng Wu, Caifeng Shan, Xing Sun
The remarkable multimodal capabilities and interactive experience of GPT-4o underscore their necessity in practical applications, yet open-source models rarely excel in both areas.
1 code implementation • 25 Jul 2024 • Zhengbo Wang, Jian Liang, Ran He, Zilei Wang, Tieniu Tan
And this low-rank gradient can be expressed in terms of the gradients of the two low-rank matrices in LoRA.
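The relation mentioned above can be made concrete with the standard chain-rule derivation for a LoRA-parameterized weight. This is a sketch under assumed notation ($W_0$, $B$, $A$, scaling factor $s$, learning rate $\eta$, and $G$ for the gradient with respect to the full weight) and may differ from the exact formulation used in the paper.

```latex
% Assumed notation: W = W_0 + s B A, with B in R^{m x r}, A in R^{r x n},
% and G = \partial L / \partial W the gradient with respect to the full weight.
\begin{align}
  g_B &= \frac{\partial L}{\partial B} = s\, G A^{\top}, \qquad
  g_A = \frac{\partial L}{\partial A} = s\, B^{\top} G, \\
  \Delta W &= s\,(B - \eta g_B)(A - \eta g_A) - s B A
            = -\eta\, s\,(g_B A + B g_A) + O(\eta^2), \\
  \tilde{g} &= s\,(g_B A + B g_A)
             = s^2 \left( G A^{\top} A + B B^{\top} G \right),
  \qquad \operatorname{rank}(\tilde{g}) \le 2r .
\end{align}
```

To first order in $\eta$, one gradient step on $A$ and $B$ therefore acts like a step on $W$ along $\tilde{g}$, which has rank at most $2r$.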
1 code implementation • 22 Jul 2024 • Yongcan Yu, Lijun Sheng, Ran He, Jian Liang
In particular, the memory bank is dynamically updated by selecting low-entropy and label-consistent samples in a class-balanced manner.
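A memory-bank update of the kind described above might look like the following minimal sketch. The data layout, the function names, the `per_class` budget, and the reading of "label-consistent" as "the model's prediction agrees with an auxiliary label" are all assumptions made for illustration, not details taken from the paper.

```python
import math
from collections import defaultdict

def entropy(probs):
    """Shannon entropy of a probability vector (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def update_memory_bank(bank, batch, per_class=2):
    """Add label-consistent samples, then keep the lowest-entropy ones per class.

    bank:  dict mapping class index -> list of (entropy, sample_id)
    batch: list of (sample_id, probs, aux_label); aux_label is a hypothetical
           second opinion, e.g. the prediction on an augmented view.
    """
    for sample_id, probs, aux_label in batch:
        pred = max(range(len(probs)), key=probs.__getitem__)
        if pred != aux_label:  # skip label-inconsistent samples
            continue
        bank[pred].append((entropy(probs), sample_id))
    for cls in bank:  # class-balanced: same budget for every class
        bank[cls] = sorted(bank[cls])[:per_class]
    return bank

bank = update_memory_bank(
    defaultdict(list),
    [
        ("a", [0.90, 0.10], 0),
        ("b", [0.60, 0.40], 0),
        ("c", [0.99, 0.01], 0),
        ("d", [0.20, 0.80], 0),  # predicted class 1, aux label 0: dropped
        ("e", [0.30, 0.70], 1),
    ],
)
```

Capping each class at the same budget is one simple way to keep the bank class-balanced. Other balancing schemes would fit the description equally well.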
1 code implementation • 3 Jun 2024 • Shaoshu Yang, Yong Zhang, Xiaodong Cun, Ying Shan, Ran He
Previous methods promote the frame rate by either training a video interpolation model in pixel space as a postprocessing stage or training an interpolation model in latent space for a specific base video model.
1 code implementation • 28 May 2024 • Haogeng Liu, Quanzeng You, Xiaotian Han, Yongfei Liu, Huaibo Huang, Ran He, Hongxia Yang
In the realm of Multimodal Large Language Models (MLLMs), the vision-language connector plays a crucial role in linking the pre-trained vision encoders with Large Language Models (LLMs).
no code implementations • 22 May 2024 • Qihang Fan, Huaibo Huang, Mingrui Chen, Ran He
In recent years, Transformers have achieved remarkable progress in computer vision tasks.
no code implementations • 22 May 2024 • Qihang Fan, Huaibo Huang, Mingrui Chen, Ran He
The Vision Transformer (ViT) has gained prominence for its superior relational modeling prowess.
no code implementations • 9 Apr 2024 • Zhida Zhang, Jie Cao, Wenkui Yang, Qihang Fan, Kai Zhou, Ran He
The transformer networks are extensively utilized in face forgery detection due to their scalability across large datasets. Despite their success, transformers face challenges in balancing the capture of global context, which is crucial for unveiling forgery clues, with computational complexity. To mitigate this issue, we introduce Band-Attention modulated RetNet (BAR-Net), a lightweight network designed to efficiently process extensive visual contexts while avoiding catastrophic forgetting. Our approach empowers the target token to perceive global information by assigning differential attention levels to tokens at varying distances.
no code implementations • 27 Mar 2024 • Qihang Fan, Quanzeng You, Xiaotian Han, Yongfei Liu, Yunzhe Tao, Huaibo Huang, Ran He, Hongxia Yang
Firstly, we propose a novel module for dynamic resolution adjustment, designed with a single Transformer block, specifically to achieve highly efficient incremental token integration.
no code implementations • 15 Mar 2024 • Nan Gao, Jia Li, Huaibo Huang, Zhi Zeng, Ke Shang, Shuwu Zhang, Ran He
Experimental results demonstrate the superiority of DiffMAC over state-of-the-art methods, with a high degree of generalization in real-world and heterogeneous settings.
no code implementations • 9 Mar 2024 • Yanyi Zhang, Qi Jia, Xin Fan, Yu Liu, Ran He
Inspired by this, we propose a novel A-O disentangled framework for CZSL, namely Class-specified Cascaded Network (CSCNet).
no code implementations • 3 Mar 2024 • Haogeng Liu, Quanzeng You, Xiaotian Han, Yiqi Wang, Bohan Zhai, Yongfei Liu, Yunzhe Tao, Huaibo Huang, Ran He, Hongxia Yang
Multimodal Large Language Models (MLLMs) have experienced significant advancements recently.
Ranked #120 on Visual Question Answering on MM-Vet
1 code implementation • 6 Feb 2024 • Zhengbo Wang, Jian Liang, Ran He, Zilei Wang, Tieniu Tan
This paper proposes a Collaborative Fine-Tuning (CraFT) approach for fine-tuning black-box VLMs to downstream tasks, where one only has access to the input prompts and the output predictions of the model.
1 code implementation • 6 Feb 2024 • Zhengbo Wang, Jian Liang, Lijun Sheng, Ran He, Zilei Wang, Tieniu Tan
Extensive results on 17 datasets validate that our method surpasses or achieves comparable results with state-of-the-art methods on few-shot classification, imbalanced learning, and out-of-distribution generalization.
1 code implementation • 5 Feb 2024 • Yanbo Wang, Jian Liang, Ran He
Even for single-image reconstruction, we still lack an analysis-based algorithm to recover augmented soft labels.
1 code implementation • 4 Jan 2024 • Kuangpu Guo, Yuhe Ding, Jian Liang, Ran He, Zilei Wang, Tieniu Tan
As minority classes suffer from worse accuracy due to overfitting on local imbalanced data, prior methods often incorporate class-balanced learning techniques during local training.
no code implementations • CVPR 2024 • Jiyang Guan, Jian Liang, Ran He
In this paper we investigate the possibility of defending against backdoor attacks by utilizing test-time partially poisoned data to remove the backdoor from the model.
no code implementations • CVPR 2024 • Yuang Ai, Huaibo Huang, Xiaoqiang Zhou, Jiexiang Wang, Ran He
Extensive experiments on 16 IR tasks underscore the superiority of MPerceiver in terms of adaptiveness, generalizability and fidelity.
no code implementations • 5 Dec 2023 • Yuang Ai, Huaibo Huang, Xiaoqiang Zhou, Jiexiang Wang, Ran He
Extensive experiments on 16 IR tasks underscore the superiority of MPerceiver in terms of adaptiveness, generalizability and fidelity.
1 code implementation • 3 Dec 2023 • Jin Liu, Huaibo Huang, Chao Jin, Ran He
Face stylization refers to the transformation of a face into a specific portrait style.
no code implementations • 28 Nov 2023 • Siyu Xing, Jie Cao, Huaibo Huang, Xiao-Yu Zhang, Ran He
First, we propose a coupling strategy to straighten trajectories, creating couplings between image and noise samples under diffusion model guidance.
1 code implementation • 11 Oct 2023 • Yingqing He, Shaoshu Yang, Haoxin Chen, Xiaodong Cun, Menghan Xia, Yong Zhang, Xintao Wang, Ran He, Qifeng Chen, Ying Shan
Our work also suggests that a pre-trained diffusion model trained on low-resolution images can be directly used for high-resolution visual generation without further tuning, which may provide insights for future research on ultra-high-resolution image and video synthesis.
no code implementations • 8 Oct 2023 • Haogeng Liu, Qihang Fan, Tingkai Liu, Linjie Yang, Yunzhe Tao, Huaibo Huang, Ran He, Hongxia Yang
This paper proposes Video-Teller, a video-language foundation model that leverages multi-modal fusion and fine-grained modality alignment to significantly enhance the video-to-text generation task.
1 code implementation • 8 Oct 2023 • Tingkai Liu, Yunzhe Tao, Haogeng Liu, Qihang Fan, Ding Zhou, Huaibo Huang, Ran He, Hongxia Yang
Finally, we benchmarked a wide range of current video-language models on DeVAn, and we aim for DeVAn to serve as a useful evaluation set in the age of large language models and complex multi-modal tasks.
1 code implementation • 6 Oct 2023 • Junchi Yu, Ran He, Rex Ying
These analogous problems are related to the input one, with reusable solutions and problem-solving strategies.
1 code implementation • CVPR 2024 • Qihang Fan, Huaibo Huang, Mingrui Chen, Hongmin Liu, Ran He
To alleviate these issues, we draw inspiration from the recent Retentive Network (RetNet) in the field of NLP, and propose RMT, a strong vision backbone with explicit spatial prior for general purposes.
no code implementations • 31 Aug 2023 • Linsen Song, Wayne Wu, Chaoyou Fu, Chen Change Loy, Ran He
Existing automated dubbing methods are usually designed for Professionally Generated Content (PGC) production, which requires massive training data and training time to learn a person-specific audio-video mapping.
no code implementations • 29 Aug 2023 • Haichao Shi, Mandi Luo, Xiao-Yu Zhang, Ran He
Visible-Infrared person re-identification (VI-ReID) is an important and challenging task in intelligent video surveillance.
1 code implementation • 24 Aug 2023 • Jian Liang, Lijun Sheng, Zhengbo Wang, Ran He, Tieniu Tan
The emergence of vision-language models, such as CLIP, has spurred a significant research effort towards their application for downstream supervised learning tasks.
1 code implementation • ICCV 2023 • Zhengbo Wang, Jian Liang, Ran He, Nan Xu, Zilei Wang, Tieniu Tan
Thereafter, we fine-tune CLIP with off-the-shelf methods by combining labeled and synthesized features.
1 code implementation • ICCV 2023 • Yuting Xu, Jian Liang, Gengyun Jia, Ziming Yang, Yanhao Zhang, Ran He
This paper introduces a simple yet effective strategy named Thumbnail Layout (TALL), which transforms a video clip into a pre-defined layout to realize the preservation of spatial and temporal dependencies.
1 code implementation • 6 Jul 2023 • Yongcan Yu, Lijun Sheng, Ran He, Jian Liang
To implement this benchmark, we have developed a unified framework in PyTorch, which allows for consistent evaluation and comparison of the TTA methods across the different datasets and network architectures.
2 code implementations • NeurIPS 2023 • Rui Wang, Peipei Li, Huaibo Huang, Chunshui Cao, Ran He, Zhaofeng He
Consequently, we propose a cross-modal ordinal pairwise loss to refine the CLIP feature space, where texts and images maintain both semantic alignment and ordering alignment.
no code implementations • 9 Jun 2023 • Haogeng Liu, Tao Wang, Jie Cao, Ran He, JianHua Tao
When decreasing the number of sampling steps (i.e., the number of line segments used to fit the path), the ease of fitting straight lines compared to curves allows us to generate higher quality samples from random noise with fewer iterations.
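The point about line segments can be illustrated with a toy Euler integrator. This is only a scalar illustration under assumed velocity fields, not the model from the paper: when the path is straight (constant velocity), a single segment reproduces the endpoint exactly, whereas a curved path needs many segments.

```python
import math

def euler_sample(x0, velocity, steps):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with `steps` line segments."""
    x = x0
    dt = 1.0 / steps
    for i in range(steps):
        x = x + velocity(x, i * dt) * dt
    return x

def straight(x, t):
    """Constant velocity: the trajectory is a straight line."""
    return 3.0

def curved(x, t):
    """Velocity depends on x: the exact solution x0 * e^t is curved."""
    return x

one_step_straight = euler_sample(0.0, straight, steps=1)  # exact endpoint 3.0
one_step_curved = euler_sample(1.0, curved, steps=1)      # 2.0, far from e
many_step_curved = euler_sample(1.0, curved, steps=1000)  # close to e
curved_error = abs(one_step_curved - math.e)
```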
1 code implementation • NeurIPS 2023 • Qihang Fan, Huaibo Huang, Xiaoqiang Zhou, Ran He
This paper proposes a Fully Adaptive Self-Attention (FASA) mechanism for vision transformer to model the local and global information as well as the bidirectional interaction between them in context-aware ways.
1 code implementation • 31 Mar 2023 • Qihang Fan, Huaibo Huang, Jiyang Guan, Ran He
The combination of the AttnConv and vanilla attention which uses pooling to reduce FLOPs in CloFormer enables the model to perceive high-frequency and low-frequency information.
Ranked #622 on Image Classification on ImageNet
1 code implementation • CVPR 2024 • Yuang Ai, Xiaoqiang Zhou, Huaibo Huang, Lei Zhang, Ran He
Unsupervised Domain Adaptation (UDA) can effectively address domain gap issues in real-world image Super-Resolution (SR) by accessing both the source and target data.
1 code implementation • CVPR 2023 • Junchi Yu, Jian Liang, Ran He
Recent works employ different graph editions to generate augmented environments and learn an invariant GNN for generalization.
1 code implementation • 27 Mar 2023 • Jian Liang, Ran He, Tieniu Tan
Test-time adaptation (TTA), an emerging paradigm, has the potential to adapt a pre-trained model to unlabeled data during testing, before making predictions.
1 code implementation • 22 Mar 2023 • Puning Yang, Jian Liang, Jie Cao, Ran He
Out-of-distribution (OOD) detection is a crucial aspect of deploying machine learning models in open-world applications.
no code implementations • ICCV 2023 • Peipei Li, Rui Wang, Huaibo Huang, Ran He, Zhaofeng He
Face aging is an ill-posed problem because multiple plausible aging patterns may correspond to a given input.
1 code implementation • ICCV 2023 • Lijun Sheng, Jian Liang, Ran He, Zilei Wang, Tieniu Tan
To address this issue, we propose a model preprocessing framework, named AdaptGuard, to improve the security of model adaptation algorithms.
1 code implementation • 17 Mar 2023 • Yuhe Ding, Jian Liang, Jie Cao, Aihua Zheng, Ran He
Briefly, MODIFY first trains a generative model in the target domain and then translates a source input to the target domain via the provided style model.
2 code implementations • 2023 • Ziming Yang, Jian Liang, Yuting Xu, Xiao-Yu Zhang, Ran He
A relation learning module masks partial correlations between regions to reduce redundancy and then propagates the relational information across regions to capture the irregularity from a global view of the graph.
1 code implementation • 9 Feb 2023 • Yuhe Ding, Jian Liang, Bo Jiang, Aihua Zheng, Ran He
Existing cross-domain keypoint detection methods always require accessing the source data during adaptation, which may violate the data privacy law and pose serious security concerns.
no code implementations • ICCV 2023 • Xiaoqiang Zhou, Huaibo Huang, Ran He, Zilei Wang, Jie Hu, Tieniu Tan
In particular, self-attention with cross-scale matching and convolution filters with different kernel sizes are designed to exploit the multi-scale features in images.
1 code implementation • CVPR 2023 • Huaibo Huang, Xiaoqiang Zhou, Jie Cao, Ran He, Tieniu Tan
STA decomposes vanilla global attention into multiplications of a sparse association map and a low-dimensional attention, leading to high efficiency in capturing global dependencies.
no code implementations • 27 Oct 2022 • Jie Cao, Mandi Luo, Junchi Yu, Ming-Hsuan Yang, Ran He
Then, we optimize the augmented samples by minimizing the norms of the data scores, i.e., the gradients of the log-density functions.
1 code implementation • 21 Oct 2022 • Jiyang Guan, Jian Liang, Ran He
To reduce the training time, we further develop SAC-m that selects CutMix Augmented samples as model inputs, without the need for training the surrogate models or generating adversarial examples.
1 code implementation • 11 Oct 2022 • Zi Wang, Huaibo Huang, Aihua Zheng, Chenglong Li, Ran He
To alleviate these two issues, we propose a simple yet effective method with Parallel Augmentation and Dual Enhancement (PADE), which is robust on both occluded and non-occluded data and does not require any auxiliary clues.
no code implementations • 19 Jun 2022 • Junchi Yu, Jian Liang, Ran He
Extensive experiments on both node-level and graph-level benchmarks show that the proposed DPS achieves impressive performance for various graph domain generalization tasks.
2 code implementations • 29 May 2022 • Yuhe Ding, Lijun Sheng, Jian Liang, Aihua Zheng, Ran He
First of all, to avoid additional parameters and explore the information in the source model, ProxyMix defines the weights of the classifier as the class prototypes and then constructs a class-balanced proxy source domain by the nearest neighbors of the prototypes to bridge the unseen source domain and the target domain.
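The prototype-and-nearest-neighbor step described above can be sketched as follows. This is a minimal illustration, not the ProxyMix code: the function names, the use of cosine similarity, and the per-class budget `k` are assumptions, and features are plain Python lists rather than network activations.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors (zero vectors give 0)."""
    nu = math.sqrt(sum(a * a for a in u)) or 1.0
    nv = math.sqrt(sum(b * b for b in v)) or 1.0
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)

def build_proxy_source(classifier_weights, target_features, k=2):
    """Treat each classifier weight row as a class prototype and pick the
    k target samples closest to it, giving a class-balanced proxy set.

    Returns a dict mapping class index -> list of target sample indices.
    """
    proxy = {}
    for cls, prototype in enumerate(classifier_weights):
        ranked = sorted(
            range(len(target_features)),
            key=lambda i: cosine(prototype, target_features[i]),
            reverse=True,
        )
        proxy[cls] = ranked[:k]
    return proxy

# Hypothetical two-class classifier and five target features.
weights = [[1.0, 0.0], [0.0, 1.0]]
features = [[0.9, 0.1], [0.1, 0.9], [0.8, 0.3], [0.2, 0.7], [0.5, 0.5]]
proxy = build_proxy_source(weights, features, k=2)
```

Because every class receives exactly `k` samples, the proxy set is class-balanced by construction, and no extra parameters beyond the source classifier are needed.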
no code implementations • 20 May 2022 • Bingzhe Wu, Jintang Li, Junchi Yu, Yatao Bian, Hengtong Zhang, Chaochao Chen, Chengbin Hou, Guoji Fu, Liang Chen, Tingyang Xu, Yu Rong, Xiaolin Zheng, Junzhou Huang, Ran He, Baoyuan Wu, Guangyu Sun, Peng Cui, Zibin Zheng, Zhe Liu, Peilin Zhao
Deep graph learning has achieved remarkable progress in both business and scientific areas ranging from finance and e-commerce, to drug and advanced material discovery.
no code implementations • 2 Mar 2022 • Jia Li, Jie Cao, Junxian Duan, Ran He
We propose a new challenging task namely IDentity Stylization (IDS) across heterogeneous domains.
no code implementations • CVPR 2022 • Gengyun Jia, Huaibo Huang, Chaoyou Fu, Ran He
In this paper, we regard image cropping as a set prediction problem.
no code implementations • CVPR 2022 • Jiyang Guan, Zhuozhuo Tu, Ran He, DaCheng Tao
Deep neural networks have achieved impressive performance in a variety of tasks over the last decade, such as autonomous driving, face recognition, and medical diagnosis.
no code implementations • 18 Dec 2021 • Junchi Yu, Tingyang Xu, Ran He
In this work, we address these key challenges and propose IFEXPLAINER, which generates a necessary and sufficient explanation for GNNs.
1 code implementation • CVPR 2022 • Junchi Yu, Jie Cao, Ran He
Subgraph recognition aims at discovering a compressed substructure of a graph that is most informative to the graph property.
no code implementations • 16 Dec 2021 • Jian Liang, Dapeng Hu, Jiashi Feng, Ran He
To achieve bilateral adaptation in the target domain, we further maximize localized mutual information to align known samples with the source classifier and employ an entropic loss to push unknown samples far away from the source classification boundary, respectively.
Ranked #7 on Universal Domain Adaptation on VisDA2017
no code implementations • 20 Oct 2021 • Jianze Wei, Huaibo Huang, Muyi Sun, Yunlong Wang, Min Ren, Ran He, Zhenan Sun
To make further efforts on accurate and reliable iris segmentation, we propose a bilateral self-attention module and design Bilateral Transformer (BiTrans) with hierarchical architecture by exploring spatial and visual relationships.
no code implementations • 4 Oct 2021 • Gege Gao, Huaibo Huang, Chaoyou Fu, Ran He
Human face synthesis involves transferring knowledge about the identity and identity-dependent face shape (IDFS) of a human face to target face images where the context (e.g., facial expressions, head poses, and other background factors) may change dramatically.
no code implementations • 3 Oct 2021 • Jia Li, Huaibo Huang, Xiaofei Jia, Ran He
Blind face restoration (BFR) is a challenging problem because of the uncertainty of the degradation patterns.
no code implementations • CVPR 2021 • Jia Li, Zhaoyang Li, Jie Cao, Xingguang Song, Ran He
In this work, we propose a novel two-stage framework named FaceInpainter to implement controllable Identity-Guided Face Inpainting (IGFI) under heterogeneous domains.
no code implementations • CVPR 2021 • Huaibo Huang, Aijing Yu, Ran He
To address this issue, we propose a memory-oriented semi-supervised (MOSS) method which enables the network to explore and exploit the properties of rain streaks from both synthetic and real data.
no code implementations • CVPR 2021 • Linsen Song, Wayne Wu, Chaoyou Fu, Chen Qian, Chen Change Loy, Ran He
We present a new application direction named Pareidolia Face Reenactment, which is defined as animating a static illusory face to move in tandem with a human face in the video.
1 code implementation • CVPR 2021 • Gege Gao, Huaibo Huang, Chaoyou Fu, Zhaoyang Li, Ran He
In this work, we propose a novel information disentangling and swapping network, called InfoSwap, to extract the most expressive information for identity representation from a pre-trained face recognition model.
1 code implementation • 26 May 2021 • Si Liu, Wentao Jiang, Chen Gao, Ran He, Jiashi Feng, Bo Li, Shuicheng Yan
In this paper, we address the makeup transfer and removal tasks simultaneously, which aim to transfer the makeup from a reference image to a source image and remove the makeup from the with-makeup image respectively.
1 code implementation • 7 Apr 2021 • Linsen Song, Wayne Wu, Chaoyou Fu, Chen Qian, Chen Change Loy, Ran He
We present a new application direction named Pareidolia Face Reenactment, which is defined as animating a static illusory face to move in tandem with a human face in the video.
3 code implementations • CVPR 2022 • Jian Liang, Dapeng Hu, Jiashi Feng, Ran He
To ease the burden of labeling, unsupervised domain adaptation (UDA) aims to transfer knowledge in previous and related labeled datasets (sources) to a new unlabeled dataset (target).
1 code implementation • CVPR 2021 • Jie Cao, Luanxuan Hou, Ming-Hsuan Yang, Ran He, Zhenan Sun
We interpolate training samples at the feature level and propose a novel content loss based on the perceptual relations among samples.
no code implementations • 20 Mar 2021 • Junchi Yu, Tingyang Xu, Yu Rong, Yatao Bian, Junzhou Huang, Ran He
The emergence of Graph Convolutional Network (GCN) has greatly boosted the progress of graph learning.
1 code implementation • ICCV 2021 • Chaoyou Fu, Yibo Hu, Xiang Wu, Hailin Shi, Tao Mei, Ran He
Visible-Infrared person re-identification (VI-ReID) aims to match cross-modality pedestrian images, breaking through the limitation of single-modality person ReID in dark environment.
2 code implementations • 14 Dec 2020 • Jian Liang, Dapeng Hu, Yunbo Wang, Ran He, Jiashi Feng
Furthermore, we propose a new labeling transfer strategy, which separates the target data into two splits based on the confidence of predictions (labeling information), and then employs semi-supervised learning to improve the accuracy of less-confident predictions in the target domain.
Ranked #5 on Source-Free Domain Adaptation on VisDA-2017
no code implementations • 10 Nov 2020 • Yuhe Ding, Xin Ma, Mandi Luo, Aihua Zheng, Ran He
Considering the intuitive artifacts in the existing methods, we propose a contrastive style loss for style rendering to enforce the similarity between the style of rendered photo and the caricature, and simultaneously enhance its discrepancy to the photos.
no code implementations • 5 Nov 2020 • Hao Zhu, Yi Li, Feixia Zhu, Aihua Zheng, Ran He
We propose a new task named Audio-driven Performance Video Generation (APVG), which aims to synthesize the video of a person playing a certain instrument guided by a given music audio clip.
no code implementations • NeurIPS 2020 • Hao Zhu, Chaoyou Fu, Qianyi Wu, Wayne Wu, Chen Qian, Ran He
However, due to the lack of Deepfakes datasets with large variance in appearance, which can be hardly produced by recent identity swapping methods, the detection algorithm may fail in this situation.
Ranked #7 on Visual Object Tracking on DiDi
no code implementations • 29 Oct 2020 • Xin Ma, Xiaoqiang Zhou, Huaibo Huang, Zhenhua Chai, Xiaolin Wei, Ran He
It is difficult for encoders to capture such powerful representations under this complex situation.
no code implementations • 26 Oct 2020 • Luanxuan Hou, Jie Cao, Yuan Zhao, Haifeng Shen, Jian Tang, Ran He
We propose a refinement stage for the pyramid features to further boost the accuracy of our network.
1 code implementation • ICLR 2021 • Junchi Yu, Tingyang Xu, Yu Rong, Yatao Bian, Junzhou Huang, Ran He
In this paper, we propose a framework of Graph Information Bottleneck (GIB) for the subgraph recognition problem in deep graph learning.
1 code implementation • 20 Sep 2020 • Chaoyou Fu, Xiang Wu, Yibo Hu, Huaibo Huang, Ran He
As a consequence, massive new diverse paired heterogeneous images with the same identity can be generated from noises.
no code implementations • 17 Sep 2020 • Chaoyou Fu, Guoli Wang, Xiang Wu, Qian Zhang, Ran He
It embodies the uncertainty of the hashing network to the corresponding input image.
1 code implementation • ECCV 2020 • Yibo Hu, Xiang Wu, Ran He
In this paper, we rethink three freedoms of differentiable NAS, i.e., operation-level, depth-level and width-level, and propose a novel method, named Three-Freedom NAS (TF-NAS), to achieve both good classification accuracy and precise latency constraint.
no code implementations • 2 Jun 2020 • Chen Gao, Si Liu, Ran He, Shuicheng Yan, Bo Li
LGR module utilizes body skeleton knowledge to construct a layout graph that connects all relevant part features, where graph reasoning mechanism is used to propagate information among part nodes to mine their relations.
no code implementations • 20 Apr 2020 • Yi Li, Huaibo Huang, Junchi Yu, Ran He, Tieniu Tan
Face verification aims at determining whether a pair of face images belongs to the same identity.
no code implementations • 17 Mar 2020 • Luanxuan Hou, Jie Cao, Yuan Zhao, Haifeng Shen, Yiping Meng, Ran He, Jieping Ye
Finally, we propose a differentiable auto data augmentation method to further improve estimation accuracy.
1 code implementation • ECCV 2020 • Jian Liang, Yunbo Wang, Dapeng Hu, Ran He, Jiashi Feng
On one hand, negative transfer results in misclassification of target samples to the classes only present in the source domain.
Ranked #2 on Partial Domain Adaptation on ImageNet-Caltech
no code implementations • 15 Jan 2020 • Linsen Song, Wayne Wu, Chen Qian, Ran He, Chen Change Loy
The audio-translated expression parameters are then used to synthesize a photo-realistic human subject in each video frame, with the movement of the mouth regions precisely mapped to the source audio.
no code implementations • 14 Jan 2020 • Hao Zhu, Mandi Luo, Rui Wang, Aihua Zheng, Ran He
Audio-visual learning, aimed at exploiting the relationship between audio and visual modalities, has drawn considerable attention since deep learning started to be used successfully.
no code implementations • ECCV 2020 • Jie Cao, Huaibo Huang, Yi Li, Ran He, Zhenan Sun
The performance of multi-domain image-to-image translation has been significantly improved by recent progress in deep generative models.
no code implementations • 21 Dec 2019 • Xin Ma, Yi Li, Huaibo Huang, Mandi Luo, Ran He
Real-world image super-resolution (SR) is a challenging image translation problem.
no code implementations • 17 Dec 2019 • Aijing Yu, Haoxue Wu, Huaibo Huang, Zhen Lei, Ran He
A spectral conditional attention module is introduced to reduce the domain gap between NIR and VIS data and then improve the performance of NIR-VIS heterogeneous face recognition on various databases including the LAMP-HQ.
no code implementations • NeurIPS 2019 • Chaoyou Fu, Xiang Wu, Yibo Hu, Huaibo Huang, Ran He
Specifically, we first introduce a dual variational autoencoder to represent a joint distribution of paired heterogeneous images.
no code implementations • 22 Nov 2019 • Ran He, Karthik Gopinath, Christian Desrosiers, Herve Lombaert
Our alignment and graph processing method provides a fast analysis of brain surfaces.
1 code implementation • CVPR 2020 • Wentao Jiang, Si Liu, Chen Gao, Jie Cao, Ran He, Jiashi Feng, Shuicheng Yan
In this paper, we address the makeup transfer task, which aims to transfer the makeup from a reference image to a source image.
no code implementations • 12 Sep 2019 • Huseyin Uzunalioglu, Jin Cao, Chitra Phadke, Gerald Lehmann, Ahmet Akyamac, Ran He, Jeongran Lee, Maria Able
Conversion of raw data into insights and knowledge requires substantial amounts of effort from data scientists.
no code implementations • CVPR 2020 • Boyan Duan, Chaoyou Fu, Yi Li, Xingguang Song, Ran He
The cross-sensor gap is one of the challenges that have attracted much research interest in Heterogeneous Face Recognition (HFR).
no code implementations • ICCV 2019 • Shengju Qian, Kwan-Yee Lin, Wayne Wu, Yangxiaokang Liu, Quan Wang, Fumin Shen, Chen Qian, Ran He
Recent studies have shown remarkable success in face manipulation tasks with the advance of GAN and VAE paradigms, but the outputs are sometimes limited to low resolution and lack diversity.
no code implementations • 4 Aug 2019 • Gengyun Jia, Pei-Pei Li, Ran He
RoM pooling pools image features and discards extra padded features to eliminate the side effects of padding.
no code implementations • 22 May 2019 • Hongchao Li, Xianmin Lin, Aihua Zheng, Chenglong Li, Bin Luo, Ran He, Amir Hussain
In particular, our network is end-to-end trained and contains three subnetworks of deep features embedded by the corresponding attributes (i.e., camera view, vehicle type and vehicle color).
no code implementations • 14 Apr 2019 • Jie Cao, Huaibo Huang, Yi Li, Jingtuo Liu, Ran He, Zhenan Sun
In this work, we present a novel training framework for GANs, namely biphasic learning, to achieve image-to-image translation in multiple visual domains at $1024^2$ resolution.
4 code implementations • 31 Mar 2019 • Zhihang Li, Xu Tang, Junyu Han, Jingtuo Liu, Ran He
With the rapid development of deep convolutional neural networks, face detection has made great progress in recent years.
no code implementations • 30 Mar 2019 • Pei-Pei Li, Xiang Wu, Yibo Hu, Ran He, Zhenan Sun
In this paper, a new large-scale Multi-yaw Multi-pitch high-quality database is proposed for Facial Pose Analysis (M2FPA), including face frontalization, face rotation, facial pose estimation and pose-invariant face recognition.
no code implementations • 30 Mar 2019 • Pei-Pei Li, Huaibo Huang, Yibo Hu, Xiang Wu, Ran He, Zhenan Sun
UVA is the first attempt to achieve facial age analysis tasks, including age translation, age generation and age estimation, in a universal framework.
no code implementations • 28 Mar 2019 • Chaoyou Fu, Yibo Hu, Xiang Wu, Guoli Wang, Qian Zhang, Ran He
Furthermore, due to the lack of high-resolution face manipulation databases to verify the effectiveness of our method, we collect a new high-quality Multi-View Face (MVF-HQ) database.
1 code implementation • 25 Mar 2019 • Chaoyou Fu, Xiang Wu, Yibo Hu, Huaibo Huang, Ran He
Then, to ensure the identity consistency of the generated paired heterogeneous images, we impose distribution alignment in the latent space and pairwise identity preservation in the image space.
Ranked #1 on Face Verification on CASIA NIR-VIS 2.0
no code implementations • 10 Feb 2019 • Ran He, Jie Cao, Lingxiao Song, Zhenan Sun, Tieniu Tan
This paper models high-resolution heterogeneous face synthesis as a complementary combination of two components: a texture inpainting component and a pose correction component.
1 code implementation • 31 Jan 2019 • Caiyong Wang, Yuhao Zhu, Yunfan Liu, Ran He, Zhenan Sun
In this paper, we propose a deep multi-task learning framework, named IrisParseNet, to exploit the inherent correlations between pupil, iris and sclera to boost the performance of iris segmentation and localization in a unified model.
Ranked #1 on Iris Segmentation on CASIA
no code implementations • 26 Dec 2018 • Xin Zheng, Yanqing Guo, Huaibo Huang, Yi Li, Ran He
Deep learning based facial attribute analysis consists of two basic sub-issues: facial attribute estimation (FAE), which recognizes whether facial attributes are present in given images, and facial attribute manipulation (FAM), which synthesizes or removes desired facial attributes.
no code implementations • 17 Dec 2018 • Hao Zhu, Huaibo Huang, Yi Li, Aihua Zheng, Ran He
Talking face generation aims to synthesize a face video with precise lip synchronization as well as a smooth transition of facial motion over the entire video via the given speech clip and facial image.
no code implementations • 20 Sep 2018 • Pei-Pei Li, Yibo Hu, Ran He, Zhenan Sun
Inspired by the biological evolutionary mechanism, we propose a Coupled Evolutionary Network (CEN) with two concurrent evolutionary processes: evolutionary label distribution learning and evolutionary slack regression.
no code implementations • 20 Sep 2018 • Pei-Pei Li, Yibo Hu, Ran He, Zhenan Sun
Moreover, to achieve accurate age generation under the premise of preserving the identity information, an age estimation network and a face verification network are employed.
no code implementations • 9 Sep 2018 • Linsen Song, Jie Cao, Linxiao Song, Yibo Hu, Ran He
Extensive experimental results qualitatively and quantitatively demonstrate that our network is able to generate visually pleasing face completion results and edit face attributes as well.
no code implementations • 7 Sep 2018 • Chaoyou Fu, Liangchen Song, Xiang Wu, Guoli Wang, Ran He
It generates hashing bits by the output neurons of a deep hashing network.
no code implementations • 6 Sep 2018 • Xiang Wu, Huaibo Huang, Vishal M. Patel, Ran He, Zhenan Sun
Visible (VIS) to near infrared (NIR) face matching is a challenging problem due to the significant discrepancy between the two domains and a lack of sufficient data for training cross-modal matching algorithms.
Ranked #2 on Face Verification on CASIA NIR-VIS 2.0
3 code implementations • NeurIPS 2018 • Huaibo Huang, Zhihang Li, Ran He, Zhenan Sun, Tieniu Tan
On the other hand, the inference model is encouraged to classify between the generated and real samples while the generator tries to fool it, as in GANs.
no code implementations • 11 Jul 2018 • Huaibo Huang, Lingxiao Song, Ran He, Zhenan Sun, Tieniu Tan
Variational capsules model an image as a composition of entities in a probabilistic model.
no code implementations • NeurIPS 2018 • Jie Cao, Yibo Hu, Hongwen Zhang, Ran He, Zhenan Sun
We decompose the prerequisite of warping into dense correspondence field estimation and facial texture map recovery, both of which are well addressed by deep networks.
no code implementations • CVPR 2018 • Yibo Hu, Xiang Wu, Bing Yu, Ran He, Zhenan Sun
Face rotation provides an effective and cheap way for data augmentation and representation learning of face recognition.
no code implementations • 21 Feb 2018 • Jie Cao, Yibo Hu, Bing Yu, Ran He, Zhenan Sun
Multi-view face synthesis from a single image is an ill-posed problem and often suffers from serious appearance distortion.
no code implementations • 25 Jan 2018 • Pei-Pei Li, Yibo Hu, Qi Li, Ran He, Zhenan Sun
To utilize both global and local facial information, we propose a Global and Local Consistent Age Generative Adversarial Network (GLCA-GAN).
no code implementations • 13 Dec 2017 • Zhihang Li, Yibo Hu, Ran He
We treat the face completion and corruption as disentangling and fusing processes of clean faces and occlusions, and propose a jointly disentangling and fusing Generative Adversarial Network (DF-GAN).
no code implementations • 10 Dec 2017 • Lingxiao Song, Zhihe Lu, Ran He, Zhenan Sun, Tieniu Tan
An expression invariant face recognition experiment is also performed to further show the advantages of our proposed method.
no code implementations • ICCV 2017 • Huaibo Huang, Ran He, Zhenan Sun, Tieniu Tan
Most modern face super-resolution methods resort to convolutional neural networks (CNN) to infer high-resolution (HR) face images.
Ranked #3 on Face Hallucination on FFHQ 512 x 512 - 16x upscaling
1 code implementation • 15 Sep 2017 • Yujia Chen, Lingxiao Song, Ran He
This paper introduces an Adversarial Occlusion-aware Face Detector (AOFD) by simultaneously detecting occluded faces and segmenting occluded areas.
Ranked #2 on Occluded Face Detection on MAFA
no code implementations • 12 Sep 2017 • Lingxiao Song, Man Zhang, Xiang Wu, Ran He
This framework integrates cross-spectral face hallucination and discriminative feature learning into an end-to-end adversarial network.
no code implementations • 12 Sep 2017 • Nan Xu, Yanqing Guo, Jiujun Wang, Xiangyang Luo, Ran He
In this method, we use the subspace representations of different views to adaptively learn a consensus similarity matrix, uncovering the subspace structure and avoiding the noisy nature of the original data.
no code implementations • 12 Sep 2017 • Yi Li, Lingxiao Song, Xiang Wu, Ran He, Tieniu Tan
This paper proposes a learning-from-generation approach for makeup-invariant face verification by introducing a bi-level adversarial network (BLAN).
no code implementations • 8 Aug 2017 • Ran He, Xiang Wu, Zhenan Sun, Tieniu Tan
To avoid the over-fitting problem on small-scale heterogeneous face data, a correlation prior is introduced on the fully-connected layers of WCNN network to reduce parameter space.
Ranked #3 on Face Verification on BUAA-VisNir
no code implementations • 15 Jun 2017 • Zhihe Lu, Zhihang Li, Jie Cao, Ran He, Zhenan Sun
Face synthesis has been a fascinating yet challenging problem in computer vision and machine learning.
no code implementations • NeurIPS 2017 • Qi Li, Zhenan Sun, Ran He, Tieniu Tan
Benefiting from recent advances in deep learning, deep hashing methods have achieved promising results for image retrieval.
no code implementations • 22 May 2017 • Yanbo Fan, Jian Liang, Ran He, Bao-Gang Hu, Siwei Lyu
In multi-view clustering, different views may have different confidence levels when learning a consensus representation.
3 code implementations • ICCV 2017 • Rui Huang, Shu Zhang, Tianyu Li, Ran He
This paper proposes a Two-Pathway Generative Adversarial Network (TP-GAN) for photorealistic frontal view synthesis by simultaneously perceiving global structures and local details.
no code implementations • 12 Apr 2017 • Yibo Hu, Xiang Wu, Ran He
In this paper, we propose a novel Attention-Set based Metric Learning (ASML) method to measure the statistical characteristics of image sets.
no code implementations • 8 Apr 2017 • Xiang Wu, Lingxiao Song, Ran He, Tieniu Tan
CDL seeks a shared feature space in which the heterogeneous face matching problem can be approximately treated as a homogeneous face matching problem.
no code implementations • 16 Nov 2016 • Shu Zhang, Ran He, Tieniu Tan
The occlusions incurred by random meshes severely degrade the performance of face verification systems, which raises the MeshFace verification problem between MeshFace and daily photos.
no code implementations • 1 Jun 2016 • Yanbo Fan, Ran He, Jian Liang, Bao-Gang Hu
In this paper, we focus on the minimizer function and study a group of new regularizers, named self-paced implicit regularizers, that are deduced from robust loss functions.
no code implementations • 18 Apr 2016 • Linlin Cao, Ran He, Bao-Gang Hu
A new method called locally imposing function (LIF) is proposed to provide a local correction to the GCNN prediction function, which therefore falls within Locally Imposing Scheme (LIS).
no code implementations • 18 Apr 2016 • Yueying Kao, Ran He, Kaiqi Huang
Human beings often assess the aesthetic quality of an image coupled with the identification of the image's semantic content.
Ranked #5 on Aesthetics Quality Assessment on AVA
19 code implementations • 9 Nov 2015 • Xiang Wu, Ran He, Zhenan Sun, Tieniu Tan
This paper presents a Light CNN framework to learn a compact embedding on the large-scale face data with massive noisy labels.
Ranked #2 on Age-Invariant Face Recognition on CAFR
no code implementations • 9 Jul 2015 • Ran He, Tieniu Tan, Larry Davis, Zhenan Sun
This paper presents a structured ordinal measure method for video-based face recognition that simultaneously learns ordinal filters and structured ordinal features.
no code implementations • 28 Nov 2014 • Ran He, Man Zhang, Liang Wang, Ye Ji, Qiyue Yin
For unsupervised learning, we propose a cross-modal subspace clustering method to learn a common structure for different modalities.
no code implementations • 10 Jul 2011 • Bao-Gang Hu, Ran He, Xiaotong Yuan
This work presents a systematic study of objective evaluations of abstaining classifications using Information-Theoretic Measures (ITMs).