Search Results for author: Zitong Yu

Found 68 papers, 31 papers with code

FusionMamba: Dynamic Feature Enhancement for Multimodal Image Fusion with Mamba

no code implementations 15 Apr 2024 Xinyu Xie, Yawen Cui, Chio-in Ieong, Tao Tan, Xiaozhi Zhang, Xubin Zheng, Zitong Yu

In this paper, we propose FusionMamba, a novel dynamic feature enhancement method for multimodal image fusion with Mamba.

Infrared And Visible Image Fusion

Multi-modal Document Presentation Attack Detection With Forensics Trace Disentanglement

no code implementations 10 Apr 2024 Changsheng Chen, Yongyi Deng, Liangwei Lin, Zitong Yu, Zhimao Lai

Document Presentation Attack Detection (DPAD) is an important measure in protecting the authenticity of a document image.


Safeguarding Medical Image Segmentation Datasets against Unauthorized Training via Contour- and Texture-Aware Perturbations

no code implementations 21 Mar 2024 Xun Lin, Yi Yu, Song Xia, Jue Jiang, Haoran Wang, Zitong Yu, Yizhong Liu, Ying Fu, Shuai Wang, Wenzhong Tang, Alex Kot

This is particularly true for medical image segmentation (MIS) datasets, where the processes of collection and fine-grained annotation are time-intensive and laborious.

Image Classification Image Generation +4

Answering Diverse Questions via Text Attached with Key Audio-Visual Clues

no code implementations 11 Mar 2024 Qilang Ye, Zitong Yu, Xin Liu

Audio-visual question answering (AVQA) requires reference to video content and auditory information, followed by correlating the question to predict the most precise answer.

Audio-Visual Question Answering (AVQA) +3

AUFormer: Vision Transformers are Parameter-Efficient Facial Action Unit Detectors

1 code implementation 7 Mar 2024 Kaishen Yuan, Zitong Yu, Xin Liu, Weicheng Xie, Huanjing Yue, Jingyu Yang

Facial Action Units (AU) is a vital concept in the realm of affective computing, and AU detection has always been a hot research topic.

Facial Action Unit Detection Transfer Learning

A Simple yet Effective Network based on Vision Transformer for Camouflaged Object and Salient Object Detection

no code implementations 29 Feb 2024 Chao Hao, Zitong Yu, Xin Liu, Jun Xu, Huanjing Yue, Jingyu Yang

Camouflaged object detection (COD) and salient object detection (SOD) are two distinct yet closely-related computer vision tasks widely studied during the past decades.

Object object-detection +2

GenFace: A Large-Scale Fine-Grained Face Forgery Benchmark and Cross Appearance-Edge Learning

no code implementations 3 Feb 2024 Yaning Zhang, Zitong Yu, Xiaobin Huang, Linlin Shen, Jianfeng Ren

In this paper, we propose a large-scale, diverse, and fine-grained high-fidelity dataset, namely GenFace, to facilitate the advancement of deepfake detection, which contains a large number of forgery faces generated by advanced generators such as the diffusion-based model and more detailed labels about the manipulation approaches and adopted generators.

Benchmarking DeepFake Detection +1

Exposing Image Splicing Traces in Scientific Publications via Uncertainty-guided Refinement

1 code implementation 28 Sep 2023 Xun Lin, Wenzhong Tang, Haoran Wang, Yizhong Liu, Yakun Ju, Shuai Wang, Zitong Yu

Compared to image duplication and synthesis, image splicing detection is more challenging due to the lack of reference images and the typically small tampered areas.

Image Forensics Image Generation +1

Hyperbolic Face Anti-Spoofing

no code implementations 17 Aug 2023 Shuangpeng Han, Rizhao Cai, Yawen Cui, Zitong Yu, Yongjian Hu, Alex Kot

To further improve generalization, we conduct hyperbolic contrastive learning for the bonafide only while relaxing the constraints on diverse spoofing attacks.

Contrastive Learning Face Anti-Spoofing +1

Visual Prompt Flexible-Modal Face Anti-Spoofing

no code implementations 26 Jul 2023 Zitong Yu, Rizhao Cai, Yawen Cui, Ajian Liu, Changsheng Chen

Recently, vision transformer based multimodal learning methods have been proposed to improve the robustness of face anti-spoofing (FAS) systems.

Face Anti-Spoofing

Detect Any Deepfakes: Segment Anything Meets Face Forgery Detection and Localization

1 code implementation 29 Jun 2023 Yingxin Lai, Zhiming Luo, Zitong Yu

The rapid advancements in computer vision have stimulated remarkable progress in face forgery techniques, capturing the dedicated attention of researchers committed to detecting forgeries and precisely localizing manipulated areas.

DeepFake Detection Face Swapping

DEMIST: A deep-learning-based task-specific denoising approach for myocardial perfusion SPECT

no code implementations 7 Jun 2023 Md Ashequr Rahman, Zitong Yu, Richard Laforest, Craig K. Abbey, Barry A. Siegel, Abhinav K. Jha

There is an important need for methods to process myocardial perfusion imaging (MPI) SPECT images acquired at lower radiation dose and/or acquisition time such that the processed images improve observer performance on the clinical task of detecting perfusion defects.


rPPG-MAE: Self-supervised Pre-training with Masked Autoencoders for Remote Physiological Measurement

1 code implementation 4 Jun 2023 Xin Liu, Yuting Zhang, Zitong Yu, Hao Lu, Huanjing Yue, Jingyu Yang

However, they focus on the contrastive learning between samples, which neglects the inherent self-similar prior in physiological signals and seems to have a limited ability to cope with noise.

Contrastive Learning

Rehearsal-Free Domain Continual Face Anti-Spoofing: Generalize More and Forget Less

no code implementations ICCV 2023 Rizhao Cai, Yawen Cui, Zhi Li, Zitong Yu, Haoliang Li, Yongjian Hu, Alex Kot

To alleviate the forgetting of previous domains without using previous data, we propose the Proxy Prototype Contrastive Regularization (PPCR) to constrain the continual learning with previous domain knowledge from the proxy prototypes.

Continual Learning Domain Generalization +1

Neuron Structure Modeling for Generalizable Remote Physiological Measurement

1 code implementation CVPR 2023 Hao Lu, Zitong Yu, Xuesong Niu, Yingcong Chen

We show that most domain generalization methods do not work well in this problem, as domain labels are ambiguous in complicated environmental changes.

Domain Generalization

Audio-Visual Deception Detection: DOLOS Dataset and Parameter-Efficient Crossmodal Learning

1 code implementation ICCV 2023 Xiaobao Guo, Nithish Muthuchamy Selvaraj, Zitong Yu, Adams Wai-Kin Kong, Bingquan Shen, Alex Kot

Despite this, deception detection research is hindered by the lack of high-quality deception datasets, as well as the difficulties of learning multimodal features effectively.

Deception Detection Multi-Task Learning

Need for Objective Task-based Evaluation of Deep Learning-Based Denoising Methods: A Study in the Context of Myocardial Perfusion SPECT

no code implementations 3 Mar 2023 Zitong Yu, Md Ashequr Rahman, Richard Laforest, Thomas H. Schindler, Robert J. Gropler, Richard L. Wahl, Barry A. Siegel, Abhinav K. Jha

Our objectives were to (1) investigate whether evaluation with these FoMs is consistent with objective clinical-task-based evaluation; (2) provide a theoretical analysis for determining the impact of denoising on signal-detection tasks; (3) demonstrate the utility of virtual clinical trials (VCTs) to evaluate DL-based methods.

Denoising SSIM

A task-specific deep-learning-based denoising approach for myocardial perfusion SPECT

no code implementations 1 Mar 2023 Md Ashequr Rahman, Zitong Yu, Barry A. Siegel, Abhinav K. Jha

However, while promising, studies have shown that these methods may have limited impact on the performance of clinical tasks in SPECT.


Generalized Few-Shot Continual Learning with Contrastive Mixture of Adapters

1 code implementation 12 Feb 2023 Yawen Cui, Zitong Yu, Rizhao Cai, Xun Wang, Alex C. Kot, Li Liu

The goal of Few-Shot Continual Learning (FSCL) is to incrementally learn novel tasks with limited labeled samples and preserve previous capabilities simultaneously, while current FSCL methods are all for the class-incremental purpose.

Continual Learning Contrastive Learning +2

Flexible-modal Deception Detection with Audio-Visual Adapter

no code implementations 11 Feb 2023 Zhaoxu Li, Zitong Yu, Nithish Muthuchamy Selvaraj, Xiaobao Guo, Bingquan Shen, Adams Wai-Kin Kong, Alex Kot

Detecting deception by human behaviors is vital in many fields such as custom security and multimedia anti-fraud.

Deception Detection

Rethinking Vision Transformer and Masked Autoencoder in Multimodal Face Anti-Spoofing

no code implementations 11 Feb 2023 Zitong Yu, Rizhao Cai, Yawen Cui, Xin Liu, Yongjian Hu, Alex Kot

In this paper, we investigate three key factors (i.e., inputs, pre-training, and finetuning) in ViT for multimodal FAS with RGB, Infrared (IR), and Depth.

Face Anti-Spoofing

PhysFormer++: Facial Video-based Physiological Measurement with SlowFast Temporal Difference Transformer

no code implementations 7 Feb 2023 Zitong Yu, Yuming Shen, Jingang Shi, Hengshuang Zhao, Yawen Cui, Jiehua Zhang, Philip Torr, Guoying Zhao

As key modules in PhysFormer, the temporal difference transformers first enhance the quasi-periodic rPPG features with temporal difference guided global attention, and then refine the local spatio-temporal representation against interference.

Face Presentation Attack Detection

no code implementations 7 Dec 2022 Zitong Yu, Chenxu Zhao, Zhen Lei

Face recognition technology has been widely used in daily interactive applications such as checking-in and mobile payment due to its convenience and high accuracy.

Face Anti-Spoofing Face Presentation Attack Detection +1

Learning Motion-Robust Remote Photoplethysmography through Arbitrary Resolution Videos

1 code implementation 30 Nov 2022 Jianwei Li, Zitong Yu, Jingang Shi

Remote photoplethysmography (rPPG) enables non-contact heart rate (HR) estimation from facial videos which gives significant convenience compared with traditional contact-based measurements.

Face Alignment Optical Flow Estimation

Boosting Binary Neural Networks via Dynamic Thresholds Learning

no code implementations 4 Nov 2022 Jiehua Zhang, Xueyang Zhang, Zhuo Su, Zitong Yu, Yanghe Feng, Xin Lu, Matti Pietikäinen, Li Liu

For ViTs, DyBinaryCCT presents the superiority of the convolutional embedding layer in fully binarized ViTs and achieves 56.1% on the ImageNet dataset, which is nearly 9% higher than the baseline.


Forensicability Assessment of Questioned Images in Recapturing Detection

no code implementations 5 Sep 2022 Changsheng Chen, Lin Zhao, Rizhao Cai, Zitong Yu, Jiwu Huang, Alex C. Kot

We integrate the trained FANet with practical recapturing detection schemes in face anti-spoofing and recaptured document detection tasks.

Face Anti-Spoofing Image Quality Assessment

Benchmarking Joint Face Spoofing and Forgery Detection with Visual and Physiological Cues

no code implementations 10 Aug 2022 Zitong Yu, Rizhao Cai, Zhi Li, Wenhan Yang, Jingang Shi, Alex C. Kot

In this paper, we establish the first joint face spoofing and forgery detection benchmark using both visual appearance and physiological rPPG cues.

Benchmarking DeepFake Detection +3

Rethinking Few-Shot Class-Incremental Learning with Open-Set Hypothesis in Hyperbolic Geometry

no code implementations 20 Jul 2022 Yawen Cui, Zitong Yu, Wei Peng, Li Liu

Few-Shot Class-Incremental Learning (FSCIL) aims at incrementally learning novel classes from a few labeled samples by avoiding the overfitting and catastrophic forgetting simultaneously.

Few-Shot Class-Incremental Learning Incremental Learning +2

Domain Generalization via Shuffled Style Assembly for Face Anti-Spoofing

1 code implementation CVPR 2022 Zhuo Wang, Zezheng Wang, Zitong Yu, Weihong Deng, Jiahong Li, Tingting Gao, Zhongyuan Wang

A novel Shuffled Style Assembly Network (SSAN) is proposed to extract and reassemble different content and style features for a stylized feature space.

Contrastive Learning Domain Generalization +1

ViTransPAD: Video Transformer using convolution and self-attention for Face Presentation Attack Detection

no code implementations 3 Mar 2022 Zuheng Ming, Zitong Yu, Musab Al-Ghadi, Muriel Visani, Muhammad Muzzamil Luqman, Jean-Christophe Burie

Instead of using coarse image patches with single-scale as in ViT, we propose the Multi-scale Multi-Head Self-Attention (MsMHSA) architecture to accommodate multi-scale patch partitions of Q, K, V feature maps to the heads of transformer in a coarse-to-fine manner, which enables to learn a fine-grained representation to perform pixel-level discrimination for face PAD.

Binary Classification Face Presentation Attack Detection

Investigating the limited performance of a deep-learning-based SPECT denoising approach: An observer-study-based characterization

no code implementations 3 Mar 2022 Zitong Yu, Md Ashequr Rahman, Abhinav K. Jha

To achieve this goal, we conducted a task-based characterization of a DL-based denoising approach for individual signal properties.


Review of Face Presentation Attack Detection Competitions

no code implementations 21 Dec 2021 Zitong Yu, Jukka Komulainen, Xiaobai Li, Guoying Zhao

Face presentation attack detection (PAD) has received increasing attention ever since the vulnerabilities to spoofing have been widely recognized.

Face Anti-Spoofing Face Presentation Attack Detection

Geometry-Contrastive Transformer for Generalized 3D Pose Transfer

1 code implementation 14 Dec 2021 Haoyu Chen, Hao Tang, Zitong Yu, Nicu Sebe, Guoying Zhao

Specifically, we propose a novel geometry-contrastive Transformer that has an efficient 3D structured perceiving ability to the global geometric inconsistencies across the given meshes.

Pose Transfer

PhysFormer: Facial Video-based Physiological Measurement with Temporal Difference Transformer

1 code implementation CVPR 2022 Zitong Yu, Yuming Shen, Jingang Shi, Hengshuang Zhao, Philip Torr, Guoying Zhao

Remote photoplethysmography (rPPG), which aims at measuring heart activities and physiological signals from facial video without any contact, has great potential in many applications (e.g., remote healthcare and affective computing).

Meta-Teacher For Face Anti-Spoofing

no code implementations 12 Nov 2021 Yunxiao Qin, Zitong Yu, Longbin Yan, Zezheng Wang, Chenxu Zhao, Zhen Lei

The meta-teacher is trained in a bi-level optimization manner to learn the ability to supervise the PA detectors learning rich spoofing cues.

Face Anti-Spoofing Face Recognition

Pixel Difference Networks for Efficient Edge Detection

2 code implementations ICCV 2021 Zhuo Su, Wenzhe Liu, Zitong Yu, Dewen Hu, Qing Liao, Qi Tian, Matti Pietikäinen, Li Liu

A faster version of PiDiNet with less than 0.1M parameters can still achieve comparable performance among state of the arts with 200 FPS.

Edge Detection

iMiGUE: An Identity-free Video Dataset for Micro-Gesture Understanding and Emotion Analysis

1 code implementation CVPR 2021 Xin Liu, Henglin Shi, Haoyu Chen, Zitong Yu, Xiaobai Li, Guoying Zhao

We introduce a new dataset for the emotional artificial intelligence research: identity-free video dataset for Micro-Gesture Understanding and Emotion analysis (iMiGUE).

Emotion Recognition

Deep Learning for Face Anti-Spoofing: A Survey

3 code implementations 28 Jun 2021 Zitong Yu, Yunxiao Qin, Xiaobai Li, Chenxu Zhao, Zhen Lei, Guoying Zhao

Face anti-spoofing (FAS) has lately attracted increasing attention due to its vital role in securing face recognition systems from presentation attacks (PAs).

Domain Generalization Face Anti-Spoofing +1

Non-contact Pain Recognition from Video Sequences with Remote Physiological Measurements Prediction

no code implementations 18 May 2021 Ruijing Yang, Ziyu Guan, Zitong Yu, Xiaoyi Feng, Jinye Peng, Guoying Zhao

The framework is able to capture both local and long-range dependencies via the proposed attention mechanism for the learned appearance representations, which are further enriched by temporally attended physiological cues (remote photoplethysmography, rPPG) that are recovered from videos in the auxiliary task.

Medical Diagnosis Multi-Task Learning

Dual-Cross Central Difference Network for Face Anti-Spoofing

1 code implementation 4 May 2021 Zitong Yu, Yunxiao Qin, Hengshuang Zhao, Xiaobai Li, Guoying Zhao

In this paper, we propose two Cross Central Difference Convolutions (C-CDC), which exploit the difference of the center and surround sparse local features from the horizontal/vertical and diagonal directions, respectively.

Face Anti-Spoofing Face Recognition

TransRPPG: Remote Photoplethysmography Transformer for 3D Mask Face Presentation Attack Detection

no code implementations 15 Apr 2021 Zitong Yu, Xiaobai Li, Pichao Wang, Guoying Zhao

3D mask face presentation attack detection (PAD) plays a vital role in securing face recognition systems from emergent 3D mask attacks.

Face Presentation Attack Detection Face Recognition

A physics and learning-based transmission-less attenuation compensation method for SPECT

no code implementations 10 Feb 2021 Zitong Yu, Md Ashequr Rahman, Thomas Schindler, Richard Laforest, Abhinav K. Jha

The proposed method uses data acquired in the scatter window to reconstruct an initial estimate of the attenuation map using a physics-based approach.

Medical Physics

Revisiting Pixel-Wise Supervision for Face Anti-Spoofing

1 code implementation 24 Nov 2020 Zitong Yu, Xiaobai Li, Jingang Shi, Zhaoqiang Xia, Guoying Zhao

Face anti-spoofing (FAS) plays a vital role in securing face recognition systems from the presentation attacks (PAs).

Face Anti-Spoofing Face Recognition

2nd Place Scheme on Action Recognition Track of ECCV 2020 VIPriors Challenges: An Efficient Optical Flow Stream Guided Framework

no code implementations 10 Aug 2020 Haoyu Chen, Zitong Yu, Xin Liu, Wei Peng, Yoon Lee, Guoying Zhao

To address the problem of training on small datasets for action recognition tasks, most prior works are either based on a large number of training samples or require pre-trained models transferred from other large datasets to tackle overfitting problems.

Action Recognition Optical Flow Estimation

Video-based Remote Physiological Measurement via Cross-verified Feature Disentangling

1 code implementation ECCV 2020 Xuesong Niu, Zitong Yu, Hu Han, Xiaobai Li, Shiguang Shan, Guoying Zhao

Remote physiological measurements, e.g., remote photoplethysmography (rPPG) based heart rate (HR), heart rate variability (HRV) and respiration frequency (RF) measuring, are playing more and more important roles under the application scenarios where contact measurement is inconvenient or impossible.

Heart Rate Variability

Face Anti-Spoofing with Human Material Perception

no code implementations ECCV 2020 Zitong Yu, Xiaobai Li, Xuesong Niu, Jingang Shi, Guoying Zhao

In this paper we rephrase face anti-spoofing as a material recognition problem and combine it with classical human material perception [1], intending to extract discriminative and robust features for FAS.

Face Anti-Spoofing Face Recognition +1

AutoHR: A Strong End-to-end Baseline for Remote Heart Rate Measurement with Neural Searching

no code implementations 26 Apr 2020 Zitong Yu, Xiaobai Li, Xuesong Niu, Jingang Shi, Guoying Zhao

Remote photoplethysmography (rPPG), which aims at measuring heart activities without any contact, has great potential in many applications (e.g., remote healthcare).

Data Augmentation Neural Architecture Search +1

Multi-Modal Face Anti-Spoofing Based on Central Difference Networks

1 code implementation 17 Apr 2020 Zitong Yu, Yunxiao Qin, Xiaobai Li, Zezheng Wang, Chenxu Zhao, Zhen Lei, Guoying Zhao

Face anti-spoofing (FAS) plays a vital role in securing face recognition systems from presentation attacks.

Face Anti-Spoofing Face Recognition

Searching Central Difference Convolutional Networks for Face Anti-Spoofing

5 code implementations CVPR 2020 Zitong Yu, Chenxu Zhao, Zezheng Wang, Yunxiao Qin, Zhuo Su, Xiaobai Li, Feng Zhou, Guoying Zhao

Here we propose a novel frame level FAS method based on Central Difference Convolution (CDC), which is able to capture intrinsic detailed patterns via aggregating both intensity and gradient information.

Face Anti-Spoofing Face Recognition +1

Remote Photoplethysmograph Signal Measurement from Facial Videos Using Spatio-Temporal Networks

2 code implementations 7 May 2019 Zitong Yu, Xiaobai Li, Guoying Zhao

Recent studies demonstrated that the average heart rate (HR) can be measured from facial videos based on non-contact remote photoplethysmography (rPPG).

Emotion Recognition Heart Rate Variability +1

Pedestrian re-identification based on Tree branch network with local and global learning

no code implementations 31 Mar 2019 Hui Li, Meng Yang, Zhihui Lai, Wei-Shi Zheng, Zitong Yu

Deep part-based methods in recent literature have revealed the great potential of learning local part-level representation for pedestrian image in the task of person re-identification.

Person Re-Identification
