no code implementations • 28 Jan 2025 • Mingyuan Li, Tong Jia, Hui Lu, Bowen Ma, Hao Wang, Dongyue Chen
Prohibited item detection based on X-ray images is one of the most effective security inspection methods.
no code implementations • Proceedings of the 32nd ACM International Conference on Multimedia 2024 • Hu Gao, Jing Yang, Ying Zhang, Jingfan Yang, Bowen Ma, Depeng Dang
RWBlock enables LSNB to explore various combination patterns of pairwise NAFBlocks by adaptively re-weighting features.
1 code implementation • 11 Sep 2024 • Jingfan Yang, Hu Gao, Ying Zhang, Bowen Ma, Depeng Dang
We present AFFB and utilize an improved Fast Fourier block to extract repetitive periodic features and long-range information in noisy spacecraft images.
1 code implementation • 22 Jul 2024 • Hanwei Liu, Rudong An, Zhimeng Zhang, Bowen Ma, Wei Zhang, Yan Song, Yujing Hu, Wei Chen, Yu Ding
First, the carefully designed normalization network strives to directly remove the above task-irrelevant noise by maintaining facial expression consistency while normalizing all original images to a common identity with consistent pose and background.
Ranked #1 on Facial Expression Recognition (FER) on DISFA
no code implementations • 16 Jun 2024 • Shuyang Lin, Tong Jia, Hao Wang, Bowen Ma, Mingyuan Li, Dongyue Chen
To address the aforementioned challenges, we introduce the distillation-based open-vocabulary object detection (OVOD) task into the X-ray security inspection domain by extending CLIP to learn visual representations in this domain, aiming to detect novel prohibited item categories beyond the base categories on which the detector is trained.
1 code implementation • 5 Jun 2024 • Mingyuan Li, Tong Jia, Hui Lu, Bowen Ma, Hao Wang, Dongyue Chen
Prohibited item detection in X-ray images is one of the most effective security inspection methods. However, unlike natural-light images, X-ray images exhibit a distinctive overlapping phenomenon that couples foreground and background features, thereby lowering the accuracy of general object detectors. Therefore, we propose a Multi-Class Min-Margin Contrastive Learning (MMCL) method that, by clarifying the category semantic information of content queries under the deformable DETR architecture, helps the model extract category-specific foreground information from coupled features. Specifically, after grouping content queries by the number of categories, we employ the Multi-Class Inter-Class Exclusion (MIE) loss to push apart content queries from different groups.
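The grouping and push-apart idea described above can be illustrated with a small sketch. This is not the paper's MIE loss: the round-robin grouping rule, the cosine-similarity hinge, the `margin` parameter, and the function names `group_queries` and `inter_group_exclusion_penalty` are all assumptions made for illustration only.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def group_queries(num_queries, num_categories):
    """Assign query i to group i % num_categories (hypothetical grouping rule)."""
    return [i % num_categories for i in range(num_queries)]


def inter_group_exclusion_penalty(queries, groups, margin=0.0):
    """Mean hinge penalty on cosine similarity between queries of different groups.

    Each cross-group pair contributes max(0, cos - margin); pairs inside
    the same group are ignored. This is an assumed stand-in, not the MIE loss.
    """
    penalties = [
        max(0.0, cosine(queries[i], queries[j]) - margin)
        for i, j in combinations(range(len(queries)), 2)
        if groups[i] != groups[j]
    ]
    return sum(penalties) / len(penalties) if penalties else 0.0


# Four toy "content queries" split into two category groups.
queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
groups = group_queries(len(queries), num_categories=2)
print(inter_group_exclusion_penalty(queries, groups))
```

In this toy setup, queries from different groups are orthogonal, so the penalty vanishes; making cross-group queries more similar raises it, which is the direction a push-apart objective would penalize.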
1 code implementation • 19 May 2024 • Hu Gao, Bowen Ma, Ying Zhang, Jingfan Yang, Jing Yang, Depeng Dang
SFAM consists of two modules: the spatial domain attention module (SDAM) and the frequency domain attention module (FDAM).
1 code implementation • 7 Mar 2024 • Mingyuan Li, Tong Jia, Hao Wang, Bowen Ma, Shuyang Lin, Da Cai, Dongyue Chen
Considering the significant overlapping phenomenon in X-ray prohibited item images, we propose an Anti-Overlapping DETR (AO-DETR) based on one of the state-of-the-art general object detectors, DINO.
no code implementations • CVPR 2024 • Renshuai Liu, Bowen Ma, Wei Zhang, Zhipeng Hu, Changjie Fan, Tangjie Lv, Yu Ding, Xuan Cheng
We devise a novel diffusion model that can perform face swapping and reenactment simultaneously.
no code implementations • 22 Jun 2023 • Yu Zhang, Hao Zeng, Bowen Ma, Wei Zhang, Zhimeng Zhang, Yu Ding, Tangjie Lv, Changjie Fan
The discriminator is shape-aware and relies on a semantic flow-guided operation to explicitly calculate the shape discrepancies between the target and source faces, thus optimizing the face swapping network to generate highly realistic results.
no code implementations • 1 Apr 2023 • Yifeng Ma, Suzhen Wang, Yu Ding, Bowen Ma, Tangjie Lv, Changjie Fan, Zhipeng Hu, Zhidong Deng, Xin Yu
Leveraging the proposed dataset, we introduce a CLIP-based style encoder that projects natural language-based descriptions to the representations of expressions.
2D Semantic Segmentation task 3 (25 classes)
Talking Head Generation
no code implementations • 20 Mar 2023 • Wei Zhang, Bowen Ma, Feng Qiu, Yu Ding
The CVPR 2023 Competition on Affective Behavior Analysis in-the-wild (ABAW) is dedicated to providing high-quality and large-scale Aff-wild2 for the recognition of commonly used emotion representations, such as Action Units (AU), basic expression categories (EXPR), and Valence-Arousal (VA).
no code implementations • 6 Dec 2022 • Hao Zeng, Wei Zhang, Changjie Fan, Tangjie Lv, Suzhen Wang, Zhimeng Zhang, Bowen Ma, Lincheng Li, Yu Ding, Xin Yu
Unlike most previous methods, which focus on transferring the source's inner facial features but neglect facial contours, our FlowFace can transfer both to a target face, leading to more realistic face swapping.
no code implementations • 28 Oct 2022 • Bowen Ma, Rudong An, Wei Zhang, Yu Ding, Zeng Zhao, Rongsheng Zhang, Tangjie Lv, Changjie Fan, Zhipeng Hu
As a fine-grained and local expression behavior measurement, facial action unit (FAU) analysis (e.g., detection and intensity estimation) has been documented for its time-consuming, labor-intensive, and error-prone annotation.
no code implementations • 23 Mar 2022 • Wei Zhang, Feng Qiu, Suzhen Wang, Hao Zeng, Zhimeng Zhang, Rudong An, Bowen Ma, Yu Ding
Then, we introduce a transformer-based fusion module that integrates the static vision features and the dynamic multimodal features.
1 code implementation • 28 Nov 2021 • Bowen Ma, Chengzhi Zhang, Yuzhuo Wang, Sanhong Deng
In research on identifying the structure function of chapters in academic articles, only a few studies have used deep learning models and explored optimizing the feature input.