no code implementations • 10 May 2025 • Dawei Huang, Qing Li, Chuan Yan, Zebang Cheng, Yurong Huang, Xiang Li, Bin Li, Xiaohui Wang, Zheng Lian, Xiaojiang Peng
While Large Multimodal Models (LMMs) have demonstrated significant progress in general vision-language (VL) tasks, their performance in emotion-specific scenarios remains limited.
1 code implementation • 10 Apr 2025 • Yuxiang Lin, Jingdong Sun, Zhi-Qi Cheng, Jue Wang, Haomin Liang, Zebang Cheng, Yifei Dong, Jun-Yan He, Xiaojiang Peng, Xian-Sheng Hua
Most existing emotion analysis emphasizes which emotion arises (e.g., happy, sad, angry) but neglects the deeper why.
no code implementations • 16 Mar 2025 • Zhaopan Xu, Pengfei Zhou, Jiaxin Ai, Wangbo Zhao, Kai Wang, Xiaojiang Peng, Wenqi Shao, Hongxun Yao, Kaipeng Zhang
Reasoning is an essential capacity for large language models (LLMs) to address complex tasks, where the identification of process errors is vital for improving this ability.
no code implementations • 16 Mar 2025 • Zhaopan Xu, Pengfei Zhou, Weidong Tang, Jiaxin Ai, Wangbo Zhao, Xiaojiang Peng, Kai Wang, Yang You, Wenqi Shao, Hongxun Yao, Kaipeng Zhang
In recent years, Multimodal Large Language Models (MLLMs) have demonstrated remarkable advancements in tasks such as visual question answering, visual understanding, and reasoning.
1 code implementation • 17 Dec 2024 • Mingjia Shi, Yuhao Zhou, Ruiji Yu, Zekai Li, Zhiyuan Liang, Xuanlei Zhao, Xiaojiang Peng, Tanmay Rajpurohit, Shanmukha Ramakrishna Vedantam, Wangbo Zhao, Kai Wang, Yang You
Re-training the token-reduced model enhances the performance of Mamba by effectively rebuilding the key knowledge.
no code implementations • 12 Nov 2024 • Yilun Zheng, Xiang Li, Sitao Luan, Xiaojiang Peng, Lihui Chen
In prior studies, to assess the impacts of graph convolution on features, people proposed metrics based on feature homophily to measure feature consistency with the graph topology.
no code implementations • 12 Nov 2024 • Yilun Zheng, Zhuofan Zhang, ZiMing Wang, Xiang Li, Sitao Luan, Xiaojiang Peng, Lihui Chen
Surprisingly, our empirical observations and theoretical analysis show that no matter which type of graph structure construction method is used, after feeding the same GSL bases to the newly constructed graph, there is no MI gain compared to the original GSL bases.
1 code implementation • 8 Sep 2024 • Xinran Li, Xiaomao Fan, Qingyang Wu, Xiaojiang Peng, Ye Li
MaTAV has the advantages of aligning unimodal features to ensure consistency across different modalities and handling long input sequences to better capture contextual multimodal information.
no code implementations • 2 Sep 2024 • Xiaolong Wang, Zhi-Qi Cheng, Jue Wang, Xiaojiang Peng
To address these challenges, we introduce a new multimodal fashion image editing architecture based on latent diffusion models, called Detail-Preserved Diffusion Models (DPDEdit).
no code implementations • 1 Sep 2024 • Fuqiang Niu, Zebang Cheng, Xianghua Fu, Xiaojiang Peng, Genan Dai, Yin Chen, Hu Huang, BoWen Zhang
To address this, we introduce a new multimodal multi-turn conversational stance detection dataset (called MmMtCSD).
1 code implementation • 22 Aug 2024 • Jue Wang, Yuxiang Lin, Tianshuo Yuan, Zhi-Qi Cheng, Xiaolong Wang, Jiao GH, Wei Chen, Xiaojiang Peng
Our approach employs a VLLM to comprehend the image content, mask, and user instructions.
1 code implementation • 20 Aug 2024 • Zebang Cheng, Shuyuan Tu, Dawei Huang, Minghan Li, Xiaojiang Peng, Zhi-Qi Cheng, Alexander G. Hauptmann
This paper presents our winning approach for the MER-NOISE and MER-OV tracks of the MER2024 Challenge on multimodal emotion recognition.
1 code implementation • 4 Jul 2024 • Jinsong Shi, Pan Gao, Xiaojiang Peng, Jie Qin
It applies cut and mix operations to diverse categories of synthetic distorted images, assigning confidence scores to class labels based on the aforementioned prior knowledge.
1 code implementation • 17 Jun 2024 • Zebang Cheng, Zhi-Qi Cheng, Jun-Yan He, Jingdong Sun, Kai Wang, Yuxiang Lin, Zheng Lian, Xiaojiang Peng, Alexander Hauptmann
Accurate emotion perception is crucial for various applications, including human-computer interaction, education, and counseling.
1 code implementation • 28 May 2024 • Ziheng Qin, Zhaopan Xu, Yukun Zhou, Zangwei Zheng, Zebang Cheng, Hao Tang, Lei Shang, Baigui Sun, Xiaojiang Peng, Radu Timofte, Hongxun Yao, Kai Wang, Yang You
To tackle this challenge, we propose InfoGrowth, an efficient online algorithm for data cleaning and selection, resulting in a growing dataset that keeps up to date with awareness of cleanliness and diversity.
2 code implementations • 27 May 2024 • Kai Wang, Mingjia Shi, Yukun Zhou, Zekai Li, Zhihang Yuan, Yuzhang Shang, Xiaojiang Peng, Hanwang Zhang, Yang You
Training diffusion models is always a computation-intensive task.
1 code implementation • 29 Apr 2024 • Zhi-Qi Cheng, Xiang Li, Jun-Yan He, Junyao Chen, Xiaomao Fan, Xiaojiang Peng, Alexander G. Hauptmann
Emotional Text-to-Speech (E-TTS) synthesis has garnered significant attention in recent years due to its potential to revolutionize human-computer interaction.
no code implementations • 26 Apr 2024 • Xinpeng Li, Teng Wang, Jian Zhao, Shuyi Mao, Jinbao Wang, Feng Zheng, Xiaojiang Peng, Xuelong Li
Emotion recognition aims to discern the emotional state of subjects within an image, relying on subject-centric and contextual visual cues.
1 code implementation • 23 Apr 2024 • Fan Zhang, Zhi-Qi Cheng, Jian Zhao, Xiaojiang Peng, Xuelong Li
LEAF introduces a hierarchical expression-aware aggregation strategy that operates at three levels: semantic, instance, and category.
Facial Expression Recognition
Facial Expression Recognition (FER)
1 code implementation • 31 Mar 2024 • Zebang Cheng, Fuqiang Niu, Yuxiang Lin, Zhi-Qi Cheng, BoWen Zhang, Xiaojiang Peng
This paper presents our winning submission to Subtask 2 of SemEval 2024 Task 3 on multimodal emotion cause analysis in conversations.
1 code implementation • 26 Mar 2024 • Jue Wang, Yuxiang Lin, Qi Zhao, Dong Luo, Shuaibao Chen, Wei Chen, Xiaojiang Peng
The widespread use of various chemical gases in industrial processes necessitates effective measures to prevent their leakage during transportation and storage, given their high toxicity.
1 code implementation • 17 Mar 2024 • Fuqiang Niu, Min Yang, Ang Li, Baoquan Zhang, Xiaojiang Peng, BoWen Zhang
Previous stance detection studies typically concentrate on evaluating stances within individual instances, thereby exhibiting limitations in effectively modeling multi-party discussions concerning the same specific topic, as naturally transpire in authentic social media interactions.
no code implementations • 22 Feb 2024 • Yifan Duan, Guibin Zhang, Shilong Wang, Xiaojiang Peng, Wang Ziqi, Junyuan Mao, Hao Wu, Xinke Jiang, Kun Wang
Credit card fraud poses a significant threat to the economy.
no code implementations • 14 Jan 2024 • Fan Zhang, Shuyi Mao, Qing Li, Xiaojiang Peng
Comparative evaluations with popular point-based methods on HPoint103 and the public dataset DHP19 demonstrate the dramatic outperformance of our D-CPT.
no code implementations • 14 Jan 2024 • Fan Zhang, Xiaobao Guo, Xiaojiang Peng, Alex Kot
In addition, when compared with the domain disparity existing between face datasets and FER datasets, the divergence between general datasets and FER datasets is more pronounced.
no code implementations • 19 Aug 2023 • Kun Wang, Guohao Li, Shilong Wang, Guibin Zhang, Kai Wang, Yang You, Xiaojiang Peng, Yuxuan Liang, Yang Wang
Although Graph Neural Networks (GNNs) show considerable promise in graph representation learning tasks, they face significant over-fitting and over-smoothing issues as they go deeper, like models in the computer vision realm.
1 code implementation • 12 Apr 2023 • Xinpeng Li, Xiaojiang Peng
Inspired by the growth of lane detection, we propose a rail database and a row-based rail detection method.
1 code implementation • 6 Dec 2022 • Lihua Fu, Haoyue Tian, Xiangping Bryce Zhai, Pan Gao, Xiaojiang Peng
Semantic segmentation usually benefits from global contexts, fine localisation information, multi-scale features, etc.
Ranked #160 on Image Classification on ImageNet (GFLOPs metric)
no code implementations • 12 Nov 2022 • Shuyi Mao, Xinpeng Li, Qingyang Wu, Xiaojiang Peng
Studies have proven that domain bias and label bias exist in different Facial Expression Recognition (FER) datasets, making it hard to improve the performance of a specific dataset by adding other datasets.
1 code implementation • 20 Jul 2022 • Shuyi Mao, Xinpeng Li, Junyao Chen, Xiaojiang Peng
In the Learning from Synthetic Data (LSD) task, facial expression recognition (FER) methods aim to learn the representation of expression from artificially generated data and generalise to real data.
no code implementations • 8 Jul 2022 • Xiaojiang Peng, Xiaomao Fan, Qingyang Wu, Jieyan Zhao, Pan Gao
Moreover, we present a new Coarse-to-fine Deep Smoky vehicle detection (CoDeS) framework for efficient smoky vehicle detection.
1 code implementation • 25 Apr 2022 • Haoyue Tian, Pan Gao, Xiaojiang Peng
In order to solve this problem, we revisit the deformable convolution for video interpolation, which can break the fixed grid restrictions on the kernel region, making the distribution of reference points more suitable for the shape of the object, and thus warp a more accurate interpolation frame.
no code implementations • 28 Jan 2022 • Wei Xue, Xiaojiang Peng
Stereo matching is crucial for binocular stereo vision.
no code implementations • 10 Dec 2021 • Qing Li, Xiaojiang Peng, Chuan Yan, Pan Gao, Qi Hao
In SEN, a student network is kept in a collaborative manner with supervised learning and self-supervised learning, and a teacher network conducts temporal consistency to learn useful representations and ensure the quality of point cloud reconstruction.
1 code implementation • 12 Jul 2021 • Shuyi Mao, Xinqi Fan, Xiaojiang Peng
The paper describes our proposed methodology for the seven basic expression classification track of Affective Behavior Analysis in-the-wild (ABAW) Competition 2021.
1 code implementation • CVPR 2022 • Kai Wang, Shuo Wang, Panpan Zhang, Zhipeng Zhou, Zheng Zhu, Xiaobo Wang, Xiaojiang Peng, Baigui Sun, Hao Li, Yang You
This method adopts a Dynamic Class Pool (DCP) for storing and updating the identities' features dynamically, which could be regarded as a substitute for the FC layer.
Ranked #1 on Face Verification on IJB-C (training dataset metric)
2 code implementations • CVPR 2021 • Zhi Hou, Baosheng Yu, Yu Qiao, Xiaojiang Peng, DaCheng Tao
The proposed method can thus be used to 1) improve the performance of HOI detection, especially for the HOIs with unseen objects; and 2) infer the affordances of novel objects.
Ranked #2 on Affordance Recognition on HICO-DET (Unknown Concepts)
1 code implementation • CVPR 2021 • Zhi Hou, Baosheng Yu, Yu Qiao, Xiaojiang Peng, DaCheng Tao
With the proposed object fabricator, we are able to generate large-scale HOI samples for rare and unseen categories to alleviate the open long-tailed issues in HOI detection.
Ranked #4 on Affordance Recognition on HICO-DET
no code implementations • 8 Mar 2021 • Qing Li, Xiaojiang Peng, Yu Qiao, Qi Hao
The multi-label learning module leverages a memory feature bank and assigns each image with a multi-label vector based on the similarities between the image and feature bank.
no code implementations • 27 Dec 2020 • Hengshun Zhou, Debin Meng, Yuanyuan Zhang, Xiaojiang Peng, Jun Du, Kai Wang, Yu Qiao
Audio-video based emotion recognition aims to classify a given video into basic emotions.
Facial Expression Recognition (FER)
Video Emotion Recognition
no code implementations • 18 Dec 2020 • Kai Wang, Yuxin Gu, Xiaojiang Peng, Panpan Zhang, Baigui Sun, Hao Li
Domain diversities, including inconsistent annotation and varied image collection conditions, inevitably exist among different facial expression recognition (FER) datasets and pose an evident challenge for adapting an FER model trained on one dataset to another.
Facial Expression Recognition
Facial Expression Recognition (FER)
1 code implementation • ECCV 2020 • Jin Ye, Junjun He, Xiaojiang Peng, Wenhao Wu, Yu Qiao
To this end, we propose an Attention-Driven Dynamic Graph Convolutional Network (ADD-GCN) to dynamically generate a specific graph for each image.
Ranked #24 on Multi-Label Classification on MS-COCO
1 code implementation • ECCV 2020 • Xiaojiang Peng, Kai Wang, Zhaoyang Zeng, Qing Li, Jianfei Yang, Yu Qiao
Specifically, this plug-and-play AFM first leverages a group-to-attend module to construct groups and assign attention weights for group-wise samples, and then uses a mixup module with the attention weights to interpolate massive noise-suppressed samples.
4 code implementations • ECCV 2020 • Zhi Hou, Xiaojiang Peng, Yu Qiao, DaCheng Tao
The integration of decomposition and composition enables VCL to share object and verb features among different HOI samples and images, and to generate new interaction samples and new types of HOI, and thus largely alleviates the long-tail distribution problem and benefits low-shot or zero-shot HOI detection.
Ranked #3 on Affordance Recognition on HICO-DET (Unknown Concepts)
no code implementations • 7 Mar 2020 • Wen Wang, Xiaojiang Peng, Yanzhou Su, Yu Qiao, Jian Cheng
Video action anticipation aims to predict future action categories from observed frames.
2 code implementations • CVPR 2020 • Kai Wang, Xiaojiang Peng, Jianfei Yang, Shijian Lu, Yu Qiao
Annotating a qualitative large-scale facial expression dataset is extremely difficult due to the uncertainties caused by ambiguous facial expressions, low-quality facial images, and the subjectiveness of annotators.
Facial Expression Recognition
Facial Expression Recognition (FER)
1 code implementation • 21 Jan 2020 • Wen Wang, Xiaojiang Peng, Yu Qiao, Jian Cheng
Online action detection (OAD) is a practical yet challenging task, which has attracted increasing attention in recent years.
no code implementations • 28 Sep 2019 • Qing Li, Xiaojiang Peng, Yu Qiao, Qiang Peng
In this paper, instead of using a pre-defined graph which is inflexible and may be sub-optimal for multi-label classification, we propose the A-GCN, which leverages the popular Graph Convolutional Networks with an Adaptive label correlation graph to model label dependencies.
no code implementations • 26 Jul 2019 • Qing Li, Xiaojiang Peng, Liangliang Cao, Wenbin Du, Hao Xing, Yu Qiao
Instead of collecting product images by labor- and time-intensive image capturing, we take advantage of the web and download images from the reviews of several e-commerce websites where the images are casually captured by consumers.
no code implementations • 8 Jul 2019 • Kai Wang, Jianfei Yang, Da Guo, Kaipeng Zhang, Xiaojiang Peng, Yu Qiao
Based on our winning solution from last year, we mainly explore head features and body features with a bootstrap strategy and two novel loss functions in this paper.
2 code implementations • 29 Jun 2019 • Debin Meng, Xiaojiang Peng, Kai Wang, Yu Qiao
The feature embedding module is a deep Convolutional Neural Network (CNN) which embeds face images into feature vectors.
Ranked #3 on Facial Expression Recognition (FER) on CK+ (Accuracy (7 emotion) metric)
Facial Expression Recognition
Facial Expression Recognition (FER)
1 code implementation • 10 May 2019 • Kai Wang, Xiaojiang Peng, Jianfei Yang, Debin Meng, Yu Qiao
Extensive experiments show that our RAN and region biased loss largely improve the performance of FER with occlusion and variant pose.
Ranked #2 on Facial Expression Recognition (FER) on SFEW
Facial Expression Recognition
Facial Expression Recognition (FER)
no code implementations • European Conference on Computer Vision (ECCV) 2016 • Xiaojiang Peng, Cordelia Schmid
We propose a multi-region two-stream R-CNN model for action detection in realistic videos.
Ranked #2 on Action Detection on UCF Sports
no code implementations • 21 Mar 2016 • Guosheng Hu, Xiaojiang Peng, Yongxin Yang, Timothy Hospedales, Jakob Verbeek
To train such networks, very large training sets are needed with millions of labeled images.
no code implementations • CVPR 2014 • Zhuowei Cai, Li-Min Wang, Xiaojiang Peng, Yu Qiao
Kernel averaging is then applied to these components to produce the recognition result.
no code implementations • 18 May 2014 • Xiaojiang Peng, Li-Min Wang, Xingxing Wang, Yu Qiao
Many efforts have been made in each step independently in different scenarios, and their effect on action recognition is still unknown.
no code implementations • 2 Sep 2013 • Xiaojiang Peng, Qiang Peng, Yu Qiao, Junzhou Chen, Mehtab Afzal
Many efforts have been devoted to developing alternatives to traditional vector quantization in the image domain, such as sparse coding and soft-assignment.