Search Results for author: Xiaojiang Peng

Found 57 papers, 29 papers with code

Emotion-Qwen: Training Hybrid Experts for Unified Emotion and General Vision-Language Understanding

no code implementations • 10 May 2025 • Dawei Huang, Qing Li, Chuan Yan, Zebang Cheng, Yurong Huang, Xiang Li, Bin Li, Xiaohui Wang, Zheng Lian, Xiaojiang Peng

While Large Multimodal Models (LMMs) have demonstrated significant progress in general vision-language (VL) tasks, their performance in emotion-specific scenarios remains limited.

Descriptive Emotion Recognition +1

MPBench: A Comprehensive Multimodal Reasoning Benchmark for Process Errors Identification

no code implementations • 16 Mar 2025 • Zhaopan Xu, Pengfei Zhou, Jiaxin Ai, Wangbo Zhao, Kai Wang, Xiaojiang Peng, Wenqi Shao, Hongxun Yao, Kaipeng Zhang

Reasoning is an essential capacity for large language models (LLMs) to address complex tasks, where the identification of process errors is vital for improving this ability.

Multimodal Reasoning

PEBench: A Fictitious Dataset to Benchmark Machine Unlearning for Multimodal Large Language Models

no code implementations • 16 Mar 2025 • Zhaopan Xu, Pengfei Zhou, Weidong Tang, Jiaxin Ai, Wangbo Zhao, Xiaojiang Peng, Kai Wang, Yang You, Wenqi Shao, Hongxun Yao, Kaipeng Zhang

In recent years, Multimodal Large Language Models (MLLMs) have demonstrated remarkable advancements in tasks such as visual question answering, visual understanding, and reasoning.

Machine Unlearning • Privacy Preserving +2

Is Graph Convolution Always Beneficial For Every Feature?

no code implementations • 12 Nov 2024 • Yilun Zheng, Xiang Li, Sitao Luan, Xiaojiang Peng, Lihui Chen

In prior studies, to assess the impacts of graph convolution on features, people proposed metrics based on feature homophily to measure feature consistency with the graph topology.

feature selection • Informativeness
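The feature-homophily metrics mentioned in this abstract can be illustrated with a toy stand-in (this is a minimal sketch, not the paper's actual definition): score a graph by the mean cosine similarity of the feature vectors at the two ends of each edge, so a high score means node features agree with the topology.

```python
import numpy as np

def feature_homophily(x, edges):
    """Toy feature-homophily score: mean cosine similarity of the
    feature vectors at the endpoints of every edge. A score near 1.0
    means connected nodes carry near-identical features."""
    xn = x / np.linalg.norm(x, axis=1, keepdims=True)  # row-normalize
    return float(np.mean([xn[i] @ xn[j] for i, j in edges]))

# Edge (0, 1) joins identical features, edge (1, 2) joins orthogonal ones:
x = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
edges = [(0, 1), (1, 2)]
score = feature_homophily(x, edges)  # (1.0 + 0.0) / 2 = 0.5
```

A metric like this lets one ask, per feature dimension, whether graph convolution (a neighbourhood average) will reinforce or wash out the signal.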

Rethinking Structure Learning For Graph Neural Networks

no code implementations • 12 Nov 2024 • Yilun Zheng, Zhuofan Zhang, ZiMing Wang, Xiang Li, Sitao Luan, Xiaojiang Peng, Lihui Chen

Surprisingly, our empirical observations and theoretical analysis show that no matter which type of graph structure construction methods are used, after feeding the same GSL bases to the newly constructed graph, there is no MI gain compared to the original GSL bases.

Graph structure learning

Mamba-Enhanced Text-Audio-Video Alignment Network for Emotion Recognition in Conversations

1 code implementation • 8 Sep 2024 • Xinran Li, Xiaomao Fan, Qingyang Wu, Xiaojiang Peng, Ye Li

MaTAV has the advantages of aligning unimodal features to ensure consistency across different modalities and of handling long input sequences to better capture contextual multimodal information.

Emotion Recognition • Mamba +2

DPDEdit: Detail-Preserved Diffusion Models for Multimodal Fashion Image Editing

no code implementations • 2 Sep 2024 • Xiaolong Wang, Zhi-Qi Cheng, Jue Wang, Xiaojiang Peng

To address these challenges, we introduce a new multimodal fashion image editing architecture based on latent diffusion models, called Detail-Preserved Diffusion Models (DPDEdit).

Image Generation • Language Modelling +3

FlexEdit: Marrying Free-Shape Masks to VLLM for Flexible Image Editing

1 code implementation • 22 Aug 2024 • Jue Wang, Yuxiang Lin, Tianshuo Yuan, Zhi-Qi Cheng, Xiaolong Wang, Jiao GH, Wei Chen, Xiaojiang Peng

Our approach employs a VLLM to comprehend the image content, the mask, and user instructions.

SZTU-CMU at MER2024: Improving Emotion-LLaMA with Conv-Attention for Multimodal Emotion Recognition

1 code implementation • 20 Aug 2024 • Zebang Cheng, Shuyuan Tu, Dawei Huang, Minghan Li, Xiaojiang Peng, Zhi-Qi Cheng, Alexander G. Hauptmann

This paper presents our winning approach for the MER-NOISE and MER-OV tracks of the MER2024 Challenge on multimodal emotion recognition.

Multimodal Emotion Recognition

DSMix: Distortion-Induced Sensitivity Map Based Pre-training for No-Reference Image Quality Assessment

1 code implementation • 4 Jul 2024 • Jinsong Shi, Pan Gao, Xiaojiang Peng, Jie Qin

It applies cut and mix operations to diverse categories of synthetic distorted images, assigning confidence scores to class labels based on the aforementioned prior knowledge.

Data Augmentation • Knowledge Distillation +1
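The cut-and-mix operation with soft class labels described above can be sketched in the CutMix style. This is a hypothetical minimal version, not the authors' exact DSMix procedure (it ignores the distortion-sensitivity maps that DSMix uses to place patches): paste a patch from one distorted image into another and weight the two distortion-class labels by the pixel area each image contributes.

```python
import numpy as np

def cutmix_distortion(img_a, img_b, label_a, label_b, rng=None):
    """Paste one quarter of img_b into img_a and mix the two
    distortion-class labels in proportion to pixel area."""
    rng = np.random.default_rng(0) if rng is None else rng
    h, w = img_a.shape[:2]
    ph, pw = h // 2, w // 2                     # patch covers half of each side
    y = int(rng.integers(0, h - ph + 1))        # random top-left corner
    x = int(rng.integers(0, w - pw + 1))
    mixed = img_a.copy()
    mixed[y:y + ph, x:x + pw] = img_b[y:y + ph, x:x + pw]
    lam = 1.0 - (ph * pw) / (h * w)             # fraction of pixels from img_a
    return mixed, lam * label_a + (1.0 - lam) * label_b

a, b = np.zeros((4, 4)), np.ones((4, 4))
img, label = cutmix_distortion(a, b, np.array([1.0, 0.0]), np.array([0.0, 1.0]))
# label == [0.75, 0.25]: three quarters of the pixels still come from img_a
```

The soft label acts as the "confidence score" assigned to each distortion class, reflecting how much of each source image survives in the mixed sample.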

Dataset Growth

1 code implementation • 28 May 2024 • Ziheng Qin, Zhaopan Xu, Yukun Zhou, Zangwei Zheng, Zebang Cheng, Hao Tang, Lei Shang, Baigui Sun, Xiaojiang Peng, Radu Timofte, Hongxun Yao, Kai Wang, Yang You

To tackle this challenge, we propose InfoGrowth, an efficient online algorithm for data cleaning and selection, resulting in a growing dataset that keeps up to date with awareness of cleanliness and diversity.

Diversity
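The "growing dataset with awareness of cleanliness and diversity" can be illustrated with a hypothetical online filter; this is only a sketch in the spirit of the abstract, not the actual InfoGrowth algorithm (whose selection criteria are more sophisticated): admit an incoming embedding only if it is sufficiently far from everything already kept, so near-duplicates are filtered out as the dataset grows.

```python
import numpy as np

def grow_dataset(stream, kept, min_dist=0.5):
    """Hypothetical online selection: keep a sample only if its
    embedding is at least `min_dist` from every kept embedding,
    preserving diversity as new data streams in."""
    for emb in stream:
        if not kept or min(np.linalg.norm(emb - k) for k in kept) >= min_dist:
            kept.append(emb)
    return kept

kept = grow_dataset([np.array([0.0, 0.0]),
                     np.array([0.1, 0.0]),    # near-duplicate, rejected
                     np.array([1.0, 0.0])], [])
# two embeddings survive; the near-duplicate is filtered out
```

A real system would pair such a diversity gate with a cleanliness score (e.g. agreement between an image and its caption) before admitting a sample.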

UMETTS: A Unified Framework for Emotional Text-to-Speech Synthesis with Multimodal Prompts

1 code implementation • 29 Apr 2024 • Zhi-Qi Cheng, Xiang Li, Jun-Yan He, Junyao Chen, Xiaomao Fan, Xiaojiang Peng, Alexander G. Hauptmann

Emotional Text-to-Speech (E-TTS) synthesis has garnered significant attention in recent years due to its potential to revolutionize human-computer interaction.

Contrastive Learning • Speech Synthesis +3

Two in One Go: Single-stage Emotion Recognition with Decoupled Subject-context Transformer

no code implementations • 26 Apr 2024 • Xinpeng Li, Teng Wang, Jian Zhao, Shuyi Mao, Jinbao Wang, Feng Zheng, Xiaojiang Peng, Xuelong Li

Emotion recognition aims to discern the emotional state of subjects within an image, relying on subject-centric and contextual visual cues.

Emotion Classification • Emotion Recognition

Invisible Gas Detection: An RGB-Thermal Cross Attention Network and A New Benchmark

1 code implementation • 26 Mar 2024 • Jue Wang, Yuxiang Lin, Qi Zhao, Dong Luo, Shuaibao Chen, Wei Chen, Xiaojiang Peng

The widespread use of various chemical gases in industrial processes necessitates effective measures to prevent their leakage during transportation and storage, given their high toxicity.

A Challenge Dataset and Effective Models for Conversational Stance Detection

1 code implementation • 17 Mar 2024 • Fuqiang Niu, Min Yang, Ang Li, Baoquan Zhang, Xiaojiang Peng, BoWen Zhang

Previous stance detection studies typically concentrate on evaluating stances within individual instances, and thus fall short of modeling multi-party discussions about the same topic as they naturally occur in real social media interactions.

Stance Detection

3D Landmark Detection on Human Point Clouds: A Benchmark and A Dual Cascade Point Transformer Framework

no code implementations • 14 Jan 2024 • Fan Zhang, Shuyi Mao, Qing Li, Xiaojiang Peng

Comparative evaluations against popular point-based methods on HPoint103 and the public DHP19 dataset demonstrate that our D-CPT outperforms them by a dramatic margin.

Decoder • Pose Estimation +1

MIMIC: Mask Image Pre-training with Mix Contrastive Fine-tuning for Facial Expression Recognition

no code implementations • 14 Jan 2024 • Fan Zhang, Xiaobao Guo, Xiaojiang Peng, Alex Kot

In addition, the divergence between general datasets and FER datasets is even more pronounced than the domain disparity between face datasets and FER datasets.

Contrastive Learning • Face Recognition +3

The Snowflake Hypothesis: Training Deep GNN with One Node One Receptive field

no code implementations • 19 Aug 2023 • Kun Wang, Guohao Li, Shilong Wang, Guibin Zhang, Kai Wang, Yang You, Xiaojiang Peng, Yuxuan Liang, Yang Wang

Despite demonstrating considerable promise in graph representation learning tasks, Graph Neural Networks predominantly face significant issues with over-fitting and over-smoothing as they go deeper, much as deep models do in the computer vision realm.

Graph Representation Learning

Rail Detection: An Efficient Row-based Network and A New Benchmark

1 code implementation • 12 Apr 2023 • Xinpeng Li, Xiaojiang Peng

Inspired by the growth of lane detection, we propose a rail database and a row-based rail detection method.

Anomaly Detection • Lane Detection

AU-Aware Vision Transformers for Biased Facial Expression Recognition

no code implementations • 12 Nov 2022 • Shuyi Mao, Xinpeng Li, Qingyang Wu, Xiaojiang Peng

Studies have proven that domain bias and label bias exist in different Facial Expression Recognition (FER) datasets, making it hard to improve the performance of a specific dataset by adding other datasets.

Domain Adaptation • Facial Expression Recognition +1

AU-Supervised Convolutional Vision Transformers for Synthetic Facial Expression Recognition

1 code implementation • 20 Jul 2022 • Shuyi Mao, Xinpeng Li, Junyao Chen, Xiaojiang Peng

In the Learning from Synthetic Data (LSD) task, facial expression recognition (FER) methods aim to learn expression representations from artificially generated data and generalise to real data.

Face Recognition • Facial Expression Recognition +1

Video-based Smoky Vehicle Detection with A Coarse-to-Fine Framework

no code implementations • 8 Jul 2022 • Xiaojiang Peng, Xiaomao Fan, Qingyang Wu, Jieyan Zhao, Pan Gao

Moreover, we present a new Coarse-to-fine Deep Smoky vehicle detection (CoDeS) framework for efficient smoky vehicle detection.

Video Frame Interpolation Based on Deformable Kernel Region

1 code implementation • 25 Apr 2022 • Haoyue Tian, Pan Gao, Xiaojiang Peng

To solve this problem, we revisit deformable convolution for video interpolation, which breaks the fixed-grid restriction on the kernel region and lets the distribution of reference points better fit the shape of the object, thus warping a more accurate interpolated frame.

Optical Flow Estimation • Video Frame Interpolation

Self-Ensemling for 3D Point Cloud Domain Adaption

no code implementations • 10 Dec 2021 • Qing Li, Xiaojiang Peng, Chuan Yan, Pan Gao, Qi Hao

In SEN, a student network is trained collaboratively with supervised learning and self-supervised learning, while a teacher network enforces temporal consistency to learn useful representations and to ensure the quality of point cloud reconstruction.

Autonomous Driving • Self-Supervised Learning +1

Spatial and Temporal Networks for Facial Expression Recognition in the Wild Videos

1 code implementation • 12 Jul 2021 • Shuyi Mao, Xinqi Fan, Xiaojiang Peng

The paper describes our proposed methodology for the seven basic expression classification track of Affective Behavior Analysis in-the-wild (ABAW) Competition 2021.

Face Recognition • Facial Expression Recognition +1

An Efficient Training Approach for Very Large Scale Face Recognition

1 code implementation • CVPR 2022 • Kai Wang, Shuo Wang, Panpan Zhang, Zhipeng Zhou, Zheng Zhu, Xiaobo Wang, Xiaojiang Peng, Baigui Sun, Hao Li, Yang You

This method adopts Dynamic Class Pool (DCP) for storing and updating the identities features dynamically, which could be regarded as a substitute for the FC layer.

 Ranked #1 on Face Verification on IJB-C (training dataset metric)

Face Recognition • Face Verification
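The Dynamic Class Pool idea above, storing identity features and updating them dynamically in place of a full FC classification layer, can be sketched as a feature memory scored by cosine similarity. This is a minimal stand-in, not the paper's exact DCP (the real pool is sampled and managed to fit very large identity counts):

```python
import numpy as np

class DynamicClassPool:
    """Sketch of a feature memory replacing a giant FC layer: logits
    are cosine similarities against stored identity features, which
    are refreshed with a momentum update instead of backprop through
    a full weight matrix."""
    def __init__(self, num_ids, dim, momentum=0.9, seed=0):
        rng = np.random.default_rng(seed)
        self.pool = rng.normal(size=(num_ids, dim))
        self.pool /= np.linalg.norm(self.pool, axis=1, keepdims=True)
        self.m = momentum

    def logits(self, feats):
        feats = feats / np.linalg.norm(feats, axis=1, keepdims=True)
        return feats @ self.pool.T            # cosine similarity per identity

    def update(self, ids, feats):
        feats = feats / np.linalg.norm(feats, axis=1, keepdims=True)
        self.pool[ids] = self.m * self.pool[ids] + (1.0 - self.m) * feats
        self.pool[ids] /= np.linalg.norm(self.pool[ids], axis=1, keepdims=True)

pool = DynamicClassPool(num_ids=5, dim=8)
feats = np.random.default_rng(1).normal(size=(2, 8))
scores = pool.logits(feats)                   # (2, 5) similarity logits
pool.update(np.array([0, 1]), feats)          # refresh the seen identities
```

The appeal is memory: for millions of identities, a momentum-updated pool avoids materialising and differentiating a dense FC layer over every class at every step.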

Affordance Transfer Learning for Human-Object Interaction Detection

2 code implementations • CVPR 2021 • Zhi Hou, Baosheng Yu, Yu Qiao, Xiaojiang Peng, DaCheng Tao

The proposed method can thus be used to 1) improve the performance of HOI detection, especially for the HOIs with unseen objects; and 2) infer the affordances of novel objects.

Affordance Detection • Affordance Recognition +4

Detecting Human-Object Interaction via Fabricated Compositional Learning

1 code implementation • CVPR 2021 • Zhi Hou, Baosheng Yu, Yu Qiao, Xiaojiang Peng, DaCheng Tao

With the proposed object fabricator, we are able to generate large-scale HOI samples for rare and unseen categories to alleviate the open long-tailed issues in HOI detection.

Affordance Recognition • Object +1

Unsupervised Person Re-Identification with Multi-Label Learning Guided Self-Paced Clustering

no code implementations • 8 Mar 2021 • Qing Li, Xiaojiang Peng, Yu Qiao, Qi Hao

The multi-label learning module leverages a memory feature bank and assigns each image with a multi-label vector based on the similarities between the image and feature bank.

Clustering • Multi-Label Learning +2
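The memory-bank assignment described above can be sketched as follows. This is a simplified stand-in, not the paper's exact module (the threshold and the bank update rule here are assumptions): compute cosine similarities between one image feature and every entry of a feature bank, and threshold them into a binary multi-label vector.

```python
import numpy as np

def multilabel_from_bank(query, bank, threshold=0.6):
    """Assign one image a binary multi-label vector from its cosine
    similarities to the entries of a memory feature bank."""
    q = query / np.linalg.norm(query)
    b = bank / np.linalg.norm(bank, axis=1, keepdims=True)
    sims = b @ q                              # one similarity per bank entry
    return (sims >= threshold).astype(int), sims

bank = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
labels, sims = multilabel_from_bank(np.array([1.0, 0.2]), bank)
# labels == [1, 0, 1]: the query resembles the first and third entries
```

Treating each similar bank entry as a positive "label" lets the model pull an unlabeled image toward several plausible identities at once instead of forcing a single hard cluster assignment.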

AU-Guided Unsupervised Domain Adaptive Facial Expression Recognition

no code implementations • 18 Dec 2020 • Kai Wang, Yuxin Gu, Xiaojiang Peng, Panpan Zhang, Baigui Sun, Hao Li

The domain diversities among different facial expression recognition (FER) datasets, including inconsistent annotation and varied image collection conditions, inevitably pose an evident challenge for adapting a FER model trained on one dataset to another.

Facial Expression Recognition • Facial Expression Recognition (FER) +1

Suppressing Mislabeled Data via Grouping and Self-Attention

1 code implementation • ECCV 2020 • Xiaojiang Peng, Kai Wang, Zhaoyang Zeng, Qing Li, Jianfei Yang, Yu Qiao

Specifically, this plug-and-play AFM first leverages a \textit{group-to-attend} module to construct groups and assign attention weights for group-wise samples, and then uses a \textit{mixup} module with the attention weights to interpolate massive noisy-suppressed samples.

Image Classification
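The group-to-attend plus mixup mechanism described above can be sketched numerically. This is a minimal illustration, not the authors' AFM module: in the real method the attention scores are produced by a learned, end-to-end-trained module, whereas here they are simply given as input. Features and labels of a group are interpolated with the same softmax weights, so samples receiving low attention (likely mislabeled) fade out of the mixed sample.

```python
import numpy as np

def attention_mixup(group_feats, group_labels, scores):
    """Mix a group of possibly-noisy samples into one: softmax the
    attention scores, then interpolate features and labels with the
    same weights so low-weight samples are suppressed."""
    a = np.exp(scores - scores.max())
    a = a / a.sum()                       # attention weights, sum to 1
    return a @ group_feats, a @ group_labels, a

feats = np.array([[1.0, 0.0], [0.0, 1.0]])
labels = np.array([[1.0, 0.0], [0.0, 1.0]])   # second sample is suspect
x, y, a = attention_mixup(feats, labels, np.array([2.0, 0.0]))
# a[0] > a[1]: the trusted sample dominates both the feature and the label
```

Because the label is mixed with the same weights as the feature, a confidently-weighted clean sample effectively relabels the group, which is what suppresses the noisy supervision.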

Visual Compositional Learning for Human-Object Interaction Detection

4 code implementations • ECCV 2020 • Zhi Hou, Xiaojiang Peng, Yu Qiao, DaCheng Tao

The integration of decomposition and composition enables VCL to share object and verb features among different HOI samples and images, and to generate new interaction samples and new types of HOI, and thus largely alleviates the long-tail distribution problem and benefits low-shot or zero-shot HOI detection.

Affordance Recognition • Object

Suppressing Uncertainties for Large-Scale Facial Expression Recognition

2 code implementations • CVPR 2020 • Kai Wang, Xiaojiang Peng, Jianfei Yang, Shijian Lu, Yu Qiao

Annotating a qualitative large-scale facial expression dataset is extremely difficult due to the uncertainties caused by ambiguous facial expressions, low-quality facial images, and the subjectiveness of annotators.

Facial Expression Recognition • Facial Expression Recognition (FER)

A Comprehensive Study on Temporal Modeling for Online Action Detection

1 code implementation • 21 Jan 2020 • Wen Wang, Xiaojiang Peng, Yu Qiao, Jian Cheng

Online action detection (OAD) is a practical yet challenging task, which has attracted increasing attention in recent years.

Online Action Detection

Learning Category Correlations for Multi-label Image Recognition with Graph Networks

no code implementations • 28 Sep 2019 • Qing Li, Xiaojiang Peng, Yu Qiao, Qiang Peng

In this paper, instead of using a pre-defined graph which is inflexible and may be sub-optimal for multi-label classification, we propose the A-GCN, which leverages the popular Graph Convolutional Networks with an Adaptive label correlation graph to model label dependencies.

Multi-Label Classification +2
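The adaptive label-correlation graph can be sketched as follows. This is a hypothetical simplification of A-GCN, not the paper's exact construction: instead of a fixed, pre-defined adjacency, the correlation graph is derived from the label embeddings themselves, so it changes as the embeddings are trained, and a plain GCN layer then propagates information over the label nodes.

```python
import numpy as np

def adaptive_adjacency(label_emb):
    """Build a row-stochastic label-correlation graph from the label
    embeddings themselves rather than a fixed co-occurrence matrix."""
    s = label_emb @ label_emb.T                       # pairwise affinities
    e = np.exp(s - s.max(axis=1, keepdims=True))      # row-wise softmax
    return e / e.sum(axis=1, keepdims=True)

def gcn_layer(x, adj, w):
    """One plain GCN propagation step: neighbourhood mix, linear map, ReLU."""
    return np.maximum(adj @ x @ w, 0.0)

rng = np.random.default_rng(0)
label_emb = rng.normal(size=(5, 16))                  # 5 label nodes
adj = adaptive_adjacency(label_emb)
out = gcn_layer(label_emb, adj, rng.normal(size=(16, 8)))
```

Because the adjacency is a function of the embeddings, gradients can flow into the graph itself, which is the sense in which the label dependencies are "adaptive" rather than hand-specified.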

Product Image Recognition with Guidance Learning and Noisy Supervision

no code implementations • 26 Jul 2019 • Qing Li, Xiaojiang Peng, Liangliang Cao, Wenbin Du, Hao Xing, Yu Qiao

Instead of collecting product images by labor-and time-intensive image capturing, we take advantage of the web and download images from the reviews of several e-commerce websites where the images are casually captured by consumers.

Bootstrap Model Ensemble and Rank Loss for Engagement Intensity Regression

no code implementations • 8 Jul 2019 • Kai Wang, Jianfei Yang, Da Guo, Kaipeng Zhang, Xiaojiang Peng, Yu Qiao

Based on our winner solution last year, we mainly explore head features and body features with a bootstrap strategy and two novel loss functions in this paper.

Regression

Frame attention networks for facial expression recognition in videos

2 code implementations • 29 Jun 2019 • Debin Meng, Xiaojiang Peng, Kai Wang, Yu Qiao

The feature embedding module is a deep Convolutional Neural Network (CNN) which embeds face images into feature vectors.

Ranked #3 on Facial Expression Recognition (FER) on CK+ (Accuracy (7 emotion) metric)

Facial Expression Recognition • Facial Expression Recognition (FER)
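The frame-level aggregation behind a frame attention network can be sketched as follows. This is a minimal stand-in, not the paper's exact module: the CNN embedding stage is replaced by ready-made frame feature vectors, and the learned attention head is replaced by a fixed query vector. Each frame is scored, the scores are softmaxed, and the video-level feature is the attention-weighted sum of the frame features.

```python
import numpy as np

def frame_attention_pool(frame_feats, q):
    """Pool per-frame embeddings into one video-level feature:
    score each frame against a query, softmax the scores, and take
    the attention-weighted sum of the frame features."""
    scores = frame_feats @ q
    a = np.exp(scores - scores.max())
    a = a / a.sum()                       # per-frame attention weights
    return a @ frame_feats, a

frames = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # 3 frame embeddings
video_feat, weights = frame_attention_pool(frames, np.array([1.0, 0.0]))
```

This lets informative frames (e.g. the apex of an expression) dominate the video representation while near-neutral frames are down-weighted, instead of averaging all frames uniformly.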

A Study on Unsupervised Dictionary Learning and Feature Encoding for Action Classification

no code implementations • 2 Sep 2013 • Xiaojiang Peng, Qiang Peng, Yu Qiao, Junzhou Chen, Mehtab Afzal

Many efforts have been devoted to developing alternatives to traditional vector quantization in the image domain, such as sparse coding and soft-assignment.

Action Classification • Dictionary Learning +2
