Search Results for author: XiaoJun Wu

Found 27 papers, 7 papers with code

Noise Learning for Text Classification: A Benchmark

no code implementations COLING 2022 Bo Liu, Wandi Xu, Yuejia Xiang, XiaoJun Wu, Lejian He, BoWen Zhang, Li Zhu

However, we find that noise learning in text classification is relatively underdeveloped: (1) many methods proven effective in the image domain have not been explored for text classification, and (2) fair comparison between previous studies is difficult because their experiments use different noise settings.

Text Classification

One Model for ALL: Low-Level Task Interaction Is a Key to Task-Agnostic Image Fusion

1 code implementation 27 Feb 2025 Chunyang Cheng, Tianyang Xu, ZhenHua Feng, XiaoJun Wu, Zhangyong Tang, Hui Li, Zeyang Zhang, Sara Atito, Muhammad Awais, Josef Kittler

Advanced image fusion methods mostly prioritise high-level tasks, where task interaction struggles with semantic gaps and requires complex bridging mechanisms.


Trunk-branch Contrastive Network with Multi-view Deformable Aggregation for Multi-view Action Recognition

no code implementations 23 Feb 2025 Yingyuan Yang, Guoyuan Liang, Can Wang, XiaoJun Wu

Drawing inspiration from this cognitive process, we propose a novel trunk-branch contrastive network (TBCNet) for RGB-based multi-view action recognition.

Action Recognition Contrastive Learning

Subpixel Edge Localization Based on Converted Intensity Summation under Stable Edge Region

no code implementations 23 Feb 2025 Yingyuan Yang, Guoyuan Liang, Xianwen Wang, Kaiming Wang, Can Wang, XiaoJun Wu

We take an innovative perspective by assuming that the intensity at the pixel level can be interpreted as a local integral mapping in the intensity model for subpixel localization.

Edge Detection

MixSA: Training-free Reference-based Sketch Extraction via Mixture-of-Self-Attention

no code implementations 1 Jan 2025 Rui Yang, XiaoJun Wu, Shengfeng He

By aligning brushstroke styles with the texture and contours of colored images, particularly in late decoder layers handling local textures, MixSA addresses the common issue of color averaging by adjusting initial outlines.

Decoder

Adaptive Hyper-Graph Convolution Network for Skeleton-based Human Action Recognition with Virtual Connections

no code implementations 22 Nov 2024 Youwei Zhou, Tianyang Xu, Cong Wu, XiaoJun Wu, Josef Kittler

The shared topology of human skeletons motivated the recent investigation of graph convolutional network (GCN) solutions for action recognition.

Action Recognition Temporal Action Localization

Golden Touchstone: A Comprehensive Bilingual Benchmark for Evaluating Financial Large Language Models

1 code implementation 9 Nov 2024 XiaoJun Wu, Junxi Liu, Huanyi Su, Zhouchi Lin, Yiyan Qi, Chengjin Xu, Jiajun Su, Jiajie Zhong, Fuwei Wang, Saizhuo Wang, Fengrui Hua, Jia Li, Jian Guo

As large language models become increasingly prevalent in the financial sector, there is a pressing need for a standardized method to comprehensively assess their performance.

Capture Artifacts via Progressive Disentangling and Purifying Blended Identities for Deepfake Detection

no code implementations 14 Oct 2024 Weijie Zhou, Xiaoqing Luo, Zhancheng Zhang, Jiachen He, XiaoJun Wu

To address these issues, this paper proposes a Deepfake detection method based on progressively disentangling and purifying blended identities.

DeepFake Detection Disentanglement +1

QuadBEV: An Efficient Quadruple-Task Perception Framework via Bird's-Eye-View Representation

no code implementations 9 Oct 2024 Yuxin Li, Yiheng Li, Xulei Yang, Mengying Yu, Zihang Huang, XiaoJun Wu, Chai Kiat Yeo

Bird's-Eye-View (BEV) perception has become a vital component of autonomous driving systems due to its ability to integrate multiple sensor inputs into a unified representation, enhancing performance in various downstream tasks.

3D Object Detection Autonomous Driving +2

Learning Content-Aware Multi-Modal Joint Input Pruning via Bird's-Eye-View Representation

no code implementations 9 Oct 2024 Yuxin Li, Yiheng Li, Xulei Yang, Mengying Yu, Zihang Huang, XiaoJun Wu, Chai Kiat Yeo

In the landscape of autonomous driving, Bird's-Eye-View (BEV) representation has recently garnered substantial academic attention, serving as a transformative framework for the fusion of multi-modal sensor inputs.

Autonomous Driving Computational Efficiency +1

RMLR: Extending Multinomial Logistic Regression into General Geometries

2 code implementations 28 Sep 2024 Ziheng Chen, Yue Song, Rui Wang, XiaoJun Wu, Nicu Sebe

Specifically, we showcase our framework on the Symmetric Positive Definite (SPD) manifold and special orthogonal group, i.e., the set of rotation matrices.

regression

Dynamic Subframe Splitting and Spatio-Temporal Motion Entangled Sparse Attention for RGB-E Tracking

no code implementations 26 Sep 2024 Pengcheng Shao, Tianyang Xu, XueFeng Zhu, XiaoJun Wu, Josef Kittler

Based on this, we design an event-based sparse attention mechanism to enhance the interaction of event features in temporal and spatial dimensions.

CoMoFusion: Fast and High-quality Fusion of Infrared and Visible Image with Consistency Model

1 code implementation 31 May 2024 Zhiming Meng, Hui Li, Zeyang Zhang, Zhongwei Shen, Yunlong Yu, Xiaoning Song, XiaoJun Wu

Generative models are widely utilized to model the distribution of fused images in the field of infrared and visible image fusion.

Infrared And Visible Image Fusion

S4Fusion: Saliency-aware Selective State Space Model for Infrared Visible Image Fusion

no code implementations 31 May 2024 Haolong Ma, Hui Li, Chunyang Cheng, Gaoang Wang, Xiaoning Song, XiaoJun Wu

However, in image fusion, current methods underestimate the potential of SSSM in capturing the global spatial information of both modalities.

Infrared And Visible Image Fusion

Taiyi-Diffusion-XL: Advancing Bilingual Text-to-Image Generation with Large Vision-Language Model Support

1 code implementation 26 Jan 2024 XiaoJun Wu, Dixiang Zhang, Ruyi Gan, Junyu Lu, Ziwei Wu, Renliang Sun, Jiaxing Zhang, Pingjian Zhang, Yan Song

Recent advancements in text-to-image models have significantly enhanced image generation capabilities, yet a notable gap of open-source models persists in bilingual or Chinese language support.

Language Modelling +1

iDesigner: A High-Resolution and Complex-Prompt Following Text-to-Image Diffusion Model for Interior Design

no code implementations 7 Dec 2023 Ruyi Gan, XiaoJun Wu, Junyu Lu, Yuanhe Tian, Dixiang Zhang, Ziwei Wu, Renliang Sun, Chang Liu, Jiaxing Zhang, Pingjian Zhang, Yan Song

However, there are few specialized models in certain domains, such as interior design, which is attributed to the complex textual descriptions and detailed visual elements inherent in design, alongside the necessity for adaptable resolution.

Image Generation

Ziya2: Data-centric Learning is All LLMs Need

no code implementations 6 Nov 2023 Ruyi Gan, Ziwei Wu, Renliang Sun, Junyu Lu, XiaoJun Wu, Dixiang Zhang, Kunhao Pan, Junqing He, Yuanhe Tian, Ping Yang, Qi Yang, Hao Wang, Jiaxing Zhang, Yan Song

Although many such issues are addressed in the line of research on LLMs, an important yet practical limitation remains: many studies pursue ever-larger model sizes without comprehensively analyzing and optimizing the use of pre-training data, or how such data should be organized and leveraged to train LLMs under cost-effective settings.


Ziya-Visual: Bilingual Large Vision-Language Model via Multi-Task Instruction Tuning

no code implementations 12 Oct 2023 Junyu Lu, Dixiang Zhang, XiaoJun Wu, Xinyu Gao, Ruyi Gan, Jiaxing Zhang, Yan Song, Pingjian Zhang

Recent advancements enlarge the capabilities of large language models (LLMs) in zero-shot image-to-text generation and understanding by integrating multi-modal inputs.

Image Captioning Image-text Retrieval +8

Edge Based Oriented Object Detection

no code implementations 15 Sep 2023 Jianghu Shen, XiaoJun Wu

In the field of remote sensing, oriented bounding boxes (OBBs) are commonly used to enclose objects.

Object Detection +3

Riemannian Multinomial Logistics Regression for SPD Neural Networks

2 code implementations CVPR 2024 Ziheng Chen, Yue Song, Gaowen Liu, Ramana Rao Kompella, XiaoJun Wu, Nicu Sebe

Besides, our framework offers a novel intrinsic explanation for the most popular LogEig classifier in existing SPD networks.

Action Recognition EEG +2

LabelPrompt: Effective Prompt-based Learning for Relation Classification

no code implementations 16 Feb 2023 Wenjie Zhang, Xiaoning Song, ZhenHua Feng, Tianyang Xu, XiaoJun Wu

Specifically, associating natural language words that fill the masked token with semantic relation labels (e.g., "org:founded_by") is difficult.

Classification Contrastive Learning +3

Simple Primitives with Feasibility- and Contextuality-Dependence for Open-World Compositional Zero-shot Learning

no code implementations 5 Nov 2022 Zhe Liu, Yun Li, Lina Yao, Xiaojun Chang, Wei Fang, XiaoJun Wu, Yi Yang

We design Semantic Attention (SA) and generative Knowledge Disentanglement (KD) to learn the dependence of feasibility and contextuality, respectively.

Compositional Zero-Shot Learning Disentanglement

PPT Fusion: Pyramid Patch Transformer for a Case Study in Image Fusion

no code implementations 29 Jul 2021 Yu Fu, Tianyang Xu, XiaoJun Wu, Josef Kittler

In this paper, we propose a Patch Pyramid Transformer (PPT) to effectively address the above issues. Specifically, we first design a Patch Transformer to transform the image into a sequence of patches, where transformer encoding is performed for each patch to extract local representations.

Image Classification Image Reconstruction

A High Efficiency Fully Convolutional Networks for Pixel Wise Surface Defect Detection

no code implementations journal 2019 Lingteng Qiu, XiaoJun Wu, Zhiyang Yu

Our method is composed of a segmentation stage (stage 1), a detection stage (stage 2), and a matting stage (stage 3).

Defect Detection Image Matting +1
