Search Results for author: Xun Gong

Found 26 papers, 5 papers with code

RepFace: Refining Closed-Set Noise with Progressive Label Correction for Face Recognition

no code implementations • 16 Dec 2024 • Jie Zhang, Xun Gong, Zhonglin Sun

However, face recognition performance is heavily affected by label noise, especially closed-set noise.

Face Recognition

CLIP-based Camera-Agnostic Feature Learning for Intra-camera Person Re-Identification

1 code implementation • 29 Sep 2024 • Xuan Tan, Xun Gong, Yang Xiang

To address this, we propose a novel framework called CLIP-based Camera-Agnostic Feature Learning (CCAFL) for ICS ReID.

Person Re-Identification

LCE: A Framework for Explainability of DNNs for Ultrasound Image Based on Concept Discovery

no code implementations • 19 Aug 2024 • Weiji Kong, Xun Gong, Juan Wang

Explaining the decisions of Deep Neural Networks (DNNs) for medical images has become increasingly important.

Diagnostic

HcNet: Image Modeling with Heat Conduction Equation

1 code implementation • 12 Aug 2024 • Zhemin Zhang, Xun Gong

This finding inspired us to model images by the heat conduction equation, where the essential idea is to conceptualize image features as temperatures and model their information interaction as the diffusion of thermal energy.

Tri-VQA: Triangular Reasoning Medical Visual Question Answering for Multi-Attribute Analysis

no code implementations • 21 Jun 2024 • Lin Fan, Xun Gong, Cenyang Zheng, Yafei Ou

However, existing Med-VQA methods based on joint embedding fail to explain whether their provided results are based on correct reasoning or coincidental answers, which undermines the credibility of VQA answers.

Attribute, Medical Visual Question Answering +2

MDA: An Interpretable and Scalable Multi-Modal Fusion under Missing Modalities and Intrinsic Noise Conditions

no code implementations • 15 Jun 2024 • Lin Fan, Yafei Ou, Cenyang Zheng, Pengyu Dai, Tamotsu Kamishima, Masayuki Ikebe, Kenji Suzuki, Xun Gong

MDA constructs linear relationships between modalities through continuous attention. Because it can adaptively allocate dynamic attention to different modalities, MDA can reduce attention to low-correlation data, missing modalities, or modalities with inherent noise, thereby maintaining SOTA performance across various tasks on multiple public datasets.

Diagnostic

Generating Multi-Center Classifier via Conditional Gaussian Distribution

1 code implementation • 29 Jan 2024 • Zhemin Zhang, Xun Gong

Specifically, we create a conditional Gaussian distribution for each class and then sample multiple sub-centers from that distribution to extend the linear classifier.

Image Classification

Prospective Role of Foundation Models in Advancing Autonomous Vehicles

no code implementations • 8 Dec 2023 • Jianhua Wu, Bingzhao Gao, Jincheng Gao, Jianhao Yu, Hongqing Chu, Qiankun Yu, Xun Gong, Yi Chang, H. Eric Tseng, Hong Chen, Jie Chen

With the development of artificial intelligence and breakthroughs in deep learning, large-scale Foundation Models (FMs), such as GPT, Sora, etc., have achieved remarkable results in many fields including natural language processing and computer vision.

Autonomous Driving, Scene Understanding +1

Vision Big Bird: Random Sparsification for Full Attention

no code implementations • 10 Nov 2023 • Zhemin Zhang, Xun Gong

Inspired by Big Bird, one of the most successful transformer-based models for NLP, we propose a novel sparse attention mechanism for Vision Transformers (ViT).

Adversarial Driving Behavior Generation Incorporating Human Risk Cognition for Autonomous Vehicle Evaluation

no code implementations • 29 Sep 2023 • Zhen Liu, Hang Gao, Hao Ma, Shuo Cai, Yunfeng Hu, Ting Qu, Hong Chen, Xun Gong

Autonomous vehicle (AV) evaluation has been the subject of increased interest in recent years both in industry and in academia.

Reinforcement Learning (RL)

On Data-Driven Modeling and Control in Modern Power Grids Stability: Survey and Perspective

no code implementations • 7 Aug 2023 • Xun Gong, Xiaozhe Wang, Bo Cao

Modern power grids are evolving fast with increasingly volatile renewable generation, distributed energy resources (DERs) and time-varying operating conditions.

LongFNT: Long-form Speech Recognition with Factorized Neural Transducer

no code implementations • 17 Nov 2022 • Xun Gong, Yu Wu, Jinyu Li, Shujie Liu, Rui Zhao, Xie Chen, Yanmin Qian

This motivates us to leverage the factorized neural transducer structure, containing a real language model, the vocabulary predictor.

Automatic Speech Recognition (ASR) +5

BoundaryFace: A mining framework with noise label self-correction for Face Recognition

1 code implementation • 10 Oct 2022 • Shijie Wu, Xun Gong

Specifically, a closed-set noise label self-correction module is put forward, making this framework work well on datasets containing a lot of label noise.

Face Recognition

SpeechLM: Enhanced Speech Pre-Training with Unpaired Textual Data

1 code implementation • 30 Sep 2022 • Ziqiang Zhang, Sanyuan Chen, Long Zhou, Yu Wu, Shuo Ren, Shujie Liu, Zhuoyuan Yao, Xun Gong, LiRong Dai, Jinyu Li, Furu Wei

In this paper, we propose a cross-modal Speech and Language Model (SpeechLM) to explicitly align speech and text pre-training with a pre-defined unified discrete representation.

Language Modeling +2

Axially Expanded Windows for Local-Global Interaction in Vision Transformers

no code implementations • 19 Sep 2022 • Zhemin Zhang, Xun Gong

Recently, Transformers have shown promising performance in various vision tasks.

An Online Data-Driven Method for Microgrid Secondary Voltage and Frequency Control with Ensemble Koopman Modeling

no code implementations • 11 Jul 2022 • Xun Gong, Xiaozhe Wang, Geza Joos

Low inertia, nonlinearity and a high level of uncertainty (varying topologies and operating conditions) pose challenges to microgrid (MG) systemwide operation.

Event Detection

Self-Supervised Implicit Attention: Guided Attention by The Model Itself

no code implementations • 15 Jun 2022 • Jinyi Wu, Xun Gong, Zhemin Zhang

To verify the effectiveness of SSIA, we performed a particular implementation (called an SSIA block) in convolutional neural network models and validated it on several image classification datasets.

Image Classification, Self-Supervised Learning

Positional Label for Self-Supervised Vision Transformer

no code implementations • 10 Jun 2022 • Zhemin Zhang, Xun Gong

Positional encoding is important for vision transformer (ViT) to capture the spatial structure of the input image.

Position

ReplaceBlock: An improved regularization method based on background information

no code implementations • 30 Mar 2022 • Zhemin Zhang, Xun Gong, Jinyi Wu

In this way, ReplaceBlock can effectively simulate the feature map of the occluded image.

Object

The Fixed Sub-Center: A Better Way to Capture Data Complexity

no code implementations • 24 Mar 2022 • Zhemin Zhang, Xun Gong

Specifically, F-SC first samples a class center Ui for each class from a uniform distribution and then generates a normal distribution for each class whose mean is equal to Ui.

Image Classification
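
The sampling procedure described in the F-SC snippet above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the uniform range, the isotropic standard deviation `sigma`, and the step of drawing sub-centers from each class's normal distribution are all assumptions made for illustration.

```python
import random


def sample_fixed_sub_centers(num_classes, dim, num_sub_centers,
                             low=-1.0, high=1.0, sigma=0.1, seed=0):
    """Sample one center per class, then sub-centers around each center.

    All names and default values here are hypothetical, chosen only to
    illustrate the two-step sampling described in the abstract snippet.
    """
    rng = random.Random(seed)
    centers = []
    sub_centers = []
    for _ in range(num_classes):
        # Step 1: class center Ui drawn from a uniform distribution.
        u_i = [rng.uniform(low, high) for _ in range(dim)]
        centers.append(u_i)
        # Step 2 (assumed): draw sub-centers from a normal distribution
        # whose mean is Ui, using the same sigma in every dimension.
        sub_centers.append([
            [rng.gauss(mu, sigma) for mu in u_i]
            for _ in range(num_sub_centers)
        ])
    return centers, sub_centers


centers, sub_centers = sample_fixed_sub_centers(
    num_classes=3, dim=4, num_sub_centers=5)
```

Here `centers[c]` holds Ui for class `c`, and `sub_centers[c]` holds the points drawn around it. With a fixed seed, the samples are generated once and stay the same across runs.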

A Two-Stage Data-Free Adversarial Patch Generation Framework

no code implementations • 29 Sep 2021 • Jiawei Liu, Hang Gao, Yunfeng Hu, Xun Gong

The proxy dataset selection stage calculates the proposed average patch saliency (APS) of each available dataset to select a high-APS proxy dataset that can guarantee patches' fooling abilities.

Vocal Bursts Valence Prediction

Achievable Rates of Opportunistic Cognitive Radio Systems Using Reconfigurable Antennas with Imperfect Sensing and Channel Estimation

no code implementations • 8 Jul 2020 • Hassan Yazdani, Azadeh Vosoughi, Xun Gong

We establish a lower bound on the achievable rates of SUtx-SUrx link, in the presence of spectrum sensing and channel estimation errors, and errors due to incorrect detection of the beam corresponding to PU's location and incorrect selection of the strongest beam for data transmission.

Three-Stream Convolutional Networks for Video-based Person Re-Identification

no code implementations • 22 Nov 2017 • Zeng Yu, Tianrui Li, Ning Yu, Xun Gong, Ke Chen, Yi Pan

This paper aims to develop a new architecture that can make full use of the feature maps of convolutional networks.

Video-Based Person Re-Identification
