Search Results for author: Xing Xu

Found 34 papers, 16 papers with code

BatchNorm-based Weakly Supervised Video Anomaly Detection

1 code implementation 26 Nov 2023 Yixuan Zhou, Yi Qu, Xing Xu, Fumin Shen, Jingkuan Song, HengTao Shen

In the proposed BN-WVAD, we leverage the Divergence of Feature from Mean vector (DFM) of BatchNorm as a reliable abnormality criterion to discern potential abnormal snippets in abnormal videos.

Anomaly Detection In Surveillance Videos Video Anomaly Detection

X-HRNet: Towards Lightweight Human Pose Estimation with Spatially Unidimensional Self-Attention

1 code implementation 12 Oct 2023 Yixuan Zhou, Xuanhan Wang, Xing Xu, Lei Zhao, Jingkuan Song

Inspired by this observation, we introduce a lightweight and powerful alternative, Spatially Unidimensional Self-Attention (SUSA), to the pointwise (1x1) convolution that is the main computational bottleneck in the depthwise separable 3x3 convolution.

Pose Estimation

MSFlow: Multi-Scale Flow-based Framework for Unsupervised Anomaly Detection

1 code implementation 29 Aug 2023 Yixuan Zhou, Xing Xu, Jingkuan Song, Fumin Shen, Heng Tao Shen

Unsupervised anomaly detection (UAD) attracts a lot of research interest and drives widespread applications, where only anomaly-free samples are available for training.

Unsupervised Anomaly Detection

MoCoSA: Momentum Contrast for Knowledge Graph Completion with Structure-Augmented Pre-trained Language Models

no code implementations 16 Aug 2023 Jiabang He, Liu Jia, Lei Wang, Xiyao Li, Xing Xu

However, they struggle with semantically rich real-world entities due to limited structural information and fail to generalize to unseen entities.

Entity Embeddings Link Prediction

ImbSAM: A Closer Look at Sharpness-Aware Minimization in Class-Imbalanced Recognition

1 code implementation ICCV 2023 Yixuan Zhou, Yi Qu, Xing Xu, HengTao Shen

To overcome this bottleneck, we leverage class priors to restrict the generalization scope of the class-agnostic SAM and propose a class-aware smoothness optimization algorithm named Imbalanced-SAM (ImbSAM).

Semi-supervised Anomaly Detection Supervised Anomaly Detection

Unifying Two-Stream Encoders with Transformers for Cross-Modal Retrieval

1 code implementation 8 Aug 2023 Yi Bin, Haoxuan Li, Yahui Xu, Xing Xu, Yang Yang, Heng Tao Shen

Specifically, on two key tasks, i.e., image-to-text and text-to-image retrieval, HAT achieves 7.6% and 16.7% relative score improvement of Recall@1 on MSCOCO, and 4.4% and 11.6% on Flickr30k, respectively.

Cross-Modal Retrieval Image Retrieval +1

Do-GOOD: Towards Distribution Shift Evaluation for Pre-Trained Visual Document Understanding Models

1 code implementation 5 Jun 2023 Jiabang He, Yi Hu, Lei Wang, Xing Xu, Ning Liu, Hui Liu, Heng Tao Shen

Results from the experiments demonstrate that there is a significant performance gap between the in-distribution (ID) and OOD settings for document images, and that fine-grained analysis of distribution shifts can reveal the brittle nature of existing pre-trained VDU models and OOD generalization algorithms.

document understanding Question Answering

AnoOnly: Semi-Supervised Anomaly Detection with the Only Loss on Anomalies

1 code implementation 30 May 2023 Yixuan Zhou, Peiyu Yang, Yi Qu, Xing Xu, Zhe Sun, Andrzej Cichocki

Unlike existing SSAD methods that resort to strict loss supervision, AnoOnly suspends it and introduces a form of weak supervision for normal data.

Semi-supervised Anomaly Detection Supervised Anomaly Detection +1

Faster Video Moment Retrieval with Point-Level Supervision

no code implementations 23 May 2023 Xun Jiang, Zailei Zhou, Xing Xu, Yang Yang, Guoqing Wang, Heng Tao Shen

Existing VMR methods suffer from two defects: (1) massive expensive temporal annotations are required to obtain satisfying performance; (2) complicated cross-modal interaction modules are deployed, which lead to high computational cost and low efficiency for the retrieval process.

Moment Retrieval Natural Language Queries +1

LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models

2 code implementations 4 Apr 2023 Zhiqiang Hu, Lei Wang, Yihuai Lan, Wanyu Xu, Ee-Peng Lim, Lidong Bing, Xing Xu, Soujanya Poria, Roy Ka-Wei Lee

The success of large language models (LLMs), like GPT-4 and ChatGPT, has led to the development of numerous cost-effective and accessible alternatives that are created by finetuning open-access LLMs with task-specific data (e.g., ChatDoctor) or instruction data (e.g., Alpaca).

Arithmetic Reasoning Language Modelling

ICL-D3IE: In-Context Learning with Diverse Demonstrations Updating for Document Information Extraction

1 code implementation ICCV 2023 Jiabang He, Lei Wang, Yi Hu, Ning Liu, Hui Liu, Xing Xu, Heng Tao Shen

To this end, we propose a simple but effective in-context learning framework called ICL-D3IE, which enables LLMs to perform DIE with different types of demonstration examples.

Document AI In-Context Learning

Semantic Enhanced Knowledge Graph for Large-Scale Zero-Shot Learning

no code implementations 26 Dec 2022 Jiwei Wei, Yang Yang, Zeyu Ma, Jingjing Li, Xing Xu, Heng Tao Shen

In this paper, we provide a new semantic enhanced knowledge graph that contains both expert knowledge and semantic correlations among categories.

Zero-Shot Learning

Alignment-Enriched Tuning for Patch-Level Pre-trained Document Image Models

1 code implementation 27 Nov 2022 Lei Wang, Jiabang He, Xing Xu, Ning Liu, Hui Liu

In this paper, we propose a new model architecture with alignment-enriched tuning (dubbed AETNet) upon pre-trained document image models, to adapt to downstream tasks with a joint task-specific supervised and alignment-aware contrastive objective.

Thunder: Thumbnail based Fast Lightweight Image Denoising Network

no code implementations 24 May 2022 Yifeng Zhou, Xing Xu, Shuaicheng Liu, Guoqing Wang, Huimin Lu, Heng Tao Shen

To achieve promising results on removing noise from real-world images, most existing denoising networks are formulated with complex network structures, making them impractical for deployment.

Image Denoising SSIM

Communication-Efficient Consensus Mechanism for Federated Reinforcement Learning

no code implementations 30 Jan 2022 Xing Xu, Rongpeng Li, Zhifeng Zhao, Honggang Zhang

The paper considers independent reinforcement learning (IRL) for multi-agent decision-making process in the paradigm of federated learning (FL).

Decision Making Federated Learning +2

Semi-Supervised Video Paragraph Grounding With Contrastive Encoder

no code implementations CVPR 2022 Xun Jiang, Xing Xu, Jingran Zhang, Fumin Shen, Zuo Cao, Heng Tao Shen

Video events grounding aims at retrieving the most relevant moments from an untrimmed video in terms of a given natural language query.

Sentence Video Grounding

I-WKNN: Fast-Speed and High-Accuracy WIFI Positioning for Intelligent Stadiums

no code implementations 3 Dec 2021 Zhangzhi Zhao, Zhengying Lou, Ruibo Wang, Qingyao Li, Xing Xu

The experimental results show that the I-WKNN algorithm has obvious advantages in positioning accuracy and positioning time in a complex noise environment and has obvious application potential in a smart stadium.

From General to Specific: Informative Scene Graph Generation via Balance Adjustment

1 code implementation ICCV 2021 Yuyu Guo, Lianli Gao, Xuanhan Wang, Yuxuan Hu, Xing Xu, Xu Lu, Heng Tao Shen, Jingkuan Song

The scene graph generation (SGG) task aims to detect visual relationship triplets, i.e., subject, predicate, object, in an image, providing a structural vision layout for scene understanding.

Blocking Graph Generation +2

Multi-Stage Aggregated Transformer Network for Temporal Language Localization in Videos

no code implementations CVPR 2021 Mingxing Zhang, Yang Yang, Xinghan Chen, Yanli Ji, Xing Xu, Jingjing Li, Heng Tao Shen

Then for a moment candidate, we concatenate the starting/middle/ending representations of its starting/middle/ending elements respectively to form the final moment representation.

Sentence

Partial Feature Selection and Alignment for Multi-Source Domain Adaptation

no code implementations CVPR 2021 Yangye Fu, Ming Zhang, Xing Xu, Zuo Cao, Chao Ma, Yanli Ji, Kai Zuo, Huimin Lu

By assuming that the source and target domains share consistent key feature representations and identical label space, existing studies on MSDA typically utilize the entire union set of features from both the source and target domains to obtain the feature map and align the map for each category and domain.

feature selection Partial Domain Adaptation

Feature Space Targeted Attacks by Statistic Alignment

1 code implementation 25 May 2021 Lianli Gao, Yaya Cheng, Qilong Zhang, Xing Xu, Jingkuan Song

However, the current choice of pixel-wise Euclidean Distance to measure the discrepancy is questionable because it unreasonably imposes a spatial-consistency constraint on the source and target features.

Translation

The Gradient Convergence Bound of Federated Multi-Agent Reinforcement Learning with Efficient Communication

no code implementations 24 Mar 2021 Xing Xu, Rongpeng Li, Zhifeng Zhao, Honggang Zhang

The paper considers independent reinforcement learning (IRL) for multi-agent collaborative decision-making in the paradigm of federated learning (FL).

Decision Making Federated Learning +3

Universal Weighting Metric Learning for Cross-Modal Matching

1 code implementation CVPR 2020 Jiwei Wei, Xing Xu, Yang Yang, Yanli Ji, Zheng Wang, Heng Tao Shen

Furthermore, we introduce a new polynomial loss under the universal weighting framework, which defines a weight function for the positive and negative informative pairs respectively.

Image-text matching Metric Learning +1

What Machines See Is Not What They Get: Fooling Scene Text Recognition Models With Adversarial Text Images

no code implementations CVPR 2020 Xing Xu, Jiefu Chen, Jinhui Xiao, Lianli Gao, Fumin Shen, Heng Tao Shen

Specifically, we propose a novel and efficient optimization-based method that can be naturally integrated into different sequential prediction schemes, i.e., connectionist temporal classification (CTC) and attention mechanism.

Adversarial Text General Classification +3

Stigmergic Independent Reinforcement Learning for Multi-Agent Collaboration

no code implementations 28 Nov 2019 Xing Xu, Rongpeng Li, Zhifeng Zhao, Honggang Zhang

With the rapid evolution of wireless mobile devices, there emerges an increased need to design effective collaboration mechanisms between intelligent agents, so as to gradually approach the final collective objective through continuously learning from the environment based on their individual observations.

reinforcement-learning Reinforcement Learning (RL)

Temporal Reasoning Graph for Activity Recognition

no code implementations 27 Aug 2019 Jingran Zhang, Fumin Shen, Xing Xu, Heng Tao Shen

In this paper, we propose an efficient temporal reasoning graph (TRG) to simultaneously capture the appearance features and temporal relation between video sequences at multiple time scales.

Action Recognition Relation +1

Cooperative Cross-Stream Network for Discriminative Action Representation

no code implementations 27 Aug 2019 Jingran Zhang, Fumin Shen, Xing Xu, Heng Tao Shen

It extracts this complementary information of different modalities from a connection block, which aims at exploring correlations of different stream features.

Ranked #15 on Action Recognition on HMDB-51 (using extra training data)

Action Recognition Temporal Action Localization

Template-based math word problem solvers with recursive neural networks

1 code implementation AAAI 2019 Lei Wang, Dongxiang Zhang, Jipeng Zhang, Xing Xu, Lianli Gao, Bing Tian Dai, Heng Tao Shen

Then, we design a recursive neural network to encode the quantity with Bi-LSTM and self attention, and infer the unknown operator nodes in a bottom-up manner.

Math

Internet of Intelligence: The Collective Advantage for Advancing Communications and Intelligence

no code implementations 26 Apr 2019 Rongpeng Li, Zhifeng Zhao, Xing Xu, Fei Ni, Honggang Zhang

Afterwards, we highlight the potential huge impact of CI on both communications and intelligence.

Matrix Tri-Factorization With Manifold Regularizations for Zero-Shot Learning

no code implementations CVPR 2017 Xing Xu, Fumin Shen, Yang Yang, Dongxiang Zhang, Heng Tao Shen, Jingkuan Song

By additionally introducing manifold regularizations on visual data and semantic embeddings, the learned projection can effectively capture the geometrical manifold structure residing in both visual and semantic spaces.

Retrieval Transfer Learning +1

Deep Region Hashing for Efficient Large-scale Instance Search from Images

no code implementations 26 Jan 2017 Jingkuan Song, Tao He, Lianli Gao, Xing Xu, Heng Tao Shen

Specifically, DRH is an end-to-end deep neural network which consists of object proposal, feature extraction, and hash code generation.

Code Generation Image Retrieval +3
