Search Results for author: Xing Xu

Found 34 papers, 16 papers with code

BatchNorm-based Weakly Supervised Video Anomaly Detection

1 code implementation • 26 Nov 2023 • Yixuan Zhou, Yi Qu, Xing Xu, Fumin Shen, Jingkuan Song, HengTao Shen

In the proposed BN-WVAD, we leverage the Divergence of Feature from Mean vector (DFM) of BatchNorm as a reliable abnormality criterion to discern potential abnormal snippets in abnormal videos.

Ranked #1 on Anomaly Detection In Surveillance Videos on UCF-Crime

Anomaly Detection In Surveillance Videos Video Anomaly Detection

X-HRNet: Towards Lightweight Human Pose Estimation with Spatially Unidimensional Self-Attention

1 code implementation • 12 Oct 2023 • Yixuan Zhou, Xuanhan Wang, Xing Xu, Lei Zhao, Jingkuan Song

Inspired by this observation, we introduce a lightweight and powerful alternative, Spatially Unidimensional Self-Attention (SUSA), to the pointwise (1x1) convolution that is the main computational bottleneck in the depthwise separable 3x3 convolution.

Pose Estimation

MSFlow: Multi-Scale Flow-based Framework for Unsupervised Anomaly Detection

1 code implementation • 29 Aug 2023 • Yixuan Zhou, Xing Xu, Jingkuan Song, Fumin Shen, Heng Tao Shen

Unsupervised anomaly detection (UAD) attracts a lot of research interest and drives widespread applications, where only anomaly-free samples are available for training.

Ranked #5 on Anomaly Detection on MVTec AD

Unsupervised Anomaly Detection

MoCoSA: Momentum Contrast for Knowledge Graph Completion with Structure-Augmented Pre-trained Language Models

no code implementations • 16 Aug 2023 • Jiabang He, Liu Jia, Lei Wang, Xiyao Li, Xing Xu

However, they struggle with semantically rich real-world entities due to limited structural information and fail to generalize to unseen entities.

Ranked #1 on Link Prediction on WN18RR

Entity Embeddings Link Prediction

ImbSAM: A Closer Look at Sharpness-Aware Minimization in Class-Imbalanced Recognition

1 code implementation • ICCV 2023 • Yixuan Zhou, Yi Qu, Xing Xu, HengTao Shen

To overcome this bottleneck, we leverage class priors to restrict the generalization scope of the class-agnostic SAM and propose a class-aware smoothness optimization algorithm named Imbalanced-SAM (ImbSAM).

Semi-supervised Anomaly Detection Supervised Anomaly Detection

Unifying Two-Stream Encoders with Transformers for Cross-Modal Retrieval

1 code implementation • 8 Aug 2023 • Yi Bin, Haoxuan Li, Yahui Xu, Xing Xu, Yang Yang, Heng Tao Shen

Specifically, on two key tasks, i.e., image-to-text and text-to-image retrieval, HAT achieves 7.6% and 16.7% relative score improvement of Recall@1 on MSCOCO, and 4.4% and 11.6% on Flickr30k respectively.

Cross-Modal Retrieval Image Retrieval +1

Do-GOOD: Towards Distribution Shift Evaluation for Pre-Trained Visual Document Understanding Models

1 code implementation • 5 Jun 2023 • Jiabang He, Yi Hu, Lei Wang, Xing Xu, Ning Liu, Hui Liu, Heng Tao Shen

Results from the experiments demonstrate that there is a significant performance gap between the in-distribution (ID) and OOD settings for document images, and that fine-grained analysis of distribution shifts can reveal the brittle nature of existing pre-trained VDU models and OOD generalization algorithms.

document understanding Question Answering

AnoOnly: Semi-Supervised Anomaly Detection with the Only Loss on Anomalies

1 code implementation • 30 May 2023 • Yixuan Zhou, Peiyu Yang, Yi Qu, Xing Xu, Zhe Sun, Andrzej Cichocki

Unlike existing SSAD methods that resort to strict loss supervision, AnoOnly suspends it and introduces a form of weak supervision for normal data.

Semi-supervised Anomaly Detection Supervised Anomaly Detection +1

Faster Video Moment Retrieval with Point-Level Supervision

no code implementations • 23 May 2023 • Xun Jiang, Zailei Zhou, Xing Xu, Yang Yang, Guoqing Wang, Heng Tao Shen

Existing VMR methods suffer from two defects: (1) massive expensive temporal annotations are required to obtain satisfying performance; (2) complicated cross-modal interaction modules are deployed, which lead to high computational cost and low efficiency for the retrieval process.

Moment Retrieval Natural Language Queries +1

T-SciQ: Teaching Multimodal Chain-of-Thought Reasoning via Mixed Large Language Model Signals for Science Question Answering

1 code implementation • 5 May 2023 • Lei Wang, Yi Hu, Jiabang He, Xing Xu, Ning Liu, Hui Liu, Heng Tao Shen

To address these issues, we propose a novel method termed T-SciQ that aims at teaching science question answering with LLM signals.

Language Modelling Large Language Model +1

LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models

2 code implementations • 4 Apr 2023 • Zhiqiang Hu, Lei Wang, Yihuai Lan, Wanyu Xu, Ee-Peng Lim, Lidong Bing, Xing Xu, Soujanya Poria, Roy Ka-Wei Lee

The success of large language models (LLMs), like GPT-4 and ChatGPT, has led to the development of numerous cost-effective and accessible alternatives that are created by finetuning open-access LLMs with task-specific data (e.g., ChatDoctor) or instruction data (e.g., Alpaca).

Arithmetic Reasoning Language Modelling

ICL-D3IE: In-Context Learning with Diverse Demonstrations Updating for Document Information Extraction

1 code implementation • ICCV 2023 • Jiabang He, Lei Wang, Yi Hu, Ning Liu, Hui Liu, Xing Xu, Heng Tao Shen

To this end, we propose a simple but effective in-context learning framework called ICL-D3IE, which enables LLMs to perform DIE with different types of demonstration examples.

Document AI In-Context Learning

Semantic Enhanced Knowledge Graph for Large-Scale Zero-Shot Learning

no code implementations • 26 Dec 2022 • Jiwei Wei, Yang Yang, Zeyu Ma, Jingjing Li, Xing Xu, Heng Tao Shen

In this paper, we provide a new semantic enhanced knowledge graph that contains both expert knowledge and the semantic correlation between categories.

Zero-Shot Learning

Alignment-Enriched Tuning for Patch-Level Pre-trained Document Image Models

1 code implementation • 27 Nov 2022 • Lei Wang, Jiabang He, Xing Xu, Ning Liu, Hui Liu

In this paper, we propose a new model architecture with alignment-enriched tuning (dubbed AETNet) upon pre-trained document image models, to adapt to downstream tasks with the joint task-specific supervised and alignment-aware contrastive objective.

Thunder: Thumbnail based Fast Lightweight Image Denoising Network

no code implementations • 24 May 2022 • Yifeng Zhou, Xing Xu, Shuaicheng Liu, Guoqing Wang, Huimin Lu, Heng Tao Shen

To achieve promising results on removing noise from real-world images, most existing denoising networks are formulated with complex network structures, making them impractical for deployment.

Image Denoising SSIM

Communication-Efficient Consensus Mechanism for Federated Reinforcement Learning

no code implementations • 30 Jan 2022 • Xing Xu, Rongpeng Li, Zhifeng Zhao, Honggang Zhang

The paper considers independent reinforcement learning (IRL) for multi-agent decision-making process in the paradigm of federated learning (FL).

Decision Making Federated Learning +2

Semi-Supervised Video Paragraph Grounding With Contrastive Encoder

no code implementations • CVPR 2022 • Xun Jiang, Xing Xu, Jingran Zhang, Fumin Shen, Zuo Cao, Heng Tao Shen

Video events grounding aims at retrieving the most relevant moments from an untrimmed video in terms of a given natural language query.

Sentence Video Grounding

I-WKNN: Fast-Speed and High-Accuracy WIFI Positioning for Intelligent Stadiums

no code implementations • 3 Dec 2021 • Zhangzhi Zhao, Zhengying Lou, Ruibo Wang, Qingyao Li, Xing Xu

The experimental results show that the I-WKNN algorithm has obvious advantages in positioning accuracy and positioning time in a complex noise environment and has obvious application potential in a smart stadium.

From General to Specific: Informative Scene Graph Generation via Balance Adjustment

1 code implementation • ICCV 2021 • Yuyu Guo, Lianli Gao, Xuanhan Wang, Yuxuan Hu, Xing Xu, Xu Lu, Heng Tao Shen, Jingkuan Song

The scene graph generation (SGG) task aims to detect visual relationship triplets, i.e., subject, predicate, object, in an image, providing a structural vision layout for scene understanding.

Blocking Graph Generation +2

Multi-Stage Aggregated Transformer Network for Temporal Language Localization in Videos

no code implementations • CVPR 2021 • Mingxing Zhang, Yang Yang, Xinghan Chen, Yanli Ji, Xing Xu, Jingjing Li, Heng Tao Shen

Then for a moment candidate, we concatenate the starting/middle/ending representations of its starting/middle/ending elements respectively to form the final moment representation.

Sentence

Partial Feature Selection and Alignment for Multi-Source Domain Adaptation

no code implementations • CVPR 2021 • Yangye Fu, Ming Zhang, Xing Xu, Zuo Cao, Chao Ma, Yanli Ji, Kai Zuo, Huimin Lu

By assuming that the source and target domains share consistent key feature representations and identical label space, existing studies on MSDA typically utilize the entire union set of features from both the source and target domains to obtain the feature map and align the map for each category and domain.

feature selection Partial Domain Adaptation

Feature Space Targeted Attacks by Statistic Alignment

1 code implementation • 25 May 2021 • Lianli Gao, Yaya Cheng, Qilong Zhang, Xing Xu, Jingkuan Song

However, the current choice of pixel-wise Euclidean Distance to measure the discrepancy is questionable because it unreasonably imposes a spatial-consistency constraint on the source and target features.

Translation

The Gradient Convergence Bound of Federated Multi-Agent Reinforcement Learning with Efficient Communication

no code implementations • 24 Mar 2021 • Xing Xu, Rongpeng Li, Zhifeng Zhao, Honggang Zhang

The paper considers independent reinforcement learning (IRL) for multi-agent collaborative decision-making in the paradigm of federated learning (FL).

Decision Making Federated Learning +3

Universal Weighting Metric Learning for Cross-Modal Matching

1 code implementation • CVPR 2020 • Jiwei Wei, Xing Xu, Yang Yang, Yanli Ji, Zheng Wang, Heng Tao Shen

Furthermore, we introduce a new polynomial loss under the universal weighting framework, which defines a weight function for the positive and negative informative pairs respectively.

Image-text matching Metric Learning +1

What Machines See Is Not What They Get: Fooling Scene Text Recognition Models With Adversarial Text Images

no code implementations • CVPR 2020 • Xing Xu, Jiefu Chen, Jinhui Xiao, Lianli Gao, Fumin Shen, Heng Tao Shen

Specifically, we propose a novel and efficient optimization-based method that can be naturally integrated into different sequential prediction schemes, i.e., connectionist temporal classification (CTC) and attention mechanism.

Adversarial Text General Classification +3

Stigmergic Independent Reinforcement Learning for Multi-Agent Collaboration

no code implementations • 28 Nov 2019 • Xing Xu, Rongpeng Li, Zhifeng Zhao, Honggang Zhang

With the rapid evolution of wireless mobile devices, there emerges an increased need to design effective collaboration mechanisms between intelligent agents, so as to gradually approach the final collective objective through continuously learning from the environment based on their individual observations.

reinforcement-learning Reinforcement Learning (RL)

Temporal Reasoning Graph for Activity Recognition

no code implementations • 27 Aug 2019 • Jingran Zhang, Fumin Shen, Xing Xu, Heng Tao Shen

In this paper, we propose an efficient temporal reasoning graph (TRG) to simultaneously capture the appearance features and temporal relation between video sequences at multiple time scales.

Ranked #53 on Action Recognition on Something-Something V1

Action Recognition Relation +1

Cooperative Cross-Stream Network for Discriminative Action Representation

no code implementations • 27 Aug 2019 • Jingran Zhang, Fumin Shen, Xing Xu, Heng Tao Shen

It extracts this complementary information of different modalities from a connection block, which aims at exploring correlations of different stream features.

Ranked #15 on Action Recognition on HMDB-51 (using extra training data)

Action Recognition Temporal Action Localization

Matching Images and Text with Multi-modal Tensor Fusion and Re-ranking

2 code implementations • 12 Aug 2019 • Tan Wang, Xing Xu, Yang Yang, Alan Hanjalic, Heng Tao Shen, Jingkuan Song

We propose a novel framework that achieves remarkable matching performance with acceptable model complexity.

Binary Classification General Classification +4

Template-based math word problem solvers with recursive neural networks

1 code implementation • AAAI 2019 • Lei Wang, Dongxiang Zhang, Jipeng Zhang, Xing Xu, Lianli Gao, Bing Tian Dai, Heng Tao Shen

Then, we design a recursive neural network to encode the quantity with Bi-LSTM and self attention, and infer the unknown operator nodes in a bottom-up manner.

Math

Internet of Intelligence: The Collective Advantage for Advancing Communications and Intelligence

no code implementations • 26 Apr 2019 • Rongpeng Li, Zhifeng Zhao, Xing Xu, Fei Ni, Honggang Zhang

Afterwards, we highlight the potential huge impact of CI on both communications and intelligence.

Matrix Tri-Factorization With Manifold Regularizations for Zero-Shot Learning

no code implementations • CVPR 2017 • Xing Xu, Fumin Shen, Yang Yang, Dongxiang Zhang, Heng Tao Shen, Jingkuan Song

By additionally introducing manifold regularizations on visual data and semantic embeddings, the learned projection can effectively capture the geometrical manifold structure residing in both visual and semantic spaces.

Retrieval Transfer Learning +1

Deep Region Hashing for Efficient Large-scale Instance Search from Images

no code implementations • 26 Jan 2017 • Jingkuan Song, Tao He, Lianli Gao, Xing Xu, Heng Tao Shen

Specifically, DRH is an end-to-end deep neural network which consists of object proposal, feature extraction, and hash code generation.

Code Generation Image Retrieval +3

Bidirectional Long-Short Term Memory for Video Description

no code implementations • 15 Jun 2016 • Yi Bin, Yang Yang, Zi Huang, Fumin Shen, Xing Xu, Heng Tao Shen

Video captioning has been attracting broad research attention in the multimedia community.

Language Modelling Video Captioning +1
