1 code implementation • 23 Nov 2024 • Jiawei E, Yinglong Zhang, Xuewen Xia, Xing Xu
Additionally, to assess the effectiveness of our proposed model, we have applied it to citation sentiment prediction, a novel task previously unexplored in the GNN field.
1 code implementation • 9 Oct 2024 • Ang He, Ximei Wu, Xing Xu, Jing Chen, Xiaobin Guo, Sheng Xu
The model's backbone includes an Adaptive Lightweight Channel Splitting and Shuffling (ALSS) module to improve information exchange between channels and optimize feature extraction, aiding accurate crop identification.
1 code implementation • 2 Sep 2024 • Yixuan Zhou, Xing Xu, Zhe Sun, Jingkuan Song, Andrzej Cichocki, Heng Tao Shen
Through the integration of vector quantization (VQ), we empower the flow models to distinguish different concepts of multi-class normal data in an unsupervised manner, resulting in a novel flow-based unified method, named VQ-Flow.
1 code implementation • SIGIR 2024 • Quanxing Zha, Xin Liu, Yiu-ming Cheung, Xing Xu, Nannan Wang, Jianjia Cao
Cross-modal matching has recently gained significant popularity to facilitate retrieval across multi-modal data, and existing works are highly relied on an implicit assumption that the training data pairs are perfectly aligned.
Cross-modal retrieval with noisy correspondence
Image-text matching
+1
no code implementations • 23 Jun 2024 • Ni Wang, Dongliang Liao, Xing Xu
We extract the inter-frame difference feature and integrate the difference and frame feature by the multi-scale temporal transformer.
no code implementations • CVPR 2024 • Zixian Gao, Xun Jiang, Xing Xu, Fumin Shen, Yujie Li, Heng Tao Shen
However in this paper we show that the potential noise in unimodal data could be well quantified and further employed to enhance more stable unimodal embeddings via contrastive learning.
1 code implementation • 26 Nov 2023 • Yixuan Zhou, Yi Qu, Xing Xu, Fumin Shen, Jingkuan Song, HengTao Shen
In the proposed BN-WVAD, we leverage the Divergence of Feature from Mean vector (DFM) of BatchNorm as a reliable abnormality criterion to discern potential abnormal snippets in abnormal videos.
Anomaly Detection In Surveillance Videos
Video Anomaly Detection
1 code implementation • Conference 2023 • Zhiguo Chen, Xun Jiang, Xing Xu, Zuo Cao, Yijun Mo, and Heng Tao Shen
Based on the glance-to-gaze design, our JSG method learns two separate joint embedding spaces for moments and text queries using a hybrid synergistic contrastive learning strategy.
1 code implementation • 12 Oct 2023 • Yixuan Zhou, Xuanhan Wang, Xing Xu, Lei Zhao, Jingkuan Song
Inspired by this observation, we introduce a lightweight and powerful alternative, Spatially Unidimensional Self-Attention (SUSA), to the pointwise (1x1) convolution that is the main computational bottleneck in the depthwise separable 3c3 convolution.
1 code implementation • 29 Aug 2023 • Yixuan Zhou, Xing Xu, Jingkuan Song, Fumin Shen, Heng Tao Shen
Unsupervised anomaly detection (UAD) attracts a lot of research interest and drives widespread applications, where only anomaly-free samples are available for training.
Ranked #9 on
Anomaly Detection
on MVTec AD
no code implementations • 16 Aug 2023 • Jiabang He, Liu Jia, Lei Wang, Xiyao Li, Xing Xu
However, they struggle with semantically rich real-world entities due to limited structural information and fail to generalize to unseen entities.
Ranked #1 on
Link Prediction
on OpenBG500
2 code implementations • ICCV 2023 • Yixuan Zhou, Yi Qu, Xing Xu, HengTao Shen
To overcome this bottleneck, we leverage class priors to restrict the generalization scope of the class-agnostic SAM and propose a class-aware smoothness optimization algorithm named Imbalanced-SAM (ImbSAM).
Semi-supervised Anomaly Detection
Supervised Anomaly Detection
1 code implementation • 8 Aug 2023 • Yi Bin, Haoxuan Li, Yahui Xu, Xing Xu, Yang Yang, Heng Tao Shen
Specifically, on two key tasks, \textit{i. e.}, image-to-text and text-to-image retrieval, HAT achieves 7. 6\% and 16. 7\% relative score improvement of Recall@1 on MSCOCO, and 4. 4\% and 11. 6\% on Flickr30k respectively.
1 code implementation • 5 Jun 2023 • Jiabang He, Yi Hu, Lei Wang, Xing Xu, Ning Liu, Hui Liu, Heng Tao Shen
Results from the experiments demonstrate that there is a significant performance gap between the in-distribution (ID) and OOD settings for document images, and that fine-grained analysis of distribution shifts can reveal the brittle nature of existing pre-trained VDU models and OOD generalization algorithms.
1 code implementation • 30 May 2023 • Yixuan Zhou, Peiyu Yang, Yi Qu, Xing Xu, Zhe Sun, Andrzej Cichocki
Unlike existing SSAD methods that resort to strict loss supervision, AnoOnly suspends it and introduces a form of weak supervision for normal data.
Semi-supervised Anomaly Detection
Supervised Anomaly Detection
+1
no code implementations • 23 May 2023 • Xun Jiang, Zailei Zhou, Xing Xu, Yang Yang, Guoqing Wang, Heng Tao Shen
Existing VMR methods suffer from two defects: (1) massive expensive temporal annotations are required to obtain satisfying performance; (2) complicated cross-modal interaction modules are deployed, which lead to high computational cost and low efficiency for the retrieval process.
1 code implementation • 5 May 2023 • Lei Wang, Yi Hu, Jiabang He, Xing Xu, Ning Liu, Hui Liu, Heng Tao Shen
To address these issues, we propose a novel method termed T-SciQ that aims at teaching science question answering with LLM signals.
2 code implementations • 4 Apr 2023 • Zhiqiang Hu, Lei Wang, Yihuai Lan, Wanyu Xu, Ee-Peng Lim, Lidong Bing, Xing Xu, Soujanya Poria, Roy Ka-Wei Lee
The success of large language models (LLMs), like GPT-4 and ChatGPT, has led to the development of numerous cost-effective and accessible alternatives that are created by finetuning open-access LLMs with task-specific data (e. g., ChatDoctor) or instruction data (e. g., Alpaca).
1 code implementation • ICCV 2023 • Jiabang He, Lei Wang, Yi Hu, Ning Liu, Hui Liu, Xing Xu, Heng Tao Shen
To this end, we propose a simple but effective in-context learning framework called ICL-D3IE, which enables LLMs to perform DIE with different types of demonstration examples.
no code implementations • 26 Dec 2022 • Jiwei Wei, Yang Yang, Zeyu Ma, Jingjing Li, Xing Xu, Heng Tao Shen
In this paper, we provide a new semantic enhanced knowledge graph that contains both expert knowledge and categories semantic correlation.
1 code implementation • 27 Nov 2022 • Lei Wang, Jiabang He, Xing Xu, Ning Liu, Hui Liu
In this paper, we propose a new model architecture with alignment-enriched tuning (dubbed AETNet) upon pre-trained document image models, to adapt downstream tasks with the joint task-specific supervised and alignment-aware contrastive objective.
no code implementations • 24 May 2022 • Yifeng Zhou, Xing Xu, Shuaicheng Liu, Guoqing Wang, Huimin Lu, Heng Tao Shen
To achieve promising results on removing noise from real-world images, most of existing denoising networks are formulated with complex network structure, making them impractical for deployment.
no code implementations • 30 Jan 2022 • Xing Xu, Rongpeng Li, Zhifeng Zhao, Honggang Zhang
The paper considers independent reinforcement learning (IRL) for multi-agent decision-making process in the paradigm of federated learning (FL).
no code implementations • CVPR 2022 • Xun Jiang, Xing Xu, Jingran Zhang, Fumin Shen, Zuo Cao, Heng Tao Shen
Video events grounding aims at retrieving the most relevant moments from an untrimmed video in terms of a given natural language query.
no code implementations • 3 Dec 2021 • Zhangzhi Zhao, Zhengying Lou, Ruibo Wang, Qingyao Li, Xing Xu
The experimental results show that the I-WKNN algorithm has obvious advantages in positioning accuracy and positioning time in a complex noise environment and has obvious application potential in a smart stadium.
1 code implementation • ICCV 2021 • Yuyu Guo, Lianli Gao, Xuanhan Wang, Yuxuan Hu, Xing Xu, Xu Lu, Heng Tao Shen, Jingkuan Song
The scene graph generation (SGG) task aims to detect visual relationship triplets, i. e., subject, predicate, object, in an image, providing a structural vision layout for scene understanding.
no code implementations • CVPR 2021 • Mingxing Zhang, Yang Yang, Xinghan Chen, Yanli Ji, Xing Xu, Jingjing Li, Heng Tao Shen
Then for a moment candidate, we concatenate the starting/middle/ending representations of its starting/middle/ending elements respectively to form the final moment representation.
no code implementations • CVPR 2021 • Yangye Fu, Ming Zhang, Xing Xu, Zuo Cao, Chao Ma, Yanli Ji, Kai Zuo, Huimin Lu
By assuming that the source and target domains share consistent key feature representations and identical label space, existing studies on MSDA typically utilize the entire union set of features from both the source and target domains to obtain the feature map and align the map for each category and domain.
1 code implementation • 25 May 2021 • Lianli Gao, Yaya Cheng, Qilong Zhang, Xing Xu, Jingkuan Song
However, the current choice of pixel-wise Euclidean Distance to measure the discrepancy is questionable because it unreasonably imposes a spatial-consistency constraint on the source and target features.
no code implementations • 24 Mar 2021 • Xing Xu, Rongpeng Li, Zhifeng Zhao, Honggang Zhang
The paper considers independent reinforcement learning (IRL) for multi-agent collaborative decision-making in the paradigm of federated learning (FL).
1 code implementation • CVPR 2020 • Jiwei Wei, Xing Xu, Yang Yang, Yanli Ji, Zheng Wang, Heng Tao Shen
Furthermore, we introduce a new polynomial loss under the universal weighting framework, which defines a weight function for the positive and negative informative pairs respectively.
no code implementations • CVPR 2020 • Xing Xu, Jiefu Chen, Jinhui Xiao, Lianli Gao, Fumin Shen, Heng Tao Shen
Specifically, we propose a novel and efficient optimization-based method that can be naturally integrated to different sequential prediction schemes, i. e., connectionist temporal classification (CTC) and attention mechanism.
no code implementations • 28 Nov 2019 • Xing Xu, Rongpeng Li, Zhifeng Zhao, Honggang Zhang
With the rapid evolution of wireless mobile devices, there emerges an increased need to design effective collaboration mechanisms between intelligent agents, so as to gradually approach the final collective objective through continuously learning from the environment based on their individual observations.
no code implementations • 27 Aug 2019 • Jingran Zhang, Fumin Shen, Xing Xu, Heng Tao Shen
It extracts this complementary information of different modality from a connection block, which aims at exploring correlations of different stream features.
Ranked #15 on
Action Recognition
on HMDB-51
(using extra training data)
no code implementations • 27 Aug 2019 • Jingran Zhang, Fumin Shen, Xing Xu, Heng Tao Shen
In this paper, we propose an efficient temporal reasoning graph (TRG) to simultaneously capture the appearance features and temporal relation between video sequences at multiple time scales.
Ranked #54 on
Action Recognition
on Something-Something V1
2 code implementations • 12 Aug 2019 • Tan Wang, Xing Xu, Yang Yang, Alan Hanjalic, Heng Tao Shen, Jingkuan Song
We propose a novel framework that achieves remarkable matching performance with acceptable model complexity.
1 code implementation • AAAI 2019 • Lei Wang, Dongxiang Zhang, Jipeng Zhang, Xing Xu, Lianli Gao, Bing Tian Dai, Heng Tao Shen
Then, we design a recursive neural network to encode the quantity with Bi-LSTM and self attention, and infer the unknown operator nodes in a bottom-up manner.
no code implementations • 26 Apr 2019 • Rongpeng Li, Zhifeng Zhao, Xing Xu, Fei Ni, Honggang Zhang
Afterwards, we highlight the potential huge impact of CI on both communications and intelligence.
no code implementations • CVPR 2017 • Xing Xu, Fumin Shen, Yang Yang, Dongxiang Zhang, Heng Tao Shen, Jingkuan Song
By additionally introducing manifold regularizations on visual data and semantic embeddings, the learned projection can effectively captures the geometrical manifold structure residing in both visual and semantic spaces.
no code implementations • 26 Jan 2017 • Jingkuan Song, Tao He, Lianli Gao, Xing Xu, Heng Tao Shen
Specifically, DRH is an end-to-end deep neural network which consists of object proposal, feature extraction, and hash code generation.
no code implementations • 15 Jun 2016 • Yi Bin, Yang Yang, Zi Huang, Fumin Shen, Xing Xu, Heng Tao Shen
Video captioning has been attracting broad research attention in multimedia community.