Search Results for author: Xirong Li

Found 50 papers, 32 papers with code

PhD: A Prompted Visual Hallucination Evaluation Dataset

1 code implementation 17 Mar 2024 Jiazhen Liu, Yuhan Fu, Ruobing Xie, Runquan Xie, Xingwu Sun, Fengzong Lian, Zhanhui Kang, Xirong Li

The rapid growth of Large Language Models (LLMs) has driven the development of Large Vision-Language Models (LVLMs).

Attribute Common Sense Reasoning +2

Holistic Features are almost Sufficient for Text-to-Video Retrieval

1 code implementation CVPR 2024 Kaibin Tian, Ruixiang Zhao, Zijie Xin, Bangxiang Lan, Xirong Li

For text-to-video retrieval (T2VR), which aims to retrieve unlabeled videos by ad-hoc textual queries, CLIP-based methods currently lead the way.

Retrieval text similarity +2

Adaptive Fusion of Radiomics and Deep Features for Lung Adenocarcinoma Subtype Recognition

no code implementations 27 Aug 2023 Jing Zhou, Xiaotong Fu, Xirong Li, Wei Feng, Zhang Zhang, Ying Ji

The most common type of lung cancer, lung adenocarcinoma (LUAD), has been increasingly detected since the advent of low-dose computed tomography screening technology.

TeachCLIP: Multi-Grained Teaching for Efficient Text-to-Video Retrieval

no code implementations 2 Aug 2023 Kaibin Tian, Ruixiang Zhao, Hu Hu, Runquan Xie, Fengzong Lian, Zhanhui Kang, Xirong Li

For efficient T2VR, we propose TeachCLIP with multi-grained teaching to let a CLIP4Clip-based student network learn from more advanced yet computationally heavy models such as X-CLIP, TS2-Net and X-Pool.

Retrieval text similarity +2

Geometrized Transformer for Self-Supervised Homography Estimation

1 code implementation ICCV 2023 Jiazhen Liu, Xirong Li

Such a homography allows us to compute cross-attention in a focused manner, where key/value sets required by Transformers can be reduced to small fixed-size regions rather than an entire image.

Homography Estimation

SAFL-Net: Semantic-Agnostic Feature Learning Network with Auxiliary Plugins for Image Manipulation Detection

no code implementations ICCV 2023 Zhihao Sun, Haoran Jiang, Danding Wang, Xirong Li, Juan Cao

Since image editing methods in real-world scenarios cannot be exhaustively enumerated, generalization is a core challenge for image manipulation detection, which could be severely weakened by semantically related features.

Image Manipulation Image Manipulation Detection

Partially Relevant Video Retrieval

1 code implementation 26 Aug 2022 Jianfeng Dong, Xianke Chen, Minsong Zhang, Xun Yang, ShuJie Chen, Xirong Li, Xun Wang

To fill the gap, we propose in this paper a novel T2VR subtask termed Partially Relevant Video Retrieval (PRVR).

Moment Retrieval Multiple Instance Learning +5

Semi-Supervised Keypoint Detector and Descriptor for Retinal Image Matching

1 code implementation 16 Jul 2022 Jiazhen Liu, Xirong Li, Qijie Wei, Jie Xu, Dayong Ding

To attack the incompleteness of manual labeling, we propose Progressive Keypoint Expansion to enrich the keypoint labels at each training epoch.

Image Registration

Learn to Understand Negation in Video Retrieval

1 code implementation 30 Apr 2022 Ziyue Wang, Aozhu Chen, Fan Hu, Xirong Li

We propose a learning-based method for training a negation-aware video retrieval model.

Natural Language Queries Negation +3

Co-Teaching for Unsupervised Domain Adaptation and Expansion

1 code implementation 4 Apr 2022 Kaibin Tian, Qijie Wei, Xirong Li

Such samples are typically a minority in their host domain, so they tend to be overlooked by the domain-specific model, but can be better handled by a model from the other domain.

Image Classification Knowledge Distillation +3

DRAG: Dynamic Region-Aware GCN for Privacy-Leaking Image Detection

1 code implementation 17 Mar 2022 Guang Yang, Juan Cao, Qiang Sheng, Peng Qi, Xirong Li, Jintao Li

However, these methods have two limitations: 1) they neglect other important elements like scenes, textures, and objects beyond the capacity of pretrained object detectors; 2) the correlation among objects is fixed, but a fixed correlation is not appropriate for all the images.

Deepfake Network Architecture Attribution

1 code implementation 28 Feb 2022 Tianyun Yang, Ziyao Huang, Juan Cao, Lei LI, Xirong Li

With the rapid progress of generation technology, it has become necessary to attribute the origin of fake images.

Attribute DeepFake Detection +2

Article Reranking by Memory-Enhanced Key Sentence Matching for Detecting Previously Fact-Checked Claims

1 code implementation ACL 2021 Qiang Sheng, Juan Cao, Xueyao Zhang, Xirong Li, Lei Zhong

By fusing event and pattern information, we select key sentences to represent an article and then predict if the article fact-checks the given claim using the claim, key sentences, and patterns.

Fact Checking Sentence

Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval

1 code implementation 3 Dec 2021 Fan Hu, Aozhu Chen, Ziyue Wang, Fangming Zhou, Jianfeng Dong, Xirong Li

In this paper we revisit feature fusion, an old-fashioned topic, in the new context of text-to-video retrieval.

Ranked #1 on Ad-hoc video search on TRECVID-AVS20 (V3C1) (using extra training data)

Ad-hoc video search feature selection +3

Multi-Modal Multi-Instance Learning for Retinal Disease Recognition

no code implementations 25 Sep 2021 Xirong Li, Yang Zhou, Jie Wang, Hailan Lin, Jianchun Zhao, Dayong Ding, Weihong Yu, Youxin Chen

We propose in this paper Multi-Modal Multi-Instance Learning (MM-MIL) for selectively fusing CFP and OCT modalities.

3D Object Detection for Autonomous Driving: A Survey

1 code implementation 21 Jun 2021 Rui Qian, Xin Lai, Xirong Li

Autonomous driving is regarded as one of the most promising remedies to shield human beings from severe crashes.

3D Object Detection Attribute +5

Learning to Disentangle GAN Fingerprint for Fake Image Attribution

no code implementations 16 Jun 2021 Tianyun Yang, Juan Cao, Qiang Sheng, Lei LI, Jiaqi Ji, Xirong Li, Sheng Tang

Adopting a multi-task framework, we propose a GAN Fingerprint Disentangling Network (GFD-Net) to simultaneously disentangle the fingerprint from GAN-generated images and produce a content-irrelevant representation for fake image attribution.

Fake Image Attribution Open-Ended Question Answering

BADet: Boundary-Aware 3D Object Detection from Point Clouds

1 code implementation 21 Apr 2021 Rui Qian, Xin Lai, Xirong Li

Specifically, instead of refining each proposal independently as previous works do, we represent each proposal as a node for graph construction within a given cut-off threshold, associating proposals in the form of a local neighborhood graph, with boundary correlations of an object being explicitly exploited.

3D Object Detection graph construction +3

Image Manipulation Detection by Multi-View Multi-Scale Supervision

2 code implementations ICCV 2021 Xinru Chen, Chengbo Dong, Jiaqi Ji, Juan Cao, Xirong Li

The key challenge of image manipulation detection is how to learn generalizable features that are sensitive to manipulations in novel data, while being specific enough to prevent false alarms on authentic images.

Image Manipulation Image Manipulation Detection +3

Unsupervised Domain Expansion for Visual Categorization

2 code implementations 1 Apr 2021 Jie Wang, Kaibin Tian, Dayong Ding, Gang Yang, Xirong Li

In this paper we extend UDA by proposing a new task called unsupervised domain expansion (UDE), which aims to adapt a deep model for the target domain with its unlabeled data, meanwhile maintaining the model's performance on the source domain.

Knowledge Distillation Unsupervised Domain Adaptation +1

Towards Annotation-Free Evaluation of Cross-Lingual Image Captioning

no code implementations 9 Dec 2020 Aozhu Chen, Xinyi Huang, Hailan Lin, Xirong Li

For the first scenario with the references available, we propose two metrics, i.e., WMDRel and CLinRel.

Image Captioning Machine Translation +1

Learning Two-Stream CNN for Multi-Modal Age-related Macular Degeneration Categorization

1 code implementation 3 Dec 2020 Weisen Wang, Xirong Li, Zhiyan Xu, Weihong Yu, Jianchun Zhao, Dayong Ding, Youxin Chen

Our MM-CNN is instantiated by a two-stream CNN, with spatially-invariant fusion to combine information from the CFP and OCT streams.

Data Augmentation Image-to-Image Translation

SEA: Sentence Encoder Assembly for Video Retrieval by Textual Queries

1 code implementation 24 Nov 2020 Xirong Li, Fangming Zhou, Chaoxi Xu, Jiaqi Ji, Gang Yang

Inspired by the initial success of a few previous works in combining multiple sentence encoders, this paper takes a step forward by developing a new and general method for effectively exploiting diverse sentence encoders.

Ranked #2 on Ad-hoc video search on TRECVID-AVS16 (IACC.3) (using extra training data)

Ad-hoc video search Management +6

Dual Encoding for Video Retrieval by Text

1 code implementation 10 Sep 2020 Jianfeng Dong, Xirong Li, Chaoxi Xu, Xun Yang, Gang Yang, Xun Wang, Meng Wang

In this paper we achieve this by proposing a dual deep encoding network that encodes videos and queries into powerful dense representations of their own.

Ranked #3 on Ad-hoc video search on TRECVID-AVS16 (IACC.3) (using extra training data)

Ad-hoc video search Retrieval +2

Feature Re-Learning with Data Augmentation for Video Relevance Prediction

1 code implementation 8 Apr 2020 Jianfeng Dong, Xun Wang, Leimin Zhang, Chaoxi Xu, Gang Yang, Xirong Li

Predicting the relevance between two given videos with respect to their visual content is a key component for content-based video recommendation and retrieval.

Data Augmentation Retrieval

iCap: Interactive Image Captioning with Predictive Text

no code implementations 31 Jan 2020 Zhengxiong Jia, Xirong Li

In this paper we study a brand new topic of interactive image captioning with a human in the loop.

Image Captioning Sentence +1

Hierarchical Attention Networks for Medical Image Segmentation

no code implementations 20 Nov 2019 Fei Ding, Gang Yang, Jinlu Liu, Jun Wu, Dayong Ding, Jie Xv, Gangwei Cheng, Xirong Li

Unlike previous self-attention based methods that capture context information from one level, we reformulate the self-attention mechanism from the view of the high-order graph and propose a novel method, namely Hierarchical Attention Network (HANet), to address the problem of medical image segmentation.

Image Segmentation Medical Image Segmentation +2

Imagination Based Sample Construction for Zero-Shot Learning

no code implementations 29 Oct 2018 Gang Yang, Jinlu Liu, Xirong Li

Different from these existing types of methods, we propose a new method: sample construction to deal with the problem of ZSL.

Attribute Image Retrieval +2

COCO-CN for Cross-Lingual Image Tagging, Captioning and Retrieval

2 code implementations 22 May 2018 Xirong Li, Chaoxi Xu, Xiaoxu Wang, Weiyu Lan, Zhengxiong Jia, Gang Yang, Jieping Xu

This paper contributes to cross-lingual image annotation and retrieval in terms of data and baseline methods.


Cross-Media Similarity Evaluation for Web Image Retrieval in the Wild

1 code implementation 5 Sep 2017 Jianfeng Dong, Xirong Li, Duanqing Xu

To quantify the current progress, we propose a simple text2image method, representing a novel test query by a set of images selected from a large-scale query log.

Image Retrieval Retrieval

Predicting Visual Features from Text for Image and Video Caption Retrieval

1 code implementation 5 Sep 2017 Jianfeng Dong, Xirong Li, Cees G. M. Snoek

This paper strives to find amidst a set of sentences the one best describing the content of a given image or video.

Retrieval Sentence +1

Fluency-Guided Cross-Lingual Image Captioning

1 code implementation 15 Aug 2017 Weiyu Lan, Xirong Li, Jianfeng Dong

The framework comprises a module to automatically estimate the fluency of the sentences and another module to utilize the estimated fluency scores to effectively train an image captioning model for the target language.

Image Captioning

Detecting Adversarial Image Examples in Deep Networks with Adaptive Noise Reduction

2 code implementations 23 May 2017 Bin Liang, Hongcheng Li, Miaoqiang Su, Xirong Li, Wenchang Shi, Xiao-Feng Wang

Consequently, the adversarial example can be effectively detected by comparing the classification results of a given sample and its denoised version, without referring to any prior knowledge of attacks.


Deep Text Classification Can be Fooled

no code implementations 26 Apr 2017 Bin Liang, Hongcheng Li, Miaoqiang Su, Pan Bian, Xirong Li, Wenchang Shi

In this paper, we present an effective method to craft text adversarial samples, revealing one important yet underestimated fact: DNN-based text classifiers are also prone to adversarial sample attacks.

General Classification text-classification +1

Improving Image Captioning by Concept-based Sentence Reranking

no code implementations 3 May 2016 Xirong Li, Qin Jin

This paper describes our winning entry in the ImageCLEF 2015 image sentence generation task.

Image Captioning Language Modelling +1

Detecting Violence in Video using Subclasses

no code implementations 27 Apr 2016 Xirong Li, Yujia Huo, Jieping Xu, Qin Jin

We enrich the MediaEval 2015 violence dataset by manually labeling violence videos with respect to the subclasses.

Word2VisualVec: Image and Video to Sentence Matching by Visual Feature Prediction

no code implementations 23 Apr 2016 Jianfeng Dong, Xirong Li, Cees G. M. Snoek

This paper strives to find the sentence best describing the content of an image or video.


TagBook: A Semantic Video Representation without Supervision for Event Detection

no code implementations 10 Oct 2015 Masoud Mazloom, Xirong Li, Cees G. M. Snoek

We consider the problem of event detection in video for scenarios where only a few, or even zero, examples are available for training.

Event Detection Image Retrieval +2

Socializing the Semantic Gap: A Comparative Survey on Image Tag Assignment, Refinement and Retrieval

1 code implementation 28 Mar 2015 Xirong Li, Tiberio Uricchio, Lamberto Ballan, Marco Bertini, Cees G. M. Snoek, Alberto del Bimbo

Where previous reviews on content-based image retrieval emphasize what can be seen in an image to bridge the semantic gap, this survey considers what people tag about an image.

Content-Based Image Retrieval Retrieval +1

Tag Relevance Fusion for Social Image Retrieval

no code implementations 13 Oct 2014 Xirong Li

Due to the subjective nature of social tagging, measuring the relevance of social tags with respect to the visual content is crucial for retrieving the increasing amounts of social-networked images.

Image Retrieval Retrieval +1

Adaptive Tag Selection for Image Annotation

no code implementations 17 Sep 2014 Xixi He, Xirong Li, Gang Yang, Jieping Xu, Qin Jin

The key insight is to divide the vocabulary into two disjoint subsets, namely a seen set consisting of tags having ground truth available for optimizing their thresholds and a novel set consisting of tags without any ground truth.

