Search Results for author: Xiaobo Zhang

Found 12 papers, 6 papers with code

Anatomical Structure-Guided Medical Vision-Language Pre-training

no code implementations · 14 Mar 2024 · Qingqiu Li, Xiaohan Yan, Jilan Xu, Runtian Yuan, Yuejie Zhang, Rui Feng, Quanli Shen, Xiaobo Zhang, Shujun Wang

For finding and existence, we regard them as image tags, applying an image-tag recognition decoder to associate image features with their respective tags within each sample and constructing soft labels for contrastive learning to improve the semantic association of different image-report pairs.

Contrastive Learning, Representation Learning, +2

Let All be Whitened: Multi-teacher Distillation for Efficient Visual Retrieval

1 code implementation · 15 Dec 2023 · Zhe Ma, Jianfeng Dong, Shouling Ji, Zhenguang Liu, Xuhong Zhang, Zonghui Wang, Sifeng He, Feng Qian, Xiaobo Zhang, Lei Yang

Instead of crafting a new method pursuing further improvement on accuracy, in this paper we propose a multi-teacher distillation framework Whiten-MTD, which is able to transfer knowledge from off-the-shelf pre-trained retrieval models to a lightweight student model for efficient visual retrieval.

Image Retrieval, Retrieval, +1

Large Language Models are Complex Table Parsers

no code implementations · 13 Dec 2023 · Bowen Zhao, Changkai Ji, Yuejie Zhang, Wen He, Yingwen Wang, Qing Wang, Rui Feng, Xiaobo Zhang

With the Generative Pre-trained Transformer 3.5 (GPT-3.5) exhibiting remarkable reasoning and comprehension abilities in Natural Language Processing (NLP), most Question Answering (QA) research has primarily centered around general QA tasks based on GPT, neglecting the specific challenges posed by Complex Table QA.

Logical Reasoning, Question Answering

Enhanced Knowledge Injection for Radiology Report Generation

no code implementations · 1 Nov 2023 · Qingqiu Li, Jilan Xu, Runtian Yuan, Mohan Chen, Yuejie Zhang, Rui Feng, Xiaobo Zhang, Shang Gao

Automatic generation of radiology reports holds crucial clinical value, as it can alleviate substantial workload on radiologists and remind less experienced ones of potential anomalies.

Image Captioning, Retrieval

Learning Segment Similarity and Alignment in Large-Scale Content Based Video Retrieval

no code implementations · 20 Sep 2023 · Chen Jiang, Kaiming Huang, Sifeng He, Xudong Yang, Wei Zhang, Xiaobo Zhang, Yuan Cheng, Lei Yang, Qing Wang, Furong Xu, Tan Pan, Wei Chu

SSAN is based on two newly proposed modules in video retrieval: (1) an efficient Self-supervised Keyframe Extraction (SKE) module to reduce redundant frame features, and (2) a robust Similarity Pattern Detection (SPD) module for temporal alignment.

Retrieval, Video Retrieval

Video Infringement Detection via Feature Disentanglement and Mutual Information Maximization

1 code implementation · 13 Sep 2023 · Zhenguang Liu, Xinyang Yu, Ruili Wang, Shuai Ye, Zhe Ma, Jianfeng Dong, Sifeng He, Feng Qian, Xiaobo Zhang, Roger Zimmermann, Lei Yang

We theoretically analyzed the mutual information between the label and the disentangled features, arriving at a loss that maximizes the extraction of task-relevant information from the original feature.

Disentanglement

Web Photo Source Identification based on Neural Enhanced Camera Fingerprint

1 code implementation · 18 Feb 2023 · Feng Qian, Sifeng He, Honghao Huang, Huanyu Ma, Xiaobo Zhang, Lei Yang

With the growing popularity of smartphone photography in recent years, web photos play an increasingly important role in all walks of life.

Metric Learning

CMC v2: Towards More Accurate COVID-19 Detection with Discriminative Video Priors

no code implementations · 26 Nov 2022 · Junlin Hou, Jilan Xu, Nan Zhang, Yi Wang, Yuejie Zhang, Xiaobo Zhang, Rui Feng

This paper presents our solution for the 2nd COVID-19 Competition, occurring in the framework of the AIMIA Workshop at the European Conference on Computer Vision (ECCV 2022).

COVID-19 Diagnosis, Representation Learning

TransVCL: Attention-enhanced Video Copy Localization Network with Flexible Supervision

2 code implementations · 23 Nov 2022 · Sifeng He, Yue He, Minlong Lu, Chen Jiang, Xudong Yang, Feng Qian, Xiaobo Zhang, Lei Yang, Jiandong Zhang

Previous methods typically start from a frame-to-frame similarity matrix generated by cosine similarity between frame-level features of the input video pair, and then detect and refine the boundaries of copied segments on the similarity matrix under temporal constraints.

Retrieval, Video Retrieval
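
The frame-to-frame similarity matrix that the TransVCL abstract describes as the starting point of previous methods can be illustrated with a minimal sketch. This is not the authors' code: the function name `cosine_similarity_matrix` and the toy 2-D feature vectors are assumptions made purely for illustration, and real frame-level features would come from a visual backbone and have far more dimensions.

```python
import math


def cosine_similarity_matrix(query_frames, reference_frames):
    """Return S where S[i][j] is the cosine similarity between
    query frame i and reference frame j (hypothetical helper)."""

    def normalize(vec):
        norm = math.sqrt(sum(x * x for x in vec))
        # Leave all-zero vectors unchanged to avoid division by zero.
        return [x / norm for x in vec] if norm > 0 else list(vec)

    q = [normalize(v) for v in query_frames]
    r = [normalize(v) for v in reference_frames]
    # Dot product of unit vectors equals their cosine similarity.
    return [[sum(a * b for a, b in zip(qi, rj)) for rj in r] for qi in q]


# Toy frame features (made-up values, for illustration only).
query = [[1.0, 0.0], [0.0, 1.0]]
reference = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]
sim = cosine_similarity_matrix(query, reference)
```

Each row of `sim` corresponds to one query frame and each column to one reference frame. A copied segment would tend to show up as a run of high values along a roughly diagonal path, which is the pattern that boundary detection then operates on.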
