Search Results for author: Xiaoshan Yang

Found 13 papers, 3 papers with code

Exploring Multi-Modal Contextual Knowledge for Open-Vocabulary Object Detection

no code implementations • 30 Aug 2023 • Yifan Xu, Mengdan Zhang, Xiaoshan Yang, Changsheng Xu

In this paper, we explore, for the first time, helpful multi-modal contextual knowledge for understanding novel categories in open-vocabulary object detection (OVD).

Knowledge Distillation · Language Modelling +4

Multi-modal Queried Object Detection in the Wild

1 code implementation • NeurIPS 2023 • Yifan Xu, Mengdan Zhang, Chaoyou Fu, Peixian Chen, Xiaoshan Yang, Ke Li, Changsheng Xu

To address the learning inertia problem brought by the frozen detector, a vision-conditioned masked language prediction strategy is proposed.

Few-Shot Object Detection · Object +2
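
The vision-conditioned masked language prediction mentioned above can be pictured as masking part of the text query and predicting the masked tokens from text features that have attended to the query images. Below is a minimal PyTorch sketch of that idea; the module layout, dimensions, and masking ratio are illustrative assumptions, not the MQ-Det implementation.

```python
import torch
import torch.nn as nn


class VisionConditionedMLM(nn.Module):
    """Mask text-query tokens and predict them from vision-attended features."""

    def __init__(self, dim=256, vocab_size=30522, mask_ratio=0.15):
        super().__init__()
        self.mask_ratio = mask_ratio
        self.mask_embedding = nn.Parameter(torch.zeros(dim))
        # Masked text tokens attend to visual (query-image) features.
        self.cross_attn = nn.MultiheadAttention(dim, num_heads=8, batch_first=True)
        self.predictor = nn.Linear(dim, vocab_size)

    def forward(self, text_feats, token_ids, visual_feats):
        # text_feats: (B, Lt, D), token_ids: (B, Lt) long, visual_feats: (B, Lv, D)
        mask = torch.rand(token_ids.shape, device=token_ids.device) < self.mask_ratio
        masked = torch.where(mask.unsqueeze(-1),
                             self.mask_embedding.expand_as(text_feats),
                             text_feats)
        # Condition the masked tokens on vision via cross-attention.
        fused, _ = self.cross_attn(masked, visual_feats, visual_feats)
        logits = self.predictor(fused)                    # (B, Lt, vocab_size)
        if not mask.any():
            return logits.sum() * 0.0                     # nothing masked this step
        return nn.functional.cross_entropy(logits[mask], token_ids[mask])
```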

CLIP-VG: Self-paced Curriculum Adapting of CLIP for Visual Grounding

1 code implementation • 15 May 2023 • Linhui Xiao, Xiaoshan Yang, Fang Peng, Ming Yan, YaoWei Wang, Changsheng Xu

To utilize vision-and-language pre-trained models for the grounding problem and make reasonable use of pseudo-labels, we propose CLIP-VG, a novel method that conducts self-paced curriculum adapting of CLIP with pseudo-language labels.

Transfer Learning · Visual Grounding
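
Self-paced curriculum adapting over pseudo-labels generally means training first on the most reliable pseudo-labelled samples and gradually admitting harder ones. The sketch below illustrates that loop under assumed inputs (a per-sample reliability score and a generic `train_fn`); it is not the CLIP-VG selection criterion.

```python
# Self-paced curriculum over pseudo-labelled samples: start from the most
# reliable ones and enlarge the training pool each round. The reliability
# measure and the 30%/round schedule are illustrative assumptions.
def self_paced_rounds(samples, reliability, train_fn, rounds=3, start=0.3, step=0.3):
    """samples: list of pseudo-labelled examples; reliability: score per sample."""
    order = sorted(range(len(samples)), key=lambda i: reliability[i], reverse=True)
    model = None
    for r in range(rounds):
        keep = int(min(1.0, start + r * step) * len(samples))
        subset = [samples[i] for i in order[:keep]]    # easiest (most reliable) first
        model = train_fn(subset, init=model)           # adapt the model on the current subset
        # Optionally re-score reliability with the freshly adapted model here.
    return model
```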

Active Exploration of Multimodal Complementarity for Few-Shot Action Recognition

no code implementations • CVPR 2023 • Yuyang Wanyan, Xiaoshan Yang, Chaofan Chen, Changsheng Xu

In meta-training, we design an Active Sample Selection (ASS) module to organize query samples with large differences in the reliability of modalities into different groups based on modality-specific posterior distributions.

Few-Shot Action Recognition · Few Shot Action Recognition +2
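
One way to read the sample-grouping idea above is: score how reliable each modality's prediction is for a sample, then group samples by how much the modalities disagree. The sketch below uses softmax entropy as the reliability measure and a fixed threshold; both are illustrative assumptions, not the paper's ASS module.

```python
import torch


def group_by_modality_reliability(logits_rgb, logits_audio, gap_threshold=0.5):
    # logits_*: (N, num_classes) class logits from two modality branches
    post_rgb = logits_rgb.softmax(dim=-1)
    post_audio = logits_audio.softmax(dim=-1)
    ent_rgb = -(post_rgb * post_rgb.clamp_min(1e-8).log()).sum(-1)
    ent_audio = -(post_audio * post_audio.clamp_min(1e-8).log()).sum(-1)
    gap = (ent_rgb - ent_audio).abs()        # large gap => one modality is unreliable
    unbalanced = gap > gap_threshold
    return {
        "balanced": torch.nonzero(~unbalanced).flatten(),
        "unbalanced": torch.nonzero(unbalanced).flatten(),
    }
```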

SgVA-CLIP: Semantic-guided Visual Adapting of Vision-Language Models for Few-shot Image Classification

no code implementations • 28 Nov 2022 • Fang Peng, Xiaoshan Yang, Linhui Xiao, YaoWei Wang, Changsheng Xu

Although significant progress has been made in few-shot learning, most existing few-shot image classification methods require supervised pre-training on a large number of samples from base classes, which limits their generalization ability in real-world applications.

Few-Shot Image Classification · Few-Shot Learning +2

Shifting More Attention to Visual Backbone: Query-modulated Refinement Networks for End-to-End Visual Grounding

1 code implementation • CVPR 2022 • Jiabo Ye, Junfeng Tian, Ming Yan, Xiaoshan Yang, Xuwu Wang, Ji Zhang, Liang He, Xin Lin

Moreover, since the backbones are query-agnostic, it is difficult to completely avoid the inconsistency issue by training the visual backbone end-to-end in the visual grounding framework.

Multimodal Reasoning · Visual Grounding
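
A generic way to make a backbone query-aware, in the spirit of the title above, is to scale and shift its feature maps with the language query embedding (FiLM-style conditioning). The sketch below shows that pattern only as an assumed simplification; it is not the paper's query-modulated refinement network.

```python
import torch
import torch.nn as nn


class QueryModulation(nn.Module):
    """Scale and shift backbone features with the language query embedding."""

    def __init__(self, query_dim=768, channels=256):
        super().__init__()
        self.to_scale = nn.Linear(query_dim, channels)
        self.to_shift = nn.Linear(query_dim, channels)

    def forward(self, feat_map, query_emb):
        # feat_map: (B, C, H, W) backbone features; query_emb: (B, Dq) sentence embedding
        scale = self.to_scale(query_emb)[:, :, None, None]
        shift = self.to_shift(query_emb)[:, :, None, None]
        return feat_map * (1 + scale) + shift
```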

Dynamic Scene Graph Generation via Anticipatory Pre-Training

no code implementations • CVPR 2022 • Yiming Li, Xiaoshan Yang, Changsheng Xu

Humans can not only see the collection of objects in visual scenes, but also identify the relationships between them.

Graph Generation · Scene Graph Generation

Dynamic Hypergraph Convolutional Networks for Skeleton-Based Action Recognition

no code implementations • 20 Dec 2021 • Jinfeng Wei, Yunxin Wang, Mengli Guo, Pei Lv, Xiaoshan Yang, Mingliang Xu

Methods based on graph convolutional networks (GCNs) have achieved advanced performance on the skeleton-based action recognition task.

Action Recognition · Skeleton Based Action Recognition
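
For context, the GCN building block such methods rely on propagates joint features along the skeleton graph. Below is a minimal sketch of one such layer with a normalized adjacency; the dynamic hypergraph construction proposed in the paper is not reproduced here, and the dimensions are assumptions.

```python
import torch
import torch.nn as nn


class SkeletonGCNLayer(nn.Module):
    """One graph-convolution layer over skeleton joints."""

    def __init__(self, in_dim, out_dim, adjacency):
        super().__init__()
        a = adjacency + torch.eye(adjacency.size(0))        # add self-loops
        d_inv_sqrt = a.sum(dim=1).pow(-0.5)
        self.register_buffer("a_norm", d_inv_sqrt[:, None] * a * d_inv_sqrt[None, :])
        self.proj = nn.Linear(in_dim, out_dim)

    def forward(self, x):
        # x: (B, T, J, C) joint features over time; propagate along the joint axis
        x = torch.einsum("ij,btjc->btic", self.a_norm, x)
        return torch.relu(self.proj(x))
```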

ECKPN: Explicit Class Knowledge Propagation Network for Transductive Few-shot Learning

no code implementations • CVPR 2021 • Chaofan Chen, Xiaoshan Yang, Changsheng Xu, Xuhui Huang, Zhe Ma

Specifically, we first employ the comparison module to explore the pairwise sample relations to learn rich sample representations in the instance-level graph.

Few-Shot Learning
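
Exploring pairwise sample relations in an instance-level graph can be sketched as building an adjacency matrix from pairwise similarities and aggregating neighbour features. The snippet below shows that single step with an assumed distance-based kernel; it is not the ECKPN comparison module itself.

```python
import torch


def instance_graph_refine(features):
    # features: (N, D) support + query embeddings within an episode
    sim = torch.cdist(features, features)           # pairwise distances
    adj = torch.softmax(-sim, dim=-1)               # closer samples get larger edge weights
    refined = adj @ features                        # aggregate neighbour features
    return torch.cat([features, refined], dim=-1)   # keep both original and relational cues
```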

Health Status Prediction with Local-Global Heterogeneous Behavior Graph

no code implementations • 23 Mar 2021 • Xuan Ma, Xiaoshan Yang, Junyu Gao, Changsheng Xu

However, these data streams are multi-source and heterogeneous, containing complex temporal structures with local contextual and global temporal aspects, which makes feature learning and joint data utilization challenging.

Management

Data-driven Image Restoration with Option-driven Learning for Big and Small Astronomical Image Datasets

no code implementations • 7 Nov 2020 • Peng Jia, Ruiyu Ning, Ruiqi Sun, Xiaoshan Yang, Dongmei Cai

In recent years, advances in deep neural networks and the growing number of astronomical images have given rise to many data-driven image restoration methods.

Image Restoration

Time-Guided High-Order Attention Model of Longitudinal Heterogeneous Healthcare Data

no code implementations • 28 Nov 2019 • Yi Huang, Xiaoshan Yang, Changsheng Xu

(1) It can model longitudinal heterogeneous EHR data by capturing the third-order correlations among different modalities and the irregular temporal impact of historical events.

Management · Mortality Prediction +1
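
The two ingredients named in the snippet, the irregular temporal impact of historical events and third-order correlations across modalities, can be sketched as a time-decayed attention over past events plus an element-wise trilinear term. The code below is such a sketch under assumed inputs, decay form, and fusion; it is not the paper's attention model.

```python
import torch
import torch.nn as nn


class TimeGuidedFusion(nn.Module):
    """Time-decayed attention over past events plus a trilinear modality term."""

    def __init__(self, dim=128, tau=30.0):
        super().__init__()
        self.tau = tau                       # assumed decay scale, e.g. in days
        self.attn = nn.Linear(dim, 1)

    def forward(self, event_feats, delta_t, diag, med, lab):
        # event_feats: (B, T, D) past events; delta_t: (B, T) time since each event
        scores = self.attn(event_feats).squeeze(-1)             # (B, T)
        scores = scores - delta_t / self.tau                    # older events are down-weighted
        weights = scores.softmax(dim=-1)
        history = (weights.unsqueeze(-1) * event_feats).sum(1)  # (B, D)
        # Third-order interaction of three modality embeddings (element-wise trilinear)
        tri = diag * med * lab                                  # (B, D)
        return torch.cat([history, tri], dim=-1)
```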
