Search Results for author: Jianping Fan

Found 31 papers, 10 papers with code

Image Clustering with External Guidance

no code implementations • 18 Oct 2023 • Yunfan Li, Peng Hu, Dezhong Peng, Jiancheng Lv, Jianping Fan, Xi Peng

The core of clustering is incorporating prior knowledge to construct supervision signals.

Clustering Image Clustering

Multi-task Paired Masking with Alignment Modeling for Medical Vision-Language Pre-training

no code implementations • 13 May 2023 • Ke Zhang, Yan Yang, Jun Yu, Hanliang Jiang, Jianping Fan, Qingming Huang, Weidong Han

To address this limitation, we propose a unified Med-VLP framework based on Multi-task Paired Masking with Alignment (MPMA) to integrate the cross-modal alignment task into the joint image-text reconstruction framework to achieve more comprehensive cross-modal interaction, while a Global and Local Alignment (GLA) module is designed to assist self-supervised paradigm in obtaining semantic representations with rich domain knowledge.

ANetQA: A Large-scale Benchmark for Fine-grained Compositional Reasoning over Untrimmed Videos

1 code implementation • CVPR 2023 • Zhou Yu, Lixiang Zheng, Zhou Zhao, Fei Wu, Jianping Fan, Kui Ren, Jun Yu

A recent benchmark AGQA poses a promising paradigm to generate QA pairs automatically from pre-annotated scene graphs, enabling it to measure diverse reasoning abilities with granular control.

Question Answering Spatio-temporal Scene Graphs +1

GLOW: Global Layout Aware Attacks on Object Detection

no code implementations • 27 Feb 2023 • Buyu Liu, BaoJun, Jianping Fan, Xi Peng, Kui Ren, Jun Yu

More desired attacks, to this end, should be able to fool defenses with such consistency checks.

Object Detection

Cross-domain recommendation via user interest alignment

no code implementations • 26 Jan 2023 • Chuang Zhao, Hongke Zhao, Ming He, Jian Zhang, Jianping Fan

Specifically, we first construct a unified cross-domain heterogeneous graph and redefine the message passing mechanism of graph convolutional networks to capture high-order similarity of users and items across domains.

Recommendation Systems

Learning to Learn Domain-invariant Parameters for Domain Generalization

no code implementations • 4 Nov 2022 • Feng Hou, Yao Zhang, Yang Liu, Jin Yuan, Cheng Zhong, Yang Zhang, Zhongchao Shi, Jianping Fan, Zhiqiang He

Due to domain shift, deep neural networks (DNNs) usually fail to generalize well on unknown test data in practice.

Domain Generalization

SAP-DETR: Bridging the Gap Between Salient Points and Queries-Based Transformer Detector for Fast Model Convergency

1 code implementation • CVPR 2023 • Yang Liu, Yao Zhang, Yixin Wang, Yang Zhang, Jiang Tian, Zhongchao Shi, Jianping Fan, Zhiqiang He

To bridge the gap between the reference points of salient queries and Transformer detectors, we propose SAlient Point-based DETR (SAP-DETR) by treating object detection as a transformation from salient points to instance objects.

Object Detection

Self-Supervised Graph Neural Network for Multi-Source Domain Adaptation

no code implementations • 8 Apr 2022 • Jin Yuan, Feng Hou, Yangzhou Du, Zhongchao Shi, Xin Geng, Jianping Fan, Yong Rui

Domain adaptation (DA) tries to tackle the scenarios when the test data does not fully follow the same distribution of the training data, and multi-source domain adaptation (MSDA) is very attractive for real world applications.

Domain Adaptation Self-Supervised Learning

Bilaterally Slimmable Transformer for Elastic and Efficient Visual Question Answering

1 code implementation • 24 Mar 2022 • Zhou Yu, Zitian Jin, Jun Yu, Mingliang Xu, Hongbo Wang, Jianping Fan

Recent advances in Transformer architectures [1] have brought remarkable improvements to visual question answering (VQA).

Question Answering Visual Question Answering

Graph Attention Transformer Network for Multi-Label Image Classification

no code implementations • 8 Mar 2022 • Jin Yuan, Shikai Chen, Yao Zhang, Zhongchao Shi, Xin Geng, Jianping Fan, Yong Rui

Subsequently, we design the graph attention transformer layer to transfer this adjacency matrix to adapt to the current domain.

Classification Graph Attention +2

A Survey of Visual Transformers

1 code implementation • 11 Nov 2021 • Yang Liu, Yao Zhang, Yixin Wang, Feng Hou, Jin Yuan, Jiang Tian, Yang Zhang, Zhongchao Shi, Jianping Fan, Zhiqiang He

Transformer, an attention-based encoder-decoder model, has already revolutionized the field of natural language processing (NLP).

ACN: Adversarial Co-training Network for Brain Tumor Segmentation with Missing Modalities

2 code implementations • 28 Jun 2021 • Yixin Wang, Yang Zhang, Yang Liu, Zihao Lin, Jiang Tian, Cheng Zhong, Zhongchao Shi, Jianping Fan, Zhiqiang He

Specifically, ACN adopts a novel co-training network, which enables a coupled learning process for both full modality and missing modality to supplement each other's domain and feature representations, and more importantly, to recover the 'missing' information of absent modalities.

Brain Tumor Segmentation Transfer Learning +1

Trust It or Not: Confidence-Guided Automatic Radiology Report Generation

no code implementations • 21 Jun 2021 • Yixin Wang, Zihao Lin, Zhe Xu, Haoyu Dong, Jiang Tian, Jie Luo, Zhongchao Shi, Yang Zhang, Jianping Fan, Zhiqiang He

Experimental results have demonstrated that the proposed method for model uncertainty characterization and estimation can produce more reliable confidence scores for radiology report generation, and the modified loss function, which takes into account the uncertainties, leads to better model performance on two public radiology report datasets.

Decision Making Image Captioning +1

Cluster-level Feature Alignment for Person Re-identification

1 code implementation • 15 Aug 2020 • Qiuyu Chen, Wei Zhang, Jianping Fan

Instance-level alignment is widely exploited for person re-identification, e.g., spatial alignment, latent semantic alignment and triplet alignment.

Person Re-Identification

Automatic Image Labelling at Pixel Level

no code implementations • 15 Jul 2020 • Xiang Zhang, Wei Zhang, Jinye Peng, Jianping Fan

A Guided Filter Network (GFN) is first developed to learn the segmentation knowledge from a source domain, and such GFN then transfers such segmentation knowledge to generate coarse object masks in the target domain.

Image Segmentation Segmentation +1

Boundary-aware Context Neural Network for Medical Image Segmentation

no code implementations • 3 May 2020 • Ruxin Wang, Shuyuan Chen, Chaojie Ji, Jianping Fan, Ye Li

In this paper, we formulate a boundary-aware context neural network (BA-Net) for 2D medical image segmentation to capture richer context and preserve fine spatial information.

Image Segmentation Medical Image Segmentation +3

Adaptive Fractional Dilated Convolution Network for Image Aesthetics Assessment

no code implementations • CVPR 2020 • Qiuyu Chen, Wei Zhang, Ning Zhou, Peng Lei, Yi Xu, Yu Zheng, Jianping Fan

Specifically, the fractional dilated kernel is adaptively constructed according to the image aspect ratios, where the interpolation of nearest two integers dilated kernels is used to cope with the misalignment of fractional sampling.
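
The interpolation idea in this excerpt can be illustrated with a small 1-D sketch. This is not the authors' implementation: the function names, the zero padding, and the linear weighting by the fractional part of the rate are assumptions made for illustration only, and the paper applies the idea to 2-D kernels chosen from image aspect ratios.

```python
import math


def dilated_conv1d(signal, kernel, dilation):
    """Same-size 1-D cross-correlation with an integer dilation rate.

    Positions outside the signal are treated as zero (an assumption).
    """
    center = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        total = 0.0
        for j, k in enumerate(kernel):
            idx = i + (j - center) * dilation
            if 0 <= idx < len(signal):
                total += k * signal[idx]
        out.append(total)
    return out


def fractional_dilated_conv1d(signal, kernel, rate):
    """Blend the outputs of the two nearest integer dilation rates.

    Hypothetical helper: the blend weight is the fractional part of `rate`.
    """
    lower = math.floor(rate)
    weight = rate - lower
    low_out = dilated_conv1d(signal, kernel, lower)
    if weight == 0:
        return low_out  # integer rate, no interpolation needed
    high_out = dilated_conv1d(signal, kernel, lower + 1)
    return [(1 - weight) * a + weight * b for a, b in zip(low_out, high_out)]


signal = [0, 1, 2, 3, 4, 5, 6, 7]
kernel = [1, 0, -1]
blended = fractional_dilated_conv1d(signal, kernel, 1.5)
```

On a linear ramp, an interior output of the blended result lies halfway between the dilation-1 and dilation-2 responses, which is what a rate of 1.5 would be expected to give under these assumptions.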

MOD: A Deep Mixture Model with Online Knowledge Distillation for Large Scale Video Temporal Concept Localization

1 code implementation • 27 Oct 2019 • Rongcheng Lin, Jing Xiao, Jianping Fan

In this paper, we present and discuss a deep mixture model with online knowledge distillation (MOD) for large-scale video temporal concept localization, which is ranked 3rd in the 3rd YouTube-8M Video Understanding Challenge.

Knowledge Distillation Video Understanding

Imitating Targets from all sides: An Unsupervised Transfer Learning method for Person Re-identification

no code implementations • 10 Apr 2019 • Jiajie Tian, Zhu Teng, Rui Li, Yan Li, Baopeng Zhang, Jianping Fan

Person re-identification (Re-ID) models usually show a limited performance when they are trained on one dataset and tested on another dataset due to the inter-dataset bias (e.g., completely different identities and backgrounds) and the intra-dataset difference (e.g., camera invariance).

Person Re-Identification Transfer Learning

Learning Competitive and Discriminative Reconstructions for Anomaly Detection

no code implementations • 17 Mar 2019 • Kai Tian, Shuigeng Zhou, Jianping Fan, Jihong Guan

Most of the existing methods for anomaly detection use only positive data to learn the data distribution, thus they usually need a pre-defined threshold at the detection stage to determine whether a test instance is an outlier.

Anomaly Detection Test

NeXtVLAD: An Efficient Neural Network to Aggregate Frame-level Features for Large-scale Video Classification

1 code implementation • 12 Nov 2018 • Rongcheng Lin, Jing Xiao, Jianping Fan

This paper introduces a fast and efficient network architecture, NeXtVLAD, to aggregate frame-level features into a compact feature vector for large-scale video classification.

Efficient Neural Network General Classification +2

Deep Boosting of Diverse Experts

no code implementations • ICLR 2018 • Wei Zhang, Qiuyu Chen, Jun Yu, Jianping Fan

In this paper, a deep boosting algorithm is developed to learn more discriminative ensemble classifier by seamlessly combining a set of base deep CNNs (base experts) with diverse capabilities, e.g., these base deep CNNs are sequentially trained to recognize a set of object classes in an easy-to-hard way according to their learning complexities.

Object Recognition

Beyond Bilinear: Generalized Multimodal Factorized High-order Pooling for Visual Question Answering

2 code implementations • 10 Aug 2017 • Zhou Yu, Jun Yu, Chenchao Xiang, Jianping Fan, Dacheng Tao

For fine-grained image and question representations, a 'co-attention' mechanism is developed by using a deep neural network architecture to jointly learn the attentions for both the image and the question, which can allow us to reduce the irrelevant features effectively and obtain more discriminative features for image and question representations.

Question Answering Visual Question Answering +1

Multi-modal Factorized Bilinear Pooling with Co-Attention Learning for Visual Question Answering

6 code implementations • ICCV 2017 • Zhou Yu, Jun Yu, Jianping Fan, Dacheng Tao

For multi-modal feature fusion, here we develop a Multi-modal Factorized Bilinear (MFB) pooling approach to efficiently and effectively combine multi-modal features, which results in superior performance for VQA compared with other bilinear pooling approaches.

Question Answering Visual Question Answering

Embedding Visual Hierarchy with Deep Networks for Large-Scale Visual Recognition

no code implementations • 8 Jul 2017 • Tianyi Zhao, Baopeng Zhang, Wei Zhang, Ning Zhou, Jun Yu, Jianping Fan

Our LMM model can provide an end-to-end approach for jointly learning: (a) the deep networks to extract more discriminative deep features for image and object class representation; (b) the tree classifier for recognizing large numbers of object classes hierarchically; and (c) the visual hierarchy adaptation for achieving more accurate indexing of large numbers of object classes hierarchically.

Object Recognition

Deep Mixture of Diverse Experts for Large-Scale Visual Recognition

no code implementations • 24 Jun 2017 • Tianyi Zhao, Jun Yu, Zhenzhong Kuang, Wei Zhang, Jianping Fan

In this paper, a deep mixture of diverse experts algorithm is developed for seamlessly combining a set of base deep CNNs (convolutional neural networks) with diverse outputs (task spaces), e.g., such base deep CNNs are trained to recognize different subsets of tens of thousands of atomic object classes.

Multi-Task Learning Object Recognition

Re-ranking Object Proposals for Object Detection in Automatic Driving

no code implementations • 19 May 2016 • Zhun Zhong, Mingyi Lei, Shaozi Li, Jianping Fan

In this paper, we propose a semantic, class-specific approach to re-rank object proposals, which can consistently improve the recall performance even with less proposals.

Object Detection +2

Efficiently Detecting Overlapping Communities through Seeding and Semi-Supervised Learning

no code implementations • 23 Jan 2014 • Changxing Shang, Shengzhong Feng, Zhongying Zhao, Jianping Fan

This paper proposes a new method that transforms a network into a corpus where each edge is treated as a document, and all nodes of the network are treated as terms of the corpus.

Clustering Community Detection
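
The network-to-corpus transformation described in this excerpt can be sketched as follows. The excerpt only states that each edge is a document and nodes are terms; that each edge-document contains exactly its two endpoint nodes is an assumption, and the function names and the toy edge list are hypothetical.

```python
from collections import Counter


def network_to_corpus(edges):
    """Turn each edge into a 'document' whose terms are its endpoint nodes.

    Assumption: a document holds exactly the two endpoints of its edge.
    """
    return [[str(u), str(v)] for u, v in edges]


def term_frequencies(corpus):
    """Count how many edge-documents mention each node-term."""
    counts = Counter()
    for doc in corpus:
        counts.update(doc)
    return counts


# Toy network: a triangle a-b-c with a pendant node d attached to c.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
corpus = network_to_corpus(edges)
vocabulary = sorted({term for doc in corpus for term in doc})
frequencies = term_frequencies(corpus)
```

Under this representation, a node's term frequency across the corpus equals its degree in the network, so a corpus built this way could in principle be passed to standard text-modeling tools.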

Correlative Multi-Label Multi-Instance Image Annotation

no code implementations • IEEE International Conference on Computer Vision 2011 • Xiangyang Xue, Wei Zhang, Jie Zhang, Bin Wu, Jianping Fan, Yao Lu

The cross-level label coherence encodes the consistency between the labels at the image level and the labels at the region level.
