no code implementations • 18 Oct 2023 • Yunfan Li, Peng Hu, Dezhong Peng, Jiancheng Lv, Jianping Fan, Xi Peng
The core of clustering is incorporating prior knowledge to construct supervision signals.
no code implementations • 30 May 2023 • Jin Yuan, Yang Zhang, Yangzhou Du, Zhongchao Shi, Xin Geng, Jianping Fan, Yong Rui
In recent years, deep models have achieved remarkable success in many vision tasks.
no code implementations • 13 May 2023 • Ke Zhang, Yan Yang, Jun Yu, Hanliang Jiang, Jianping Fan, Qingming Huang, Weidong Han
To address this limitation, we propose a unified Med-VLP framework based on Multi-task Paired Masking with Alignment (MPMA), which integrates the cross-modal alignment task into the joint image-text reconstruction framework to achieve more comprehensive cross-modal interaction, while a Global and Local Alignment (GLA) module is designed to help the self-supervised paradigm obtain semantic representations with rich domain knowledge.
1 code implementation • CVPR 2023 • Zhou Yu, Lixiang Zheng, Zhou Zhao, Fei Wu, Jianping Fan, Kui Ren, Jun Yu
A recent benchmark AGQA poses a promising paradigm to generate QA pairs automatically from pre-annotated scene graphs, enabling it to measure diverse reasoning abilities with granular control.
no code implementations • 27 Feb 2023 • Buyu Liu, BaoJun, Jianping Fan, Xi Peng, Kui Ren, Jun Yu
More desired attacks, to this end, should be able to fool defenses with such consistency checks.
no code implementations • 26 Jan 2023 • Chuang Zhao, Hongke Zhao, Ming He, Jian Zhang, Jianping Fan
Specifically, we first construct a unified cross-domain heterogeneous graph and redefine the message passing mechanism of graph convolutional networks to capture high-order similarity of users and items across domains.
no code implementations • ICCV 2023 • Jiangming Shi, Yachao Zhang, Xiangbo Yin, Yuan Xie, Zhizhong Zhang, Jianping Fan, Zhongchao Shi, Yanyun Qu
Visible-infrared person re-identification (VI-ReID) aims to match a specific person from a gallery of images captured from non-overlapping visible and infrared cameras.
no code implementations • 4 Nov 2022 • Feng Hou, Yao Zhang, Yang Liu, Jin Yuan, Cheng Zhong, Yang Zhang, Zhongchao Shi, Jianping Fan, Zhiqiang He
Due to domain shift, deep neural networks (DNNs) usually fail to generalize well on unknown test data in practice.
1 code implementation • CVPR 2023 • Yang Liu, Yao Zhang, Yixin Wang, Yang Zhang, Jiang Tian, Zhongchao Shi, Jianping Fan, Zhiqiang He
To bridge the gap between the reference points of salient queries and Transformer detectors, we propose SAlient Point-based DETR (SAP-DETR) by treating object detection as a transformation from salient points to instance objects.
no code implementations • 8 Apr 2022 • Jin Yuan, Feng Hou, Yangzhou Du, Zhongchao Shi, Xin Geng, Jianping Fan, Yong Rui
Domain adaptation (DA) tries to tackle scenarios in which the test data does not fully follow the same distribution as the training data, and multi-source domain adaptation (MSDA) is very attractive for real-world applications.
1 code implementation • 24 Mar 2022 • Zhou Yu, Zitian Jin, Jun Yu, Mingliang Xu, Hongbo Wang, Jianping Fan
Recent advances in Transformer architectures [1] have brought remarkable improvements to visual question answering (VQA).
no code implementations • 8 Mar 2022 • Jin Yuan, Shikai Chen, Yao Zhang, Zhongchao Shi, Xin Geng, Jianping Fan, Yong Rui
Subsequently, we design the graph attention transformer layer to transfer this adjacency matrix to adapt to the current domain.
1 code implementation • 11 Nov 2021 • Yang Liu, Yao Zhang, Yixin Wang, Feng Hou, Jin Yuan, Jiang Tian, Yang Zhang, Zhongchao Shi, Jianping Fan, Zhiqiang He
Transformer, an attention-based encoder-decoder model, has already revolutionized the field of natural language processing (NLP).
2 code implementations • 28 Jun 2021 • Yixin Wang, Yang Zhang, Yang Liu, Zihao Lin, Jiang Tian, Cheng Zhong, Zhongchao Shi, Jianping Fan, Zhiqiang He
Specifically, ACN adopts a novel co-training network, which enables a coupled learning process for both full modality and missing modality to supplement each other's domain and feature representations, and more importantly, to recover the 'missing' information of absent modalities.
no code implementations • 21 Jun 2021 • Yixin Wang, Zihao Lin, Zhe Xu, Haoyu Dong, Jiang Tian, Jie Luo, Zhongchao Shi, Yang Zhang, Jianping Fan, Zhiqiang He
Experimental results have demonstrated that the proposed method for model uncertainty characterization and estimation can produce more reliable confidence scores for radiology report generation, and the modified loss function, which takes into account the uncertainties, leads to better model performance on two public radiology report datasets.
1 code implementation • 15 Aug 2020 • Qiuyu Chen, Wei Zhang, Jianping Fan
Instance-level alignment is widely exploited for person re-identification, e.g., spatial alignment, latent semantic alignment, and triplet alignment.
Ranked #30 on Person Re-Identification on DukeMTMC-reID
no code implementations • 15 Jul 2020 • Xiang Zhang, Wei Zhang, Jinye Peng, Jianping Fan
A Guided Filter Network (GFN) is first developed to learn segmentation knowledge from a source domain, and the GFN then transfers this knowledge to generate coarse object masks in the target domain.
no code implementations • 3 May 2020 • Ruxin Wang, Shuyuan Chen, Chaojie Ji, Jianping Fan, Ye Li
In this paper, we formulate a boundary-aware context neural network (BA-Net) for 2D medical image segmentation to capture richer context and preserve fine spatial information.
no code implementations • CVPR 2020 • Qiuyu Chen, Wei Zhang, Ning Zhou, Peng Lei, Yi Xu, Yu Zheng, Jianping Fan
Specifically, the fractional dilated kernel is adaptively constructed according to the image aspect ratio, where interpolation between the two nearest integer dilated kernels is used to cope with the misalignment of fractional sampling.
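As a rough illustration of the interpolation mentioned in this excerpt, the Python sketch below blends the responses of the two nearest integer-dilated kernels for a fractional dilation rate. The linear weighting, the 1-D setting, the zero padding, and all function names are assumptions made for illustration only. The excerpt does not say how the rate is derived from the aspect ratio, so the rate is taken as a given input.

```python
import math

def fractional_dilation_weights(rate):
    """Return (integer dilation, weight) pairs for a fractional rate.

    Assumption: the two nearest integer dilations are blended linearly.
    """
    if rate < 1:
        raise ValueError("rate must be >= 1")
    lo, hi = math.floor(rate), math.ceil(rate)
    if lo == hi:
        # The rate is already an integer, so no interpolation is needed.
        return [(lo, 1.0)]
    return [(lo, hi - rate), (hi, rate - lo)]

def dilated_response(signal, kernel, center, dilation):
    """Response of a centered 1-D kernel with integer dilation (zero padding)."""
    half = len(kernel) // 2
    total = 0.0
    for j, weight in enumerate(kernel):
        idx = center + (j - half) * dilation
        if 0 <= idx < len(signal):
            total += weight * signal[idx]
    return total

def fractional_response(signal, kernel, center, rate):
    """Blend the integer-dilated responses using the interpolation weights."""
    return sum(w * dilated_response(signal, kernel, center, d)
               for d, w in fractional_dilation_weights(rate))

# Hypothetical toy input: a ramp signal and a small asymmetric kernel.
signal = [0, 1, 2, 3, 4, 5, 6, 7, 8]
kernel = [1, 0, 2]
result = fractional_response(signal, kernel, center=4, rate=1.25)
```

With a rate of 1.25, the dilation-1 response receives weight 0.75 and the dilation-2 response receives weight 0.25, so the blend stays closer to the nearer integer dilation.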
1 code implementation • 27 Oct 2019 • Rongcheng Lin, Jing Xiao, Jianping Fan
In this paper, we present and discuss a deep mixture model with online knowledge distillation (MOD) for large-scale video temporal concept localization, which is ranked 3rd in the 3rd YouTube-8M Video Understanding Challenge.
no code implementations • 10 Apr 2019 • Jiajie Tian, Zhu Teng, Rui Li, Yan Li, Baopeng Zhang, Jianping Fan
Person re-identification (Re-ID) models usually show limited performance when they are trained on one dataset and tested on another, due to the inter-dataset bias (e.g., completely different identities and backgrounds) and the intra-dataset difference (e.g., camera invariance).
no code implementations • 17 Mar 2019 • Kai Tian, Shuigeng Zhou, Jianping Fan, Jihong Guan
Most existing methods for anomaly detection use only positive data to learn the data distribution, so they usually need a pre-defined threshold at the detection stage to determine whether a test instance is an outlier.
1 code implementation • 12 Nov 2018 • Rongcheng Lin, Jing Xiao, Jianping Fan
This paper introduces a fast and efficient network architecture, NeXtVLAD, to aggregate frame-level features into a compact feature vector for large-scale video classification.
no code implementations • ICLR 2018 • Wei Zhang, Qiuyu Chen, Jun Yu, Jianping Fan
In this paper, a deep boosting algorithm is developed to learn a more discriminative ensemble classifier by seamlessly combining a set of base deep CNNs (base experts) with diverse capabilities, e.g., these base deep CNNs are sequentially trained to recognize a set of object classes in an easy-to-hard way according to their learning complexities.
2 code implementations • 10 Aug 2017 • Zhou Yu, Jun Yu, Chenchao Xiang, Jianping Fan, DaCheng Tao
For fine-grained image and question representations, a 'co-attention' mechanism is developed by using a deep neural network architecture to jointly learn the attentions for both the image and the question, which allows us to reduce irrelevant features effectively and obtain more discriminative features for image and question representations.
6 code implementations • ICCV 2017 • Zhou Yu, Jun Yu, Jianping Fan, DaCheng Tao
For multi-modal feature fusion, here we develop a Multi-modal Factorized Bilinear (MFB) pooling approach to efficiently and effectively combine multi-modal features, which results in superior performance for VQA compared with other bilinear pooling approaches.
no code implementations • 8 Jul 2017 • Tianyi Zhao, Baopeng Zhang, Wei Zhang, Ning Zhou, Jun Yu, Jianping Fan
Our LMM model can provide an end-to-end approach for jointly learning: (a) the deep networks to extract more discriminative deep features for image and object class representation; (b) the tree classifier for recognizing large numbers of object classes hierarchically; and (c) the visual hierarchy adaptation for achieving more accurate indexing of large numbers of object classes hierarchically.
no code implementations • 24 Jun 2017 • Tianyi Zhao, Jun Yu, Zhenzhong Kuang, Wei Zhang, Jianping Fan
In this paper, a deep mixture of diverse experts algorithm is developed for seamlessly combining a set of base deep CNNs (convolutional neural networks) with diverse outputs (task spaces), e.g., such base deep CNNs are trained to recognize different subsets of tens of thousands of atomic object classes.
no code implementations • 19 May 2016 • Zhun Zhong, Mingyi Lei, Shaozi Li, Jianping Fan
In this paper, we propose a semantic, class-specific approach to re-rank object proposals, which can consistently improve the recall performance even with fewer proposals.
no code implementations • 23 Jan 2014 • Changxing Shang, Shengzhong Feng, Zhongying Zhao, Jianping Fan
This paper proposes a new method that transforms a network into a corpus where each edge is treated as a document, and all nodes of the network are treated as terms of the corpus.
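The transformation described in this excerpt can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes that each edge document contains exactly its two endpoint nodes as terms, and the function names and toy edge list are hypothetical.

```python
from collections import Counter

def network_to_corpus(edges):
    """Turn each edge (u, v) into a 'document' whose terms are its endpoints.

    Assumption: a document holds only the two endpoint nodes of its edge.
    """
    return [[str(u), str(v)] for u, v in edges]

def term_frequencies(corpus):
    """Count how often each node (term) occurs across all edge documents."""
    return Counter(term for doc in corpus for term in doc)

# Hypothetical toy network with four nodes and four edges.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
corpus = network_to_corpus(edges)
freqs = term_frequencies(corpus)
```

Under this assumption, a node's corpus-wide term frequency equals its degree in the network, and the resulting corpus can be passed to any text-analysis tool that accepts tokenized documents.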
no code implementations • IEEE International Conference on Computer Vision 2011 • Xiangyang Xue, Wei Zhang, Jie Zhang, Bin Wu, Jianping Fan, Yao Lu
The cross-level label coherence encodes the consistency between the labels at the image level and the labels at the region level.