Search Results for author: Shan Zhang

Found 15 papers, 6 papers with code

VRP-SAM: SAM with Visual Reference Prompt

1 code implementation · 27 Feb 2024 · Yanpeng Sun, Jiahui Chen, Shan Zhang, Xinyu Zhang, Qiang Chen, Gang Zhang, Errui Ding, Jingdong Wang, Zechao Li

In this paper, we propose a novel Visual Reference Prompt (VRP) encoder that empowers the Segment Anything Model (SAM) to utilize annotated reference images as prompts for segmentation, creating the VRP-SAM model.

Meta-Learning, Segmentation

Semantic-Aware Autoregressive Image Modeling for Visual Representation Learning

1 code implementation · 16 Dec 2023 · Kaiyou Song, Shan Zhang, Tong Wang

In this study, inspired by the way humans grasp an image, i.e., focusing on the main object first, we present a semantic-aware autoregressive image modeling (SemAIM) method to tackle this challenge.

Image Classification, Instance Segmentation, +4

Multi-Mode Online Knowledge Distillation for Self-Supervised Visual Representation Learning

1 code implementation · CVPR 2023 · Kaiyou Song, Jin Xie, Shan Zhang, Zimeng Luo

Different from existing SSL-KD methods that transfer knowledge from a static pre-trained teacher to a student, in MOKD, two different models learn collaboratively in a self-supervised manner.

Knowledge Distillation, Representation Learning, +1

σ-Adaptive Decoupled Prototype for Few-Shot Object Detection

no code implementations · ICCV 2023 · Jinhao Du, Shan Zhang, Qiang Chen, Haifeng Le, Yanpeng Sun, Yao Ni, Jian Wang, Bin He, Jingdong Wang

To provide precise information for the query image, the prototype is decoupled into task-specific ones, which provide tailored guidance for 'where to look' and 'what to look for', respectively.

Few-Shot Object Detection, Meta-Learning, +3

Semantics-Consistent Feature Search for Self-Supervised Visual Representation Learning

no code implementations · ICCV 2023 · Kaiyou Song, Shan Zhang, Zihao An, Zimeng Luo, Tong Wang, Jin Xie

In contrastive self-supervised learning, the common way to learn discriminative representations is to pull different augmented "views" of the same image closer together while pushing all other images further apart, which has proven effective.

Representation Learning, Self-Supervised Learning
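
The pull/push objective described in the abstract above can be illustrated with an InfoNCE-style loss. This is a minimal sketch of the generic contrastive objective, not the method proposed in the paper; the function names, the cosine similarity choice, and the temperature value are all illustrative assumptions.

```python
import math

def cosine(u, v):
    # Cosine similarity between two plain-Python vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchor, positive, negatives, temperature=0.1):
    # Hypothetical helper: InfoNCE-style loss for one anchor embedding.
    # `positive` is another augmented view of the same image;
    # `negatives` are embeddings of other images.
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    # The loss shrinks as the positive pair gets closer relative to the negatives.
    return -math.log(pos / (pos + neg))

anchor = [1.0, 0.0]
loss_aligned = info_nce(anchor, [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
loss_misaligned = info_nce(anchor, [0.0, 1.0], [[0.9, 0.1], [-1.0, 0.0]])
```

Under these assumptions, a positive view that lies close to the anchor gives a lower loss than one that lies far from it, which is the "pull closer, push apart" behavior in numeric form.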

Group DETR v2: Strong Object Detector with Encoder-Decoder Pretraining

no code implementations · arXiv 2022 · Qiang Chen, Jian Wang, Chuchu Han, Shan Zhang, Zexian Li, Xiaokang Chen, Jiahui Chen, Xiaodi Wang, Shuming Han, Gang Zhang, Haocheng Feng, Kun Yao, Junyu Han, Errui Ding, Jingdong Wang

The training process consists of self-supervised pretraining and finetuning a ViT-Huge encoder on ImageNet-1K, pretraining the detector on Object365, and finally finetuning it on COCO.

Decoder, Object, +2

Time-rEversed diffusioN tEnsor Transformer: A new TENET of Few-Shot Object Detection

1 code implementation · 30 Oct 2022 · Shan Zhang, Naila Murray, Lei Wang, Piotr Koniusz

To address these drawbacks, we propose a Time-rEversed diffusioN tEnsor Transformer (TENET), which i) forms high-order tensor representations that capture multi-way feature occurrences that are highly discriminative, and ii) uses a transformer that dynamically extracts correlations between the query image and the entire support set, instead of a single average-pooled support embedding.

Few-Shot Object Detection, Object, +1

Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment

2 code implementations · ICCV 2023 · Qiang Chen, Xiaokang Chen, Jian Wang, Shan Zhang, Kun Yao, Haocheng Feng, Junyu Han, Errui Ding, Gang Zeng, Jingdong Wang

Detection transformer (DETR) relies on one-to-one assignment, assigning one ground-truth object to one prediction, for end-to-end detection without NMS post-processing.

Data Augmentation, Decoder, +3
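
One-to-one assignment, as described in the abstract above, matches each ground-truth object to exactly one prediction so that the total matching cost is minimal. The sketch below is a toy illustration only: it uses brute-force search over permutations in place of the Hungarian algorithm, and the function name and cost values are made up for the example.

```python
from itertools import permutations

def one_to_one_assign(cost):
    # Hypothetical helper. cost[g][p] is the cost of matching ground truth g
    # to prediction p; assumes at least as many predictions as ground truths.
    num_gt = len(cost)
    num_pred = len(cost[0])
    best_total, best_match = float("inf"), None
    # Try every way of giving each ground truth its own distinct prediction.
    for preds in permutations(range(num_pred), num_gt):
        total = sum(cost[g][p] for g, p in enumerate(preds))
        if total < best_total:
            best_total, best_match = total, preds
    # Return (ground_truth_index, prediction_index) pairs.
    return list(enumerate(best_match))

# Two ground-truth objects, three predictions (illustrative costs).
cost = [
    [0.9, 0.1, 0.5],
    [0.2, 0.8, 0.3],
]
matches = one_to_one_assign(cost)
```

Here `matches` pairs ground truth 0 with prediction 1 and ground truth 1 with prediction 0; prediction 2 stays unmatched and would be treated as background. Brute force is only practical for tiny examples, since the number of candidate assignments grows factorially.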

A Neural Network Based Method with Transfer Learning for Genetic Data Analysis

no code implementations · 20 Jun 2022 · Jinghang Lin, Shan Zhang, Qing Lu

Transfer learning has emerged as a powerful technique in many application problems, such as computer vision and natural language processing.

Transfer Learning

CATrans: Context and Affinity Transformer for Few-Shot Segmentation

no code implementations · 27 Apr 2022 · Shan Zhang, Tianyi Wu, Sitong Wu, Guodong Guo

In this work, we effectively integrate the context and affinity information via the proposed novel Context and Affinity Transformer (CATrans) in a hierarchical architecture.

Relation, Transfer Learning

Distributed Estimation in Large Scale Wireless Sensor Networks via a Two Step Group-based Approach

no code implementations · 17 Mar 2022 · Shan Zhang, Pranay Sharma, Baocheng Geng, Pramod K. Varshney

To achieve greater sensor transmission and estimation efficiency, we propose a two-step, group-based collaborative distributed estimation scheme. In the first step, sensors form dependence-driven groups such that sensors in the same group are highly dependent while sensors from different groups are independent, and perform a copula-based maximum a posteriori probability (MAP) estimation via intragroup collaboration.

Change Detection from Synthetic Aperture Radar Images via Graph-Based Knowledge Supplement Network

1 code implementation · 22 Jan 2022 · Junjie Wang, Feng Gao, Junyu Dong, Shan Zhang, Qian Du

Synthetic aperture radar (SAR) image change detection is a vital yet challenging task in the field of remote sensing image analysis.

Change Detection, Feature Correlation

Kernelized Few-Shot Object Detection With Efficient Integral Aggregation

no code implementations · CVPR 2022 · Shan Zhang, Lei Wang, Naila Murray, Piotr Koniusz

We design a Kernelized Few-shot Object Detector by leveraging kernelized matrices computed over multiple proposal regions, which yield expressive non-linear representations whose model complexity is learned on the fly.

Few-Shot Object Detection, Object, +2

Why Interpretability in Machine Learning? An Answer Using Distributed Detection and Data Fusion Theory

no code implementations · 25 Jun 2018 · Kush R. Varshney, Prashant Khanduri, Pranay Sharma, Shan Zhang, Pramod K. Varshney

Such arguments, however, fail to acknowledge that the overall decision-making system is composed of two entities: the learned model and a human who fuses together model outputs with his or her own information.

BIG-bench Machine Learning, Decision Making
