Search Results for author: Shuyan Li

Found 10 papers, 3 papers with code

AV-GAN: Attention-Based Varifocal Generative Adversarial Network for Uneven Medical Image Translation

no code implementations • 16 Apr 2024 • Zexin Li, Yiyang Lin, Zijie Fang, Shuyan Li, Xiu Li

In this paper, we propose the Attention-Based Varifocal Generative Adversarial Network (AV-GAN), which solves multiple problems in pathologic image translation tasks, such as uneven translation difficulty in different regions, mutual interference of multiple resolution information, and nuclear deformation.

Generative Adversarial Network · Translation

Efficient Prompt Tuning of Large Vision-Language Model for Fine-Grained Ship Classification

no code implementations • 13 Mar 2024 • Long Lan, Fengxiang Wang, Shuyan Li, Xiangtao Zheng, Zengmao Wang, Xinwang Liu

Directly fine-tuning VLMs for RS-FGSC often overfits the seen classes, resulting in suboptimal generalization to unseen classes; this highlights the difficulty of differentiating complex backgrounds and capturing distinct ship features.

Language Modelling · Zero-Shot Learning
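
For readers unfamiliar with prompt tuning of vision-language models, the sketch below shows the generic CoOp-style idea of learning a small set of context vectors while keeping the backbone frozen. It is a minimal illustration under assumed names (`n_ctx`, `class_token_embeds`), not the specific method proposed in this paper.

```python
import torch
import torch.nn as nn

class PromptLearner(nn.Module):
    """Generic CoOp-style prompt tuning: only the context vectors are trained,
    while the vision-language backbone stays frozen (illustrative sketch)."""

    def __init__(self, n_ctx: int, ctx_dim: int, class_token_embeds: torch.Tensor):
        super().__init__()
        # Learnable context vectors shared across all classes.
        self.ctx = nn.Parameter(torch.randn(n_ctx, ctx_dim) * 0.02)
        # Frozen token embeddings of the class names: (n_classes, n_cls_tokens, ctx_dim).
        self.register_buffer("cls_embeds", class_token_embeds)

    def forward(self) -> torch.Tensor:
        n_classes = self.cls_embeds.shape[0]
        ctx = self.ctx.unsqueeze(0).expand(n_classes, -1, -1)
        # Prepend the shared learnable context to each class-name embedding.
        return torch.cat([ctx, self.cls_embeds], dim=1)

def clip_style_logits(image_feats, text_feats, logit_scale=100.0):
    """Cosine-similarity classification head, as used in CLIP-style models."""
    image_feats = image_feats / image_feats.norm(dim=-1, keepdim=True)
    text_feats = text_feats / text_feats.norm(dim=-1, keepdim=True)
    return logit_scale * image_feats @ text_feats.t()
```

During training, only `PromptLearner.ctx` would receive gradients; the image and text encoders of the frozen VLM are left untouched.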

XIMAGENET-12: An Explainable AI Benchmark Dataset for Model Robustness Evaluation

no code implementations • 12 Oct 2023 • Qiang Li, Dan Zhang, Shengzhao Lei, Xun Zhao, Porawit Kamnoedboon, Weiwei Li, Junhao Dong, Shuyan Li

Despite the promising performance of existing visual models on public benchmarks, the critical assessment of their robustness for real-world applications remains an ongoing challenge.

Classification

SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation

1 code implementation • NeurIPS 2023 • Zhuoyan Luo, Yicheng Xiao, Yong Liu, Shuyan Li, Yitong Wang, Yansong Tang, Xiu Li, Yujiu Yang

To address this issue, we propose Semantic-assisted Object Cluster (SOC), which aggregates video content and textual guidance for unified temporal modeling and cross-modal alignment.

Ranked #2 on Referring Expression Segmentation on A2D Sentences (using extra training data)

Object · Referring Expression Segmentation · +4

Towards Realizing the Value of Labeled Target Samples: a Two-Stage Approach for Semi-Supervised Domain Adaptation

no code implementations • 21 Apr 2023 • Mengqun Jin, Kai Li, Shuyan Li, Chunming He, Xiu Li

We further propose a consistency-learning-based mean teacher model to effectively adapt the learned UDA model using labeled and unlabeled target samples.

Semi-supervised Domain Adaptation · Unsupervised Domain Adaptation
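
As background for the mean teacher model mentioned above, the sketch below shows the standard mean-teacher recipe: the teacher tracks an exponential moving average of the student, and a consistency loss aligns their predictions on unlabeled target samples. Names such as `ema_decay` are assumptions; this is a generic illustration, not the paper's two-stage approach.

```python
import copy
import torch
import torch.nn.functional as F

@torch.no_grad()
def ema_update(teacher, student, ema_decay=0.999):
    """Teacher weights track an exponential moving average of the student."""
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.data.mul_(ema_decay).add_(s_param.data, alpha=1.0 - ema_decay)

def consistency_step(student, teacher, x_unlabeled, optimizer, ema_decay=0.999):
    """One generic mean-teacher update on an unlabeled target batch."""
    with torch.no_grad():
        teacher_logits = teacher(x_unlabeled)  # pseudo-targets, no gradient
    student_logits = student(x_unlabeled)
    # Consistency loss: student predictions should match the teacher's.
    loss = F.mse_loss(student_logits.softmax(dim=-1), teacher_logits.softmax(dim=-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    ema_update(teacher, student, ema_decay)
    return loss.item()

# The teacher is typically initialized as a frozen copy of the student:
# teacher = copy.deepcopy(student)
# for p in teacher.parameters(): p.requires_grad_(False)
```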

SSGD: A smartphone screen glass dataset for defect detection

1 code implementation • 12 Mar 2023 • Haonan Han, Rui Yang, Shuyan Li, Runze Hu, Xiu Li

Interactive devices with touch screens have become commonly used in daily life, which raises the demand for high production quality of touch screen glass.

Defect Detection · object-detection · +1

SemanticAC: Semantics-Assisted Framework for Audio Classification

no code implementations • 12 Feb 2023 • Yicheng Xiao, Yue Ma, Shuyan Li, Hantao Zhou, Ran Liao, Xiu Li

In this paper, we propose SemanticAC, a semantics-assisted framework for Audio Classification to better leverage the semantic information.

Audio Classification · Language Modelling

Adversarial Alignment for Source Free Object Detection

no code implementations • 11 Jan 2023 • Qiaosong Chu, Shuyan Li, Guangyi Chen, Kai Li, Xiu Li

Source-free object detection (SFOD) aims to transfer a detector pre-trained on a label-rich source domain to an unlabeled target domain without seeing source data.

Object · object-detection · +1

Self-Supervised Video Hashing via Bidirectional Transformers

1 code implementation • CVPR 2021 • Shuyan Li, Xiu Li, Jiwen Lu, Jie Zhou

Most existing unsupervised video hashing methods are built on unidirectional models with less reliable training objectives, which underuse the correlations among frames and the similarity structure between videos.

Retrieval · Video Retrieval
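
For context on hashing with a bidirectional encoder, the sketch below shows the generic pattern of encoding frame features with a bidirectional (BERT-style) transformer and binarizing the pooled representation with a sign activation and a straight-through estimator so the discrete codes remain trainable. Parameter names such as `code_bits` are assumptions; this is not the paper's exact architecture or training objective.

```python
import torch
import torch.nn as nn

class BinarizeSTE(torch.autograd.Function):
    """Sign in the forward pass, identity gradient in the backward pass."""
    @staticmethod
    def forward(ctx, x):
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output  # straight-through estimator

class TransformerVideoHasher(nn.Module):
    """Encode a sequence of frame features with a bidirectional transformer
    and hash the pooled representation into +/-1 codes (illustrative sketch)."""

    def __init__(self, feat_dim=2048, d_model=256, code_bits=64, n_layers=2, n_heads=4):
        super().__init__()
        self.proj = nn.Linear(feat_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)  # bidirectional self-attention
        self.hash_head = nn.Linear(d_model, code_bits)

    def forward(self, frame_feats):                 # (batch, n_frames, feat_dim)
        h = self.encoder(self.proj(frame_feats))
        pooled = h.mean(dim=1)                      # simple temporal pooling
        return BinarizeSTE.apply(self.hash_head(pooled))  # (batch, code_bits) in {-1, +1}
```

At retrieval time, such binary codes are compared by Hamming distance, which is what makes hashing attractive for scalable video search.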

Neighborhood Preserving Hashing for Scalable Video Retrieval

no code implementations • ICCV 2019 • Shuyan Li, Zhixiang Chen, Jiwen Lu, Xiu Li, Jie Zhou

We then integrate the neighborhood attention mechanism into an RNN-based reconstruction scheme to encourage the binary codes to capture the spatial-temporal structure in a video which is consistent with that in the neighborhood.

Retrieval · Video Retrieval
