Search Results for author: Shenghao Xie

Found 10 papers, 2 papers with code

Keeping Representation Similarity in Finetuning for Medical Image Analysis

no code implementations • 10 Mar 2025 • Wenqiang Zu, Shenghao Xie, Hao Chen, Yiming Liang, Lei Ma

Foundation models pretrained on large-scale natural images are widely adapted to medical image analysis through finetuning.

Image Classification Medical Image Analysis +1

A Survey on Image Quality Assessment: Insights, Analysis, and Future Outlook

no code implementations • 12 Feb 2025 • Chengqian Ma, Zhengyi Shi, Zhiqiang Lu, Shenghao Xie, Fei Chao, Yao Sui

Image quality assessment (IQA) represents a pivotal challenge in image-focused technologies, significantly influencing the advancement trajectory of image processing and computer vision.

Image Quality Assessment Survey

Towards Unifying Understanding and Generation in the Era of Vision Foundation Models: A Survey from the Autoregression Perspective

no code implementations • 29 Oct 2024 • Shenghao Xie, Wenqiang Zu, Mingyang Zhao, Duo Su, Shilong Liu, Ruohua Shi, Guoqi Li, Shanghang Zhang, Lei Ma

First, we present the trend for the next generation of vision foundation models, i.e., unifying both understanding and generation in vision tasks.

Survey

Embedded Prompt Tuning: Towards Enhanced Calibration of Pretrained Models for Medical Images

1 code implementation • 1 Jul 2024 • Wenqiang Zu, Shenghao Xie, Qing Zhao, Guoqi Li, Lei Ma

In this work, we facilitate the study of the performance of parameter-efficient fine-tuning (PEFT) when adapting foundation models to medical image classification tasks.

Cross-Domain Few-Shot Image Classification +3

FM-OV3D: Foundation Model-based Cross-modal Knowledge Blending for Open-Vocabulary 3D Detection

no code implementations • 22 Dec 2023 • Dongmei Zhang, Chang Li, Ray Zhang, Shenghao Xie, Wei Xue, Xiaodong Xie, Shanghang Zhang

In this work, we propose FM-OV3D, a method of Foundation Model-based Cross-modal Knowledge Blending for Open-Vocabulary 3D Detection, which improves the open-vocabulary localization and recognition abilities of 3D models by blending knowledge from multiple pre-trained foundation models, achieving true open-vocabulary detection without the constraints of the original 3D datasets.

3D Object Detection 3D Open-Vocabulary Object Detection +2

Unmasking Transformers: A Theoretical Approach to Data Recovery via Attention Weights

no code implementations • 19 Oct 2023 • Yichuan Deng, Zhao Song, Shenghao Xie, Chiwun Yang

In the realm of deep learning, transformers have emerged as a dominant architecture, particularly in natural language processing tasks.

PatchBackdoor: Backdoor Attack against Deep Neural Networks without Model Modification

1 code implementation • 22 Aug 2023 • Yizhen Yuan, Rui Kong, Shenghao Xie, Yuanchun Li, Yunxin Liu

However, most backdoor attacks have to modify the neural network models through training with poisoned data and/or direct model editing, which leads to a common but false belief that backdoor attacks can be easily avoided by properly protecting the model.

Backdoor Attack Real-World Adversarial Attack

Convergence of Two-Layer Regression with Nonlinear Units

no code implementations • 16 Aug 2023 • Yichuan Deng, Zhao Song, Shenghao Xie

The softmax unit and the ReLU unit are key structures in attention computation.

regression

In-Context Learning for Attention Scheme: from Single Softmax Regression to Multiple Softmax Regression via a Tensor Trick

no code implementations • 5 Jul 2023 • Yeqi Gao, Zhao Song, Shenghao Xie

Given matrices $A_1 \in \mathbb{R}^{n \times d}$, $A_2 \in \mathbb{R}^{n \times d}$, and $B \in \mathbb{R}^{n \times n}$, the goal is to solve the following optimization problems: the normalized version $\min_{X} \| D(X)^{-1} \exp(A_1 X A_2^\top) - B \|_F^2$ and the rescaled version $\min_{X} \| \exp(A_1 X A_2^\top) - D(X) \cdot B \|_F^2$.

In-Context Learning Natural Language Understanding +1
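
As a rough illustration of the two objectives quoted in the entry above, the following sketch evaluates both losses for a fixed $X$. The excerpt does not define $D(X)$; the code assumes it is the diagonal matrix of row sums of $\exp(A_1 X A_2^\top)$, so that $D(X)^{-1} \exp(A_1 X A_2^\top)$ is a row-wise softmax. The function names are hypothetical, and plain Python lists are used only to keep the example self-contained.

```python
import math


def matmul(P, Q):
    """Multiply two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]


def transpose(P):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*P)]


def exp_scores(A1, X, A2):
    """Entrywise exp(A1 X A2^T); A1 and A2 are n x d, X is d x d, result is n x n."""
    S = matmul(matmul(A1, X), transpose(A2))
    return [[math.exp(s) for s in row] for row in S]


def normalized_loss(A1, A2, B, X):
    """|| D(X)^{-1} exp(A1 X A2^T) - B ||_F^2.

    Assumption: D(X) is diagonal with the row sums of exp(A1 X A2^T).
    """
    total = 0.0
    for e_row, b_row in zip(exp_scores(A1, X, A2), B):
        d = sum(e_row)  # assumed diagonal entry of D(X) for this row
        total += sum((e / d - b) ** 2 for e, b in zip(e_row, b_row))
    return total


def rescaled_loss(A1, A2, B, X):
    """|| exp(A1 X A2^T) - D(X) B ||_F^2 under the same assumption on D(X)."""
    total = 0.0
    for e_row, b_row in zip(exp_scores(A1, X, A2), B):
        d = sum(e_row)  # assumed diagonal entry of D(X) for this row
        total += sum((e - d * b) ** 2 for e, b in zip(e_row, b_row))
    return total
```

Under this assumption, each term of the rescaled loss equals the matching normalized term multiplied by the squared row sum, so the two objectives share the same zero-loss solutions but weight the rows differently.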
