Search Results for author: Yizeng Han

Found 18 papers, 12 papers with code

Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation

1 code implementation • 18 Mar 2024 • Wangbo Zhao, Jiasheng Tang, Yizeng Han, Yibing Song, Kai Wang, Gao Huang, Fan Wang, Yang You

Existing parameter-efficient fine-tuning (PEFT) methods have achieved significant success in adapting vision transformers (ViTs) by improving parameter efficiency.

Semantic Segmentation, Video Recognition

GRA: Detecting Oriented Objects through Group-wise Rotating and Attention

no code implementations • 17 Mar 2024 • Jiangshan Wang, Yifan Pu, Yizeng Han, Jiayi Guo, Yiru Wang, Xiu Li, Gao Huang

GRA can adaptively capture fine-grained features of objects with diverse orientations, comprising two key components: Group-wise Rotating and Group-wise Attention.

Object, object-detection, +2

SimPro: A Simple Probabilistic Framework Towards Realistic Long-Tailed Semi-Supervised Learning

1 code implementation • 21 Feb 2024 • Chaoqun Du, Yizeng Han, Gao Huang

Recent advancements in semi-supervised learning have focused on a more realistic yet challenging task: addressing imbalances in labeled data while the class distribution of unlabeled data remains both unknown and potentially mismatched.

Mask Grounding for Referring Image Segmentation

no code implementations • 19 Dec 2023 • Yong Xien Chng, Henry Zheng, Yizeng Han, Xuchong Qiu, Gao Huang

To tackle this challenge, we introduce a novel Mask Grounding auxiliary task that significantly improves visual grounding within language features, by explicitly teaching the model to learn fine-grained correspondence between masked textual tokens and their matching visual objects.

Image Segmentation, Referring Expression Segmentation, +4

GSVA: Generalized Segmentation via Multimodal Large Language Models

no code implementations • 15 Dec 2023 • Zhuofan Xia, Dongchen Han, Yizeng Han, Xuran Pan, Shiji Song, Gao Huang

Generalized Referring Expression Segmentation (GRES) extends the scope of classic RES to refer to multiple objects in one expression or to identify empty targets that are absent from the image.

Generalized Referring Expression Segmentation, Referring Expression, +1

Agent Attention: On the Integration of Softmax and Linear Attention

2 code implementations • 14 Dec 2023 • Dongchen Han, Tianzhu Ye, Yizeng Han, Zhuofan Xia, Shiji Song, Gao Huang

Specifically, the Agent Attention, denoted as a quadruple $(Q, A, K, V)$, introduces an additional set of agent tokens $A$ into the conventional attention module.

Computational Efficiency, Image Classification, +4

Fine-grained Recognition with Learnable Semantic Data Augmentation

1 code implementation • 1 Sep 2023 • Yifan Pu, Yizeng Han, Yulin Wang, Junlan Feng, Chao Deng, Gao Huang

Since images belonging to the same meta-category usually share similar visual appearances, mining discriminative visual cues is the key to distinguishing fine-grained categories.

Data Augmentation, Fine-Grained Image Recognition, +2

Latency-aware Unified Dynamic Networks for Efficient Image Recognition

1 code implementation • 30 Aug 2023 • Yizeng Han, Zeyu Liu, Zhihang Yuan, Yifan Pu, Chaofei Wang, Shiji Song, Gao Huang

Dynamic computation has emerged as a promising avenue to enhance the inference efficiency of deep networks.


Computation-efficient Deep Learning for Computer Vision: A Survey

no code implementations • 27 Aug 2023 • Yulin Wang, Yizeng Han, Chaofei Wang, Shiji Song, Qi Tian, Gao Huang

Over the past decade, deep learning models have exhibited considerable advancements, reaching or even exceeding human-level performance in a range of visual perception tasks.

Autonomous Vehicles, Edge-computing, +1

FLatten Transformer: Vision Transformer using Focused Linear Attention

1 code implementation • ICCV 2023 • Dongchen Han, Xuran Pan, Yizeng Han, Shiji Song, Gao Huang

The quadratic computation complexity of self-attention has been a persistent challenge when applying Transformer models to vision tasks.

Dynamic Perceiver for Efficient Visual Recognition

1 code implementation • ICCV 2023 • Yizeng Han, Dongchen Han, Zeyu Liu, Yulin Wang, Xuran Pan, Yifan Pu, Chao Deng, Junlan Feng, Shiji Song, Gao Huang

Early exits are placed exclusively within the classification branch, thus eliminating the need for linear separability in low-level features.

Action Recognition, Classification, +4

Adaptive Rotated Convolution for Rotated Object Detection

1 code implementation • ICCV 2023 • Yifan Pu, Yiru Wang, Zhuofan Xia, Yizeng Han, Yulin Wang, Weihao Gan, Zidong Wang, Shiji Song, Gao Huang

In our ARC module, the convolution kernels rotate adaptively to extract object features with varying orientations in different images, and an efficient conditional computation mechanism is introduced to accommodate the large orientation variations of objects within an image.

Ranked #3 on Object Detection In Aerial Images on DOTA (using extra training data)

Object, object-detection, +2

Latency-aware Spatial-wise Dynamic Networks

2 code implementations • 12 Oct 2022 • Yizeng Han, Zhihang Yuan, Yifan Pu, Chenhao Xue, Shiji Song, Guangyu Sun, Gao Huang

The latency prediction model can efficiently estimate the inference latency of dynamic networks by simultaneously considering algorithms, scheduling strategies, and hardware properties.

Image Classification, Instance Segmentation, +4

Learning to Weight Samples for Dynamic Early-exiting Networks

1 code implementation • 17 Sep 2022 • Yizeng Han, Yifan Pu, Zihang Lai, Chaofei Wang, Shiji Song, Junfen Cao, Wenhui Huang, Chao Deng, Gao Huang

Intuitively, easy samples, which generally exit early in the network during inference, should contribute more to training early classifiers.


CAM-loss: Towards Learning Spatially Discriminative Feature Representations

no code implementations • ICCV 2021 • Chaofei Wang, Jiayu Xiao, Yizeng Han, Qisen Yang, Shiji Song, Gao Huang

The backbone of a traditional CNN classifier is generally regarded as a feature extractor, followed by a linear layer that performs the classification.

Few-Shot Learning, Image Classification, +2

Adaptive Focus for Efficient Video Recognition

1 code implementation • ICCV 2021 • Yulin Wang, Zhaoxi Chen, Haojun Jiang, Shiji Song, Yizeng Han, Gao Huang

In this paper, we explore the spatial redundancy in video recognition with the aim of improving computational efficiency.

Computational Efficiency, Video Recognition

Resolution Adaptive Networks for Efficient Inference

2 code implementations • CVPR 2020 • Le Yang, Yizeng Han, Xi Chen, Shiji Song, Jifeng Dai, Gao Huang

Adaptive inference is an effective mechanism to achieve a dynamic tradeoff between accuracy and computational cost in deep networks.
