Search Results for author: Yizeng Han

Found 18 papers, 12 papers with code

Agent Attention: On the Integration of Softmax and Linear Attention

2 code implementations • 14 Dec 2023 • Dongchen Han, Tianzhu Ye, Yizeng Han, Zhuofan Xia, Shiji Song, Gao Huang

Specifically, the Agent Attention, denoted as a quadruple $(Q, A, K, V)$, introduces an additional set of agent tokens $A$ into the conventional attention module.

Computational Efficiency Image Classification +4

339

FLatten Transformer: Vision Transformer using Focused Linear Attention

1 code implementation • ICCV 2023 • Dongchen Han, Xuran Pan, Yizeng Han, Shiji Song, Gao Huang

The quadratic computational complexity of self-attention has been a persistent challenge when applying Transformer models to vision tasks.

332

Resolution Adaptive Networks for Efficient Inference

2 code implementations • CVPR 2020 • Le Yang, Yizeng Han, Xi Chen, Shiji Song, Jifeng Dai, Gao Huang

Adaptive inference is an effective mechanism to achieve a dynamic tradeoff between accuracy and computational cost in deep networks.

143

Adaptive Focus for Efficient Video Recognition

1 code implementation • ICCV 2021 • Yulin Wang, Zhaoxi Chen, Haojun Jiang, Shiji Song, Yizeng Han, Gao Huang

In this paper, we explore the spatial redundancy in video recognition with the aim of improving computational efficiency.

Computational Efficiency Video Recognition

120

Adaptive Rotated Convolution for Rotated Object Detection

1 code implementation • ICCV 2023 • Yifan Pu, Yiru Wang, Zhuofan Xia, Yizeng Han, Yulin Wang, Weihao Gan, Zidong Wang, Shiji Song, Gao Huang

In our ARC module, the convolution kernels rotate adaptively to extract object features with varying orientations in different images, and an efficient conditional computation mechanism is introduced to accommodate the large orientation variations of objects within an image.

Ranked #3 on Object Detection In Aerial Images on DOTA (using extra training data)

Object object-detection +2

Dynamic Perceiver for Efficient Visual Recognition

1 code implementation • ICCV 2023 • Yizeng Han, Dongchen Han, Zeyu Liu, Yulin Wang, Xuran Pan, Yifan Pu, Chao Deng, Junlan Feng, Shiji Song, Gao Huang

Early exits are placed exclusively within the classification branch, thus eliminating the need for linear separability in low-level features.

Action Recognition Classification +4

Latency-aware Spatial-wise Dynamic Networks

2 code implementations • 12 Oct 2022 • Yizeng Han, Zhihang Yuan, Yifan Pu, Chenhao Xue, Shiji Song, Guangyu Sun, Gao Huang

The latency prediction model can efficiently estimate the inference latency of dynamic networks by simultaneously considering algorithms, scheduling strategies, and hardware properties.

Image Classification Instance Segmentation +4

Latency-aware Unified Dynamic Networks for Efficient Image Recognition

1 code implementation • 30 Aug 2023 • Yizeng Han, Zeyu Liu, Zhihang Yuan, Yifan Pu, Chaofei Wang, Shiji Song, Gao Huang

Dynamic computation has emerged as a promising avenue to enhance the inference efficiency of deep networks.

Scheduling

Learning to Weight Samples for Dynamic Early-exiting Networks

1 code implementation • 17 Sep 2022 • Yizeng Han, Yifan Pu, Zihang Lai, Chaofei Wang, Shiji Song, Junfen Cao, Wenhui Huang, Chao Deng, Gao Huang

Intuitively, easy samples, which generally exit early in the network during inference, should contribute more to training early classifiers.

Meta-Learning

Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation

1 code implementation • 18 Mar 2024 • Wangbo Zhao, Jiasheng Tang, Yizeng Han, Yibing Song, Kai Wang, Gao Huang, Fan Wang, Yang You

Existing parameter-efficient fine-tuning (PEFT) methods have achieved significant success in adapting vision transformers (ViTs) by improving parameter efficiency.

Semantic Segmentation Video Recognition

Fine-grained Recognition with Learnable Semantic Data Augmentation

1 code implementation • 1 Sep 2023 • Yifan Pu, Yizeng Han, Yulin Wang, Junlan Feng, Chao Deng, Gao Huang

Since images belonging to the same meta-category usually share similar visual appearances, mining discriminative visual cues is the key to distinguishing fine-grained categories.

Data Augmentation Fine-Grained Image Recognition +2

SimPro: A Simple Probabilistic Framework Towards Realistic Long-Tailed Semi-Supervised Learning

1 code implementation • 21 Feb 2024 • Chaoqun Du, Yizeng Han, Gao Huang

Recent advancements in semi-supervised learning have focused on a more realistic yet challenging task: addressing imbalances in labeled data while the class distribution of unlabeled data remains both unknown and potentially mismatched.

Dynamic Neural Networks: A Survey

no code implementations • 9 Feb 2021 • Yizeng Han, Gao Huang, Shiji Song, Le Yang, Honghui Wang, Yulin Wang

Dynamic neural networks are an emerging research topic in deep learning.

Computational Efficiency Decision Making

CAM-loss: Towards Learning Spatially Discriminative Feature Representations

no code implementations • ICCV 2021 • Chaofei Wang, Jiayu Xiao, Yizeng Han, Qisen Yang, Shiji Song, Gao Huang

The backbone of a traditional CNN classifier is generally regarded as a feature extractor, followed by a linear layer that performs the classification.

Few-Shot Learning Image Classification +2

Computation-efficient Deep Learning for Computer Vision: A Survey

no code implementations • 27 Aug 2023 • Yulin Wang, Yizeng Han, Chaofei Wang, Shiji Song, Qi Tian, Gao Huang

Over the past decade, deep learning models have exhibited considerable advancements, reaching or even exceeding human-level performance in a range of visual perception tasks.

Autonomous Vehicles Edge-computing +1

GSVA: Generalized Segmentation via Multimodal Large Language Models

no code implementations • 15 Dec 2023 • Zhuofan Xia, Dongchen Han, Yizeng Han, Xuran Pan, Shiji Song, Gao Huang

Generalized Referring Expression Segmentation (GRES) extends the scope of classic RES to expressions that refer to multiple objects or to empty targets that are absent from the image.

Generalized Referring Expression Segmentation Referring Expression +1

Mask Grounding for Referring Image Segmentation

no code implementations • 19 Dec 2023 • Yong Xien Chng, Henry Zheng, Yizeng Han, Xuchong Qiu, Gao Huang

To tackle this challenge, we introduce a novel Mask Grounding auxiliary task that significantly improves visual grounding within language features, by explicitly teaching the model to learn fine-grained correspondence between masked textual tokens and their matching visual objects.

Ranked #2 on Referring Expression Segmentation on RefCOCO testB

Image Segmentation Referring Expression Segmentation +4

GRA: Detecting Oriented Objects through Group-wise Rotating and Attention

no code implementations • 17 Mar 2024 • Jiangshan Wang, Yifan Pu, Yizeng Han, Jiayi Guo, Yiru Wang, Xiu Li, Gao Huang

GRA can adaptively capture fine-grained features of objects with diverse orientations, comprising two key components: Group-wise Rotating and Group-wise Attention.

Object object-detection +2
