Search Results for author: Yehui Tang

Found 51 papers, 40 papers with code

Star-Agents: Automatic Data Optimization with LLM Agents for Instruction Tuning

1 code implementation 21 Nov 2024 Hang Zhou, Yehui Tang, Haochen Qin, Yujie Yang, Renren Jin, Deyi Xiong, Kai Han, Yunhe Wang

Our empirical studies, including instruction tuning experiments with models such as Pythia and LLaMA, demonstrate the effectiveness of the proposed framework.

MemoryFormer: Minimize Transformer Computation by Removing Fully-Connected Layers

no code implementations 20 Nov 2024 Ning Ding, Yehui Tang, Haochen Qin, Zhenli Zhou, Chao Xu, Lin Li, Kai Han, Heng Liao, Yunhe Wang

This is made possible by utilizing an alternative method for feature transformation to replace the linear projection of fully-connected layers.
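The excerpt only hints at how the linear projection is replaced. As a loosely hedged illustration of one lookup-based alternative (my own sketch, not MemoryFormer's exact layer), the module below hashes each input chunk into a bucket and sums learned memory vectors instead of performing a dense matrix multiplication:

```python
import torch
import torch.nn as nn

class HashingLookupProjection(nn.Module):
    """Illustrative stand-in for a fully-connected layer: hash each input chunk
    into a bucket and sum the corresponding learned memory vectors.
    A sketch of the general idea only, not MemoryFormer's actual design."""
    def __init__(self, dim_in, dim_out, num_chunks=8, bucket_bits=8):
        super().__init__()
        assert dim_in % num_chunks == 0
        self.num_chunks = num_chunks
        chunk_dim = dim_in // num_chunks
        # Random projection used only for computing hash codes (not trained).
        self.register_buffer("hash_proj", torch.randn(num_chunks, chunk_dim, bucket_bits))
        # One memory table of 2^bucket_bits learned vectors per chunk.
        self.tables = nn.Parameter(torch.randn(num_chunks, 2 ** bucket_bits, dim_out) * 0.02)
        self.register_buffer("powers", 2 ** torch.arange(bucket_bits))

    def forward(self, x):                                        # x: (batch, dim_in)
        chunks = x.view(x.size(0), self.num_chunks, -1)          # (B, C, chunk_dim)
        codes = torch.einsum("bcd,cdk->bck", chunks, self.hash_proj) > 0
        idx = (codes.long() * self.powers).sum(-1)               # (B, C) bucket ids
        # Gather one memory vector per chunk and sum them -- no dense matmul.
        out = torch.stack([self.tables[c, idx[:, c]] for c in range(self.num_chunks)], dim=1)
        return out.sum(dim=1)                                    # (B, dim_out)

x = torch.randn(4, 64)
print(HashingLookupProjection(64, 128)(x).shape)  # torch.Size([4, 128])
```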

Free Video-LLM: Prompt-guided Visual Perception for Efficient Training-free Video LLMs

1 code implementation 14 Oct 2024 Kai Han, Jianyuan Guo, Yehui Tang, Wei He, Enhua Wu, Yunhe Wang

Conversely, training-free approaches offer a more efficient alternative by adapting pre-trained image LLMs to video tasks without additional training, but they face inference-efficiency bottlenecks due to the large number of visual tokens generated from video frames.

Computational Efficiency Question Answering +2
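As a hedged sketch of prompt-guided token reduction (the function and scoring rule below are my own assumptions, not the paper's API), one can keep only the visual tokens most similar to the text-prompt embedding before handing them to the LLM:

```python
import torch

def select_prompt_relevant_tokens(visual_tokens, prompt_embedding, keep_ratio=0.25):
    """Keep the visual tokens whose cosine similarity to the prompt is highest.
    visual_tokens: (num_tokens, dim); prompt_embedding: (dim,)."""
    v = torch.nn.functional.normalize(visual_tokens, dim=-1)
    p = torch.nn.functional.normalize(prompt_embedding, dim=-1)
    scores = v @ p                                   # (num_tokens,)
    k = max(1, int(keep_ratio * visual_tokens.size(0)))
    keep = scores.topk(k).indices.sort().values      # preserve original token order
    return visual_tokens[keep]

tokens = torch.randn(2048, 768)       # e.g. tokens pooled from many video frames
prompt = torch.randn(768)             # embedding of the user question
print(select_prompt_relevant_tokens(tokens, prompt).shape)  # torch.Size([512, 768])
```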

Token Compensator: Altering Inference Cost of Vision Transformer without Re-Tuning

1 code implementation 13 Aug 2024 Shibo Jie, Yehui Tang, Jianyuan Guo, Zhi-Hong Deng, Kai Han, Yunhe Wang

Token compression expedites the training and inference of Vision Transformers (ViTs) by reducing the number of redundant tokens, e.g., by pruning inattentive tokens or merging similar tokens.

Fine-Grained Image Classification
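To make the pruning/merging idea above concrete, here is a minimal, hedged sketch of similarity-based token merging in the spirit of generic token-compression methods (not Token Compensator's own algorithm): tokens are split into two sets, matched by cosine similarity, and the r best-matched pairs are averaged.

```python
import torch

def merge_similar_tokens(x, r):
    """Merge the r most similar token pairs by averaging (bipartite matching
    between even- and odd-indexed tokens). x: (num_tokens, dim).
    Simplification: if several tokens pick the same partner, only the last
    written pair is kept exactly -- acceptable for a sketch."""
    a, b = x[0::2], x[1::2]                                    # two token sets
    sim = torch.nn.functional.normalize(a, dim=-1) @ \
          torch.nn.functional.normalize(b, dim=-1).T           # (|a|, |b|)
    best_sim, best_b = sim.max(dim=-1)                         # best partner in b for each a
    merge_a = best_sim.topk(r).indices                         # a-tokens to merge away
    keep_a = torch.ones(a.size(0), dtype=torch.bool)
    keep_a[merge_a] = False
    b = b.clone()
    b[best_b[merge_a]] = (b[best_b[merge_a]] + a[merge_a]) / 2  # average into partner
    return torch.cat([a[keep_a], b], dim=0)

tokens = torch.randn(197, 384)                   # e.g. ViT-S token sequence
print(merge_similar_tokens(tokens, r=40).shape)  # torch.Size([157, 384])
```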

ExCP: Extreme LLM Checkpoint Compression via Weight-Momentum Joint Shrinking

1 code implementation 17 Jun 2024 Wenshuo Li, Xinghao Chen, Han Shu, Yehui Tang, Yunhe Wang

For instance, we achieve approximately $70\times$ compression for the Pythia-410M model, with the final performance being as accurate as the original model on various downstream tasks.

Model Optimization Quantization
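The excerpt reports the compression ratio rather than the method, so the following is only a hedged sketch of the general recipe the title suggests: store a sparse, quantized residual between consecutive checkpoints, scoring entries jointly by weight change and optimizer momentum. The scoring rule, keep ratio, and quantizer here are illustrative assumptions, not ExCP's exact procedure.

```python
import torch

def compress_checkpoint(prev_weights, curr_weights, momentum, keep_ratio=0.1, bits=8):
    """Keep only a sparse, quantized residual between two checkpoints.
    Importance of each entry combines |weight change| and |optimizer momentum|."""
    residual = curr_weights - prev_weights
    importance = residual.abs() * (momentum.abs() + 1e-8)
    k = max(1, int(keep_ratio * residual.numel()))
    threshold = importance.flatten().topk(k).values.min()
    sparse = residual * (importance >= threshold)
    # Simple symmetric quantization of the surviving values.
    scale = sparse.abs().max() / (2 ** (bits - 1) - 1) + 1e-12
    quantized = torch.round(sparse / scale).to(torch.int8)
    return quantized, scale

def decompress_checkpoint(prev_weights, quantized, scale):
    return prev_weights + quantized.float() * scale

w0 = torch.randn(1000)
w1 = w0 + 0.01 * torch.randn(1000)               # next checkpoint
q, s = compress_checkpoint(w0, w1, torch.randn(1000))
print((q != 0).float().mean().item())            # roughly keep_ratio of entries survive
```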

SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization

3 code implementations 19 May 2024 Jialong Guo, Xinghao Chen, Yehui Tang, Yunhe Wang

However, replacing LayerNorm with the more efficient BatchNorm in transformers often leads to inferior performance and training collapse.

Image Classification Language Modelling +2
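To make the preceding point concrete, the sketch below shows one common way to stabilize the swap: blend LayerNorm and BatchNorm and anneal toward pure BatchNorm over training. This is a hedged illustration in the spirit of progressive re-parameterized BatchNorm, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class ProgressiveNorm(nn.Module):
    """Output = (1 - gamma) * LayerNorm(x) + gamma * BatchNorm(x), with gamma
    annealed from 0 to 1 so training starts with stable LN statistics and ends
    with BatchNorm only (which is cheap and foldable at inference)."""
    def __init__(self, dim):
        super().__init__()
        self.ln = nn.LayerNorm(dim)
        self.bn = nn.BatchNorm1d(dim)
        self.register_buffer("gamma", torch.zeros(()))

    def set_progress(self, step, total_steps):
        self.gamma.fill_(min(1.0, step / total_steps))

    def forward(self, x):                                   # x: (batch, tokens, dim)
        bn_out = self.bn(x.transpose(1, 2)).transpose(1, 2)  # BN over the channel dim
        return (1 - self.gamma) * self.ln(x) + self.gamma * bn_out

norm = ProgressiveNorm(64)
norm.set_progress(step=500, total_steps=1000)   # halfway: 50/50 blend
print(norm(torch.randn(8, 16, 64)).shape)       # torch.Size([8, 16, 64])
```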

No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding

3 code implementations 14 May 2024 Yingjie Zhai, Wenshuo Li, Yehui Tang, Xinghao Chen, Yunhe Wang

In this paper, we propose to squeeze the time axis of a video sequence into the channel dimension and present a lightweight video recognition network, termed SqueezeTime, for mobile video understanding.

Action Detection Video Recognition +1
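A minimal sketch of the squeeze idea described above: fold the temporal axis into the channel dimension so a plain 2D convolution mixes temporal information. The layer sizes are illustrative, not SqueezeTime's actual architecture.

```python
import torch
import torch.nn as nn

class SqueezeTimeStem(nn.Module):
    """Fold a (B, C, T, H, W) clip into (B, C*T, H, W) and apply a 2D conv,
    so temporal information is mixed through the channel dimension."""
    def __init__(self, channels, frames, out_channels):
        super().__init__()
        self.conv = nn.Conv2d(channels * frames, out_channels, kernel_size=3, padding=1)

    def forward(self, video):                     # video: (B, C, T, H, W)
        b, c, t, h, w = video.shape
        x = video.reshape(b, c * t, h, w)         # squeeze time into channels
        return self.conv(x)

clip = torch.randn(2, 3, 16, 112, 112)            # 16-frame RGB clip
print(SqueezeTimeStem(3, 16, 64)(clip).shape)     # torch.Size([2, 64, 112, 112])
```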

EMS-SD: Efficient Multi-sample Speculative Decoding for Accelerating Large Language Models

1 code implementation 13 May 2024 Yunsheng Ni, Chuanjian Liu, Yehui Tang, Kai Han, Yunhe Wang

Speculative decoding emerges as a pivotal technique for enhancing the inference speed of Large Language Models (LLMs).
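For readers unfamiliar with the base technique, here is a minimal sketch of single-sample greedy speculative decoding (a small model drafts a few tokens, the large model verifies them in one pass); EMS-SD's multi-sample batching optimizations are not shown, and the callable model interface is an assumption for illustration.

```python
import torch

def speculative_decode_step(draft_model, target_model, prefix, num_draft=4):
    """One round of greedy speculative decoding.
    Both models are callables: tokens (1, seq) -> logits (1, seq, vocab)."""
    draft_tokens = prefix.clone()
    for _ in range(num_draft):                              # small model drafts k tokens
        logits = draft_model(draft_tokens)
        next_tok = logits[:, -1].argmax(-1, keepdim=True)
        draft_tokens = torch.cat([draft_tokens, next_tok], dim=1)

    target_logits = target_model(draft_tokens)              # one pass over prefix + drafts
    accepted = prefix
    for i in range(num_draft):
        pos = prefix.size(1) + i
        target_choice = target_logits[:, pos - 1].argmax(-1, keepdim=True)
        accepted = torch.cat([accepted, target_choice], dim=1)
        if not torch.equal(target_choice, draft_tokens[:, pos:pos + 1]):
            break                                           # first mismatch: correct and stop
    return accepted

vocab = 100
toy = torch.nn.Embedding(vocab, vocab)                      # toy "model": logits via lookup
model = lambda tokens: toy(tokens)
prefix = torch.randint(0, vocab, (1, 5))
print(speculative_decode_step(model, model, prefix).shape)  # all drafts accepted: (1, 9)
```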

Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning

1 code implementation 9 May 2024 Shibo Jie, Yehui Tang, Ning Ding, Zhi-Hong Deng, Kai Han, Yunhe Wang

Current solutions for efficiently constructing large vision-language (VL) models follow a two-step paradigm: projecting the output of pre-trained vision encoders to the input space of pre-trained language models as visual prompts; and then transferring the models to downstream VL tasks via end-to-end parameter-efficient fine-tuning (PEFT).

parameter-efficient fine-tuning Visual Prompting
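As a hedged sketch of the two-step paradigm described above (the baseline setup the paper starts from, not its memory-space variant), visual features are projected into the language model's embedding space and prepended to the text embeddings as visual prompts:

```python
import torch
import torch.nn as nn

class VisualPromptProjector(nn.Module):
    """Project frozen vision-encoder features into the LM embedding space and
    prepend them to the text token embeddings as visual prompts."""
    def __init__(self, vision_dim, lm_dim):
        super().__init__()
        self.proj = nn.Linear(vision_dim, lm_dim)

    def forward(self, vision_feats, text_embeds):
        # vision_feats: (B, num_patches, vision_dim); text_embeds: (B, seq, lm_dim)
        prompts = self.proj(vision_feats)
        return torch.cat([prompts, text_embeds], dim=1)   # (B, num_patches + seq, lm_dim)

proj = VisualPromptProjector(vision_dim=1024, lm_dim=4096)
out = proj(torch.randn(1, 256, 1024), torch.randn(1, 32, 4096))
print(out.shape)   # torch.Size([1, 288, 4096])
```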

Kangaroo: Lossless Self-Speculative Decoding via Double Early Exiting

1 code implementation 29 Apr 2024 Fangcheng Liu, Yehui Tang, Zhenhua Liu, Yunsheng Ni, Kai Han, Yunhe Wang

It is noteworthy that the inference latency of the self-draft model may no longer be negligible compared to the large model, necessitating strategies to increase the token acceptance rate while minimizing the drafting steps of the small model.

GhostNetV3: Exploring the Training Strategies for Compact Models

1 code implementation 17 Apr 2024 Zhenhua Liu, Zhiwei Hao, Kai Han, Yehui Tang, Yunhe Wang

In this paper, by systematically investigating the impact of different training ingredients, we introduce a strong training strategy for compact models.

Knowledge Distillation object-detection +1

SAM-DiffSR: Structure-Modulated Diffusion Model for Image Super-Resolution

1 code implementation 27 Feb 2024 Chengcheng Wang, Zhiwei Hao, Yehui Tang, Jianyuan Guo, Yujie Yang, Kai Han, Yunhe Wang

In this paper, we propose the SAM-DiffSR model, which can utilize the fine-grained structure information from SAM in the process of sampling noise to improve the image quality without additional computational cost during inference.

Image Super-Resolution

DenseMamba: State Space Models with Dense Hidden Connection for Efficient Large Language Models

1 code implementation 26 Feb 2024 Wei He, Kai Han, Yehui Tang, Chengcheng Wang, Yujie Yang, Tianyu Guo, Yunhe Wang

Large language models (LLMs) face a daunting challenge due to the excessive computational and memory requirements of the commonly used Transformer architecture.

Mamba State Space Models

Data-efficient Large Vision Models through Sequential Autoregression

1 code implementation 7 Feb 2024 Jianyuan Guo, Zhiwei Hao, Chengcheng Wang, Yehui Tang, Han Wu, Han Hu, Kai Han, Chang Xu

Training general-purpose vision models on purely sequential visual data, eschewing linguistic inputs, has heralded a new frontier in visual understanding.

Rethinking Optimization and Architecture for Tiny Language Models

1 code implementation 5 Feb 2024 Yehui Tang, Fangcheng Liu, Yunsheng Ni, Yuchuan Tian, Zheyuan Bai, Yi-Qi Hu, Sichao Liu, Shangling Jui, Kai Han, Yunhe Wang

Several design formulas are empirically shown to be especially effective for tiny language models, including tokenizer compression, architecture tweaking, parameter inheritance, and multiple-round training.

Language Modelling

A Survey on Transformer Compression

no code implementations 5 Feb 2024 Yehui Tang, Yunhe Wang, Jianyuan Guo, Zhijun Tu, Kai Han, Hailin Hu, DaCheng Tao

Model compression methods reduce the memory and computational cost of Transformers, which is a necessary step to deploy large language/vision models on practical devices.

Knowledge Distillation Mamba +3

Circuit Design and Efficient Simulation of Quantum Inner Product and Empirical Studies of Its Effect on Near-Term Hybrid Quantum-Classic Machine Learning

1 code implementation CVPR 2024 Hao Xiong, Yehui Tang, Xinyu Ye, Junchi Yan

However, it remains unclear how to embody the quantum circuits (QC) for QIP, let alone how to thoroughly evaluate QIP circuits in a practical context in the NISQ era by applying QIP to ML via hybrid quantum-classical pipelines.

Image Classification Self-Supervised Learning

PanGu-$π$: Enhancing Language Model Architectures via Nonlinearity Compensation

no code implementations 27 Dec 2023 Yunhe Wang, Hanting Chen, Yehui Tang, Tianyu Guo, Kai Han, Ying Nie, Xutao Wang, Hailin Hu, Zheyuan Bai, Yun Wang, Fangcheng Liu, Zhicheng Liu, Jianyuan Guo, Sinan Zeng, Yinchen Zhang, Qinghua Xu, Qun Liu, Jun Yao, Chao Xu, DaCheng Tao

We then demonstrate that the proposed approach is significantly effective for enhancing the model nonlinearity through carefully designed ablations; thus, we present a new efficient model architecture for building modern LLMs, namely PanGu-$\pi$.

Language Modelling

TinySAM: Pushing the Envelope for Efficient Segment Anything Model

2 code implementations 21 Dec 2023 Han Shu, Wenshuo Li, Yehui Tang, Yiman Zhang, Yihao Chen, Houqiang Li, Yunhe Wang, Xinghao Chen

Extensive experiments on various zero-shot transfer tasks demonstrate the significantly advantageous performance of our TinySAM against counterpart methods.

Knowledge Distillation Quantization

CBQ: Cross-Block Quantization for Large Language Models

no code implementations 13 Dec 2023 Xin Ding, Xiaoyu Liu, Zhijun Tu, Yun Zhang, Wei Li, Jie Hu, Hanting Chen, Yehui Tang, Zhiwei Xiong, Baoqun Yin, Yunhe Wang

Post-training quantization (PTQ) has played a key role in compressing large language models (LLMs) with ultra-low costs.

Quantization

LightCLIP: Learning Multi-Level Interaction for Lightweight Vision-Language Models

no code implementations 1 Dec 2023 Ying Nie, Wei He, Kai Han, Yehui Tang, Tianyu Guo, Fanyi Du, Yunhe Wang

Moreover, based on the observation that the accuracy of the CLIP model does not increase correspondingly as the parameters of the text encoder increase, an extra masked language modeling (MLM) objective is leveraged to maximize the potential of the shortened text encoder.

Image Classification Image-text Retrieval +4

One-for-All: Bridge the Gap Between Heterogeneous Architectures in Knowledge Distillation

1 code implementation NeurIPS 2023 Zhiwei Hao, Jianyuan Guo, Kai Han, Yehui Tang, Han Hu, Yunhe Wang, Chang Xu

To tackle the challenge in distilling heterogeneous models, we propose a simple yet effective one-for-all KD framework called OFA-KD, which significantly improves the distillation performance between heterogeneous architectures.

Knowledge Distillation
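As background, the sketch below shows plain logit-space knowledge distillation, the architecture-agnostic space in which a heterogeneous teacher and student can be aligned; OFA-KD's additional projection of intermediate student features into this space is omitted, and the loss weighting is an illustrative assumption.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, temperature=4.0, alpha=0.5):
    """Standard logit-space knowledge distillation: KL divergence to the
    teacher's softened distribution plus cross-entropy to the labels."""
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

s, t = torch.randn(8, 100), torch.randn(8, 100)   # student / teacher logits
y = torch.randint(0, 100, (8,))
print(kd_loss(s, t, y).item())
```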

GPT4Image: Can Large Pre-trained Models Help Vision Models on Perception Tasks?

1 code implementation 1 Jun 2023 Ning Ding, Yehui Tang, Zhongqian Fu, Chao Xu, Kai Han, Yunhe Wang

We present a new learning paradigm in which the knowledge extracted from large pre-trained models is utilized to help models such as CNNs and ViTs learn enhanced representations and achieve better performance.

Descriptive Image Classification

Masked Image Modeling with Local Multi-Scale Reconstruction

1 code implementation CVPR 2023 Haoqing Wang, Yehui Tang, Yunhe Wang, Jianyuan Guo, Zhi-Hong Deng, Kai Han

The lower layers are not explicitly guided and the interaction among their patches is only used for calculating new activations.

Representation Learning

Network Expansion for Practical Training Acceleration

1 code implementation CVPR 2023 Ning Ding, Yehui Tang, Kai Han, Chao Xu, Yunhe Wang

Recently, the sizes of deep neural networks and training datasets have both increased drastically in pursuit of better practical performance.

FastMIM: Expediting Masked Image Modeling Pre-training for Vision

1 code implementation 13 Dec 2022 Jianyuan Guo, Kai Han, Han Wu, Yehui Tang, Yunhe Wang, Chang Xu

This paper presents FastMIM, a simple and generic framework for expediting masked image modeling with the following two steps: (i) pre-training vision backbones with low-resolution input images; and (ii) reconstructing Histograms of Oriented Gradients (HOG) feature instead of original RGB values of the input images.
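A hedged sketch of the two ingredients named above: downsample the image before patchifying, and use per-patch HOG features (computed here with skimage.feature.hog) as the reconstruction target instead of raw pixels. The patch layout and HOG parameters are illustrative assumptions, not FastMIM's exact configuration.

```python
import numpy as np
import torch
import torch.nn.functional as F
from skimage.feature import hog

def fastmim_targets(image, low_res=128, patch=16):
    """image: (H, W) grayscale numpy array. Returns per-patch HOG targets
    computed on a low-resolution version of the image."""
    img = torch.from_numpy(image)[None, None].float()
    img = F.interpolate(img, size=(low_res, low_res), mode="bilinear", align_corners=False)
    img = img[0, 0].numpy()                                   # (low_res, low_res)
    targets = []
    for y in range(0, low_res, patch):
        for x in range(0, low_res, patch):
            cell = img[y:y + patch, x:x + patch]
            targets.append(hog(cell, orientations=9, pixels_per_cell=(8, 8),
                               cells_per_block=(1, 1), feature_vector=True))
    return torch.tensor(np.stack(targets), dtype=torch.float32)

targets = fastmim_targets(np.random.rand(224, 224))
print(targets.shape)   # torch.Size([64, 36]): 8x8 patches, 4 cells x 9 orientations each
```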

GhostNetV2: Enhance Cheap Operation with Long-Range Attention

11 code implementations 23 Nov 2022 Yehui Tang, Kai Han, Jianyuan Guo, Chang Xu, Chao Xu, Yunhe Wang

The convolutional operation can only capture local information in a window region, which prevents performance from being further improved.

Vision GNN: An Image is Worth Graph of Nodes

9 code implementations 1 Jun 2022 Kai Han, Yunhe Wang, Jianyuan Guo, Yehui Tang, Enhua Wu

In this paper, we propose to represent the image as a graph structure and introduce a new Vision GNN (ViG) architecture to extract graph-level features for visual tasks.

Image Classification Object Detection
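A minimal sketch of the graph view described above: treat patch features as nodes, connect each node to its k nearest neighbours in feature space, and update nodes with a max-relative-style graph convolution. ViG's full Grapher/FFN blocks are not reproduced here, so treat this as an illustration of the idea rather than the architecture.

```python
import torch
import torch.nn as nn

def knn_graph_conv(patch_feats, k=9, proj=None):
    """patch_feats: (N, D). Build a k-NN graph in feature space and update each
    node with the max over (neighbour - node) differences, then a projection."""
    dists = torch.cdist(patch_feats, patch_feats)           # (N, N) pairwise distances
    idx = dists.topk(k + 1, largest=False).indices[:, 1:]   # drop self, keep k neighbours
    neighbours = patch_feats[idx]                           # (N, k, D)
    relative = neighbours - patch_feats[:, None, :]         # max-relative aggregation
    agg = torch.cat([patch_feats, relative.max(dim=1).values], dim=-1)   # (N, 2D)
    proj = proj or nn.Linear(agg.size(-1), patch_feats.size(-1))
    return proj(agg)

patches = torch.randn(196, 192)          # e.g. 14x14 patches of a 224x224 image
print(knn_graph_conv(patches).shape)     # torch.Size([196, 192])
```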

Source-Free Domain Adaptation via Distribution Estimation

1 code implementation CVPR 2022 Ning Ding, Yixing Xu, Yehui Tang, Chao Xu, Yunhe Wang, DaCheng Tao

Domain Adaptation aims to transfer the knowledge learned from a labeled source domain to an unlabeled target domain whose data distributions are different.

Privacy Preserving Source-Free Domain Adaptation

From Quantum Graph Computing to Quantum Graph Learning: A Survey

no code implementations 19 Feb 2022 Yehui Tang, Junchi Yan, Edwin Hancock

Quantum computing (QC) is a new computational paradigm whose foundations relate to quantum physics.

Graph Learning Survey

PyramidTNT: Improved Transformer-in-Transformer Baselines with Pyramid Architecture

1 code implementation 4 Jan 2022 Kai Han, Jianyuan Guo, Yehui Tang, Yunhe Wang

We hope this new baseline will be helpful for further research on, and applications of, vision transformers.

An Image Patch is a Wave: Phase-Aware Vision MLP

8 code implementations CVPR 2022 Yehui Tang, Kai Han, Jianyuan Guo, Chang Xu, Yanxi Li, Chao Xu, Yunhe Wang

To dynamically aggregate tokens, we propose to represent each token as a wave function with two parts, amplitude and phase.

Image Classification object-detection +2
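A hedged sketch of the wave representation described above: each token keeps its features as the amplitude and receives a learned phase, and tokens are aggregated through the real and imaginary parts of the resulting waves. The parameterization below is illustrative, not Wave-MLP's exact phase-aware token-mixing module.

```python
import torch
import torch.nn as nn

class WaveTokenMixer(nn.Module):
    """Aggregate tokens as waves amplitude * exp(i * phase), summed over tokens,
    keeping real and imaginary parts as two feature vectors."""
    def __init__(self, dim, num_tokens):
        super().__init__()
        self.phase = nn.Parameter(torch.zeros(num_tokens, dim))   # learned phase per token
        self.mix = nn.Linear(2 * dim, dim)

    def forward(self, x):                                 # x: (B, tokens, dim) = amplitudes
        real = (x * torch.cos(self.phase)).sum(dim=1)     # (B, dim)
        imag = (x * torch.sin(self.phase)).sum(dim=1)
        mixed = self.mix(torch.cat([real, imag], dim=-1))
        return x + mixed[:, None, :]                      # broadcast aggregate back to tokens

tokens = torch.randn(4, 196, 192)
print(WaveTokenMixer(192, 196)(tokens).shape)   # torch.Size([4, 196, 192])
```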

Hire-MLP: Vision MLP via Hierarchical Rearrangement

10 code implementations CVPR 2022 Jianyuan Guo, Yehui Tang, Kai Han, Xinghao Chen, Han Wu, Chao Xu, Chang Xu, Yunhe Wang

Previous vision MLPs such as MLP-Mixer and ResMLP accept linearly flattened image patches as input, making them inflexible for different input sizes and hard to capture spatial information.

Image Classification object-detection +2

Homogeneous Architecture Augmentation for Neural Predictor

1 code implementation ICCV 2021 Yuqiao Liu, Yehui Tang, Yanan Sun

Specifically, a homogeneous architecture augmentation algorithm is proposed in HAAP to generate sufficient training data by making use of the homogeneous representation.

Neural Architecture Search

CMT: Convolutional Neural Networks Meet Vision Transformers

14 code implementations CVPR 2022 Jianyuan Guo, Kai Han, Han Wu, Yehui Tang, Xinghao Chen, Yunhe Wang, Chang Xu

Vision transformers have been successfully applied to image recognition tasks due to their ability to capture long-range dependencies within an image.

ReNAS: Relativistic Evaluation of Neural Architecture Search

7 code implementations CVPR 2021 Yixing Xu, Yunhe Wang, Kai Han, Yehui Tang, Shangling Jui, Chunjing Xu, Chang Xu

An effective and efficient architecture performance evaluation scheme is essential for the success of Neural Architecture Search (NAS).

Neural Architecture Search

Patch Slimming for Efficient Vision Transformers

no code implementations CVPR 2022 Yehui Tang, Kai Han, Yunhe Wang, Chang Xu, Jianyuan Guo, Chao Xu, DaCheng Tao

We first identify the effective patches in the last layer and then use them to guide the patch selection process of previous layers.

Efficient ViTs

Vision Transformer Pruning

2 code implementations 17 Apr 2021 Mingjian Zhu, Yehui Tang, Kai Han

Vision transformers have achieved competitive performance on a variety of computer vision applications.

Manifold Regularized Dynamic Network Pruning

7 code implementations CVPR 2021 Yehui Tang, Yunhe Wang, Yixing Xu, Yiping Deng, Chao Xu, DaCheng Tao, Chang Xu

Then, the manifold relationship between instances and the pruned sub-networks will be aligned in the training procedure.

Network Pruning

Learning Frequency Domain Approximation for Binary Neural Networks

3 code implementations NeurIPS 2021 Yixing Xu, Kai Han, Chang Xu, Yehui Tang, Chunjing Xu, Yunhe Wang

Binary neural networks (BNNs) represent the original full-precision weights and activations with 1 bit using the sign function.
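For context on the sign-function binarization mentioned above, here is a standard straight-through-estimator sketch; the frequency-domain approximation that is the paper's actual contribution is not shown.

```python
import torch

class BinarizeSTE(torch.autograd.Function):
    """Forward: sign(x) in {-1, +1}. Backward: straight-through estimator,
    passing gradients only where |x| <= 1."""
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.where(x >= 0, torch.ones_like(x), -torch.ones_like(x))

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        return grad_output * (x.abs() <= 1).float()

w = torch.randn(5, requires_grad=True)
BinarizeSTE.apply(w).sum().backward()
print(w.grad)   # 1 where |w| <= 1, else 0
```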

A Survey on Visual Transformer

no code implementations 23 Dec 2020 Kai Han, Yunhe Wang, Hanting Chen, Xinghao Chen, Jianyuan Guo, Zhenhua Liu, Yehui Tang, An Xiao, Chunjing Xu, Yixing Xu, Zhaohui Yang, Yiman Zhang, DaCheng Tao

Transformer, first applied to the field of natural language processing, is a type of deep neural network mainly based on the self-attention mechanism.

Image Classification Inductive Bias +1

SCOP: Scientific Control for Reliable Neural Network Pruning

4 code implementations NeurIPS 2020 Yehui Tang, Yunhe Wang, Yixing Xu, DaCheng Tao, Chunjing Xu, Chao Xu, Chang Xu

To increase the reliability of the results, we prefer to have a more rigorous research design by including a scientific control group as an essential part to minimize the effect of all factors except the association between the filter and expected network output.

Network Pruning

A Semi-Supervised Assessor of Neural Architectures

no code implementations CVPR 2020 Yehui Tang, Yunhe Wang, Yixing Xu, Hanting Chen, Chunjing Xu, Boxin Shi, Chao Xu, Qi Tian, Chang Xu

A graph convolutional neural network is introduced to predict the performance of architectures based on the learned representations and their relation modeled by the graph.

Neural Architecture Search

Beyond Dropout: Feature Map Distortion to Regularize Deep Neural Networks

2 code implementations 23 Feb 2020 Yehui Tang, Yunhe Wang, Yixing Xu, Boxin Shi, Chao Xu, Chunjing Xu, Chang Xu

On one hand, massive trainable parameters significantly enhance the performance of these deep networks.

ReNAS: Relativistic Evaluation of Neural Architecture Search

4 code implementations 30 Sep 2019 Yixing Xu, Yunhe Wang, Kai Han, Yehui Tang, Shangling Jui, Chunjing Xu, Chang Xu

An effective and efficient architecture performance evaluation scheme is essential for the success of Neural Architecture Search (NAS).

Neural Architecture Search

Bringing Giant Neural Networks Down to Earth with Unlabeled Data

no code implementations 13 Jul 2019 Yehui Tang, Shan You, Chang Xu, Boxin Shi, Chao Xu

Specifically, we exploit the unlabeled data to mimic the classification characteristics of giant networks, so that the original capacity can be preserved nicely.
