no code implementations • 29 Mar 2024 • Haikuo Shao, Huihong Shi, Wendong Mao, Zhongfeng Wang
Vision Transformers (ViTs) have achieved significant success in computer vision.
no code implementations • 17 Dec 2023 • Siyu Zhang, Wendong Mao, Huihong Shi, Zhongfeng Wang
Video compression is widely used in digital television, surveillance systems, and virtual reality.
1 code implementation • 16 Aug 2023 • Minghao She, Wendong Mao, Huihong Shi, Zhongfeng Wang
In this paper, we propose a double-win framework for ideal and blind SR tasks, named S2R, comprising a lightweight transformer-based SR model (the S2R transformer) and a novel coarse-to-fine training strategy, which achieves excellent visual results under both ideal and random fuzzy conditions.
1 code implementation • NeurIPS 2023 • Haoran You, Huihong Shi, Yipin Guo, Yingyan Lin
To marry the best of both worlds, we further propose a new mixture-of-experts (MoE) framework that reparameterizes MLPs by taking multiplication or its primitives, e.g., multiplication and shift, as experts, and design a new latency-aware load-balancing loss.
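The MoE idea above can be illustrated with a minimal sketch (not the paper's code; all names and latency numbers here are illustrative assumptions): each output channel of a linear layer is routed either to a full-multiplication expert or to a cheaper power-of-two shift expert, and a toy latency-aware penalty discourages over-use of the expensive expert.

```python
import numpy as np

def shift_quantize(w):
    """Round weights to signed powers of two, so a multiply becomes a bit-shift."""
    sign = np.sign(w)
    mag = np.abs(w)
    exp = np.round(np.log2(np.clip(mag, 1e-8, None)))
    return sign * (2.0 ** exp)

def moe_linear(x, w, gate):
    """gate[j] in {0, 1}: 1 -> multiplication expert, 0 -> shift expert."""
    w_mixed = gate * w + (1 - gate) * shift_quantize(w)
    return x @ w_mixed

def load_balance_loss(gate, latency_mult=1.0, latency_shift=0.3):
    """Toy latency-aware load-balancing penalty: expected per-channel latency.
    The latency constants are made-up placeholders, not measured values."""
    frac_mult = gate.mean()
    return frac_mult * latency_mult + (1 - frac_mult) * latency_shift

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
w = rng.standard_normal((8, 16))
gate = (rng.random(16) < 0.5).astype(float)  # hard routing for illustration
y = moe_linear(x, w, gate)
```

In the actual framework the routing would be learned jointly with the weights; the hard random gate here only shows the reparameterization structure.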
1 code implementation • 9 Nov 2022 • Jyotikrishna Dass, Shang Wu, Huihong Shi, Chaojian Li, Zhifan Ye, Zhongfeng Wang, Yingyan Lin
Unlike sparsity-based Transformer accelerators for NLP, ViTALiTy unifies both low-rank and sparse components of the attention in ViTs.
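The low-rank-plus-sparse structure that ViTALiTy exploits can be sketched as follows (an illustration of the decomposition, not the accelerator's actual kernel): approximate a softmax attention map with a low-rank component via truncated SVD, then keep only the largest entries of the residual as a sparse component.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def low_rank_plus_sparse(A, rank=2, keep=0.1):
    """Split attention map A into a rank-`rank` part L and a sparse residual S."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    L = (U[:, :rank] * s[:rank]) @ Vt[:rank]          # low-rank component
    R = A - L                                         # residual
    thresh = np.quantile(np.abs(R), 1 - keep)         # keep largest ~10% of entries
    S = np.where(np.abs(R) >= thresh, R, 0.0)         # sparse component
    return L, S

rng = np.random.default_rng(0)
q = rng.standard_normal((16, 8))
k = rng.standard_normal((16, 8))
A = softmax(q @ k.T / np.sqrt(8))                     # toy single-head attention
L, S = low_rank_plus_sparse(A)
```

Adding the sparse residual on top of the low-rank approximation tightens the reconstruction, which is the property a unified low-rank/sparse datapath can exploit.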
2 code implementations • 24 Oct 2022 • Huihong Shi, Haoran You, Yang Zhao, Zhongfeng Wang, Yingyan Lin
Multiplication is arguably the most cost-dominant operation in modern deep neural networks (DNNs), limiting their achievable efficiency and thus more extensive deployment in resource-constrained applications.
1 code implementation • 18 Oct 2022 • Haoran You, Zhanyi Sun, Huihong Shi, Zhongzhi Yu, Yang Zhao, Yongan Zhang, Chaojian Li, Baopu Li, Yingyan Lin
Specifically, on the algorithm level, ViTCoD prunes and polarizes the attention maps into either denser or sparser fixed patterns, regularizing two levels of workloads without hurting accuracy; this largely reduces the attention computations while leaving room to alleviate the remaining dominant data movements. On top of that, we integrate a lightweight, learnable auto-encoder module that trades the dominant high-cost data movements for lower-cost computations.
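The prune-and-polarize step can be sketched in a few lines (a simplified illustration, not ViTCoD's algorithm; the threshold and grouping rule are assumptions): prune attention scores below a global threshold, then partition rows into a denser and a sparser group so each group presents a more regular workload.

```python
import numpy as np

def polarize(attn, keep=0.3):
    """Prune to ~`keep` density, then split rows into denser/sparser groups."""
    thresh = np.quantile(attn, 1 - keep)          # global pruning threshold
    mask = attn >= thresh                         # pruned binary attention pattern
    density = mask.mean(axis=-1)                  # per-row (per-query) density
    med = np.median(density)
    denser = np.where(density >= med)[0]          # rows handled as dense workload
    sparser = np.where(density < med)[0]          # rows handled as sparse workload
    return mask, denser, sparser

rng = np.random.default_rng(0)
attn = rng.random((8, 8))                         # toy attention scores
mask, denser, sparser = polarize(attn)
```

In the paper this polarization is done during training so accuracy is preserved; the post-hoc thresholding here only shows the resulting two-workload structure.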
1 code implementation • 17 May 2022 • Haoran You, Baopu Li, Huihong Shi, Yonggan Fu, Yingyan Lin
To this end, this work advocates hybrid NNs that combine powerful yet costly multiplications with efficient yet less powerful operators to marry the best of both worlds, and proposes ShiftAddNAS, which can automatically search for more accurate and more efficient NNs.
no code implementations • 7 Jan 2021 • Haoran You, Randall Balestriero, Zhihan Lu, Yutong Kou, Huihong Shi, Shunyao Zhang, Shang Wu, Yingyan Lin, Richard Baraniuk
In this paper, we study the importance of pruning in Deep Networks (DNs) and the yin & yang relationship between (1) pruning highly overparametrized DNs that have been trained from random initialization and (2) training small DNs that have been "cleverly" initialized.
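The two regimes contrasted above can be made concrete with a minimal sketch (purely illustrative; the sizes and sparsity level are arbitrary assumptions): regime (1) magnitude-prunes a large trained weight matrix down to a small fraction of nonzeros, while regime (2) starts from a small dense matrix with roughly the same parameter budget.

```python
import numpy as np

def magnitude_prune(w, sparsity=0.9):
    """Zero out the `sparsity` fraction of weights with smallest magnitude."""
    thresh = np.quantile(np.abs(w), sparsity)
    return np.where(np.abs(w) > thresh, w, 0.0)

rng = np.random.default_rng(0)
big = rng.standard_normal((64, 64))     # stand-in for overparametrized weights
pruned = magnitude_prune(big, 0.9)      # regime (1): prune after training
small = rng.standard_normal((64, 6))    # regime (2): ~10% of params, kept dense
```

The paper's question is when the pruned large network (1) outperforms a cleverly initialized small network (2) of comparable size; this sketch only sets up the comparison, not the training experiments.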