no code implementations • 29 Mar 2024 • Haikuo Shao, Huihong Shi, Wendong Mao, Zhongfeng Wang
Vision Transformers (ViTs) have achieved significant success in computer vision.
no code implementations • 17 Dec 2023 • Siyu Zhang, Wendong Mao, Huihong Shi, Zhongfeng Wang
Video compression is widely used in digital television, surveillance systems, and virtual reality.
1 code implementation • 16 Aug 2023 • Minghao She, Wendong Mao, Huihong Shi, Zhongfeng Wang
In this paper, we propose a double-win framework for ideal and blind SR tasks, named S2R, comprising a lightweight transformer-based SR model (the S2R transformer) and a novel coarse-to-fine training strategy, which achieves excellent visual results under both ideal and random fuzzy conditions.
1 code implementation • NeurIPS 2023 • Haoran You, Huihong Shi, Yipin Guo, Yingyan Lin
To marry the best of both worlds, we further propose a new mixture-of-experts (MoE) framework that reparameterizes MLPs by taking multiplication or its primitives, e.g., multiplication and shift, as experts, and design a new latency-aware load-balancing loss.
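The MoE idea above can be illustrated with a minimal sketch (not the paper's code; all names and latency numbers here are illustrative assumptions): each output channel of a linear layer is routed either to a full-multiplication expert or to a cheaper power-of-two shift expert, and a toy latency-aware penalty discourages over-use of the expensive expert.

```python
import numpy as np

def shift_quantize(w):
    """Round weights to signed powers of two, so a multiply becomes a bit-shift."""
    sign = np.sign(w)
    mag = np.abs(w)
    exp = np.round(np.log2(np.clip(mag, 1e-8, None)))
    return sign * (2.0 ** exp)

def moe_linear(x, w, gate):
    """gate[j] in {0, 1}: 1 -> multiplication expert, 0 -> shift expert."""
    w_mixed = gate * w + (1 - gate) * shift_quantize(w)
    return x @ w_mixed

def load_balance_loss(gate, latency_mult=1.0, latency_shift=0.3):
    """Toy latency-aware load-balancing penalty: expected per-channel latency.
    The latency constants are made-up placeholders, not measured values."""
    frac_mult = gate.mean()
    return frac_mult * latency_mult + (1 - frac_mult) * latency_shift

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
w = rng.standard_normal((8, 16))
gate = (rng.random(16) < 0.5).astype(float)  # hard routing for illustration
y = moe_linear(x, w, gate)
```

In the actual framework the routing would be learned jointly with the weights; the hard random gate here only shows the reparameterization structure.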
1 code implementation • 9 Nov 2022 • Jyotikrishna Dass, Shang Wu, Huihong Shi, Chaojian Li, Zhifan Ye, Zhongfeng Wang, Yingyan Lin
Unlike sparsity-based Transformer accelerators for NLP, ViTALiTy unifies both low-rank and sparse components of the attention in ViTs.
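The low-rank-plus-sparse structure that ViTALiTy exploits can be sketched as follows (an illustration of the decomposition, not the accelerator's actual kernel): approximate a softmax attention map with a low-rank component via truncated SVD, then keep only the largest entries of the residual as a sparse component.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def low_rank_plus_sparse(A, rank=2, keep=0.1):
    """Split attention map A into a rank-`rank` part L and a sparse residual S."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    L = (U[:, :rank] * s[:rank]) @ Vt[:rank]          # low-rank component
    R = A - L                                         # residual
    thresh = np.quantile(np.abs(R), 1 - keep)         # keep largest ~10% of entries
    S = np.where(np.abs(R) >= thresh, R, 0.0)         # sparse component
    return L, S

rng = np.random.default_rng(0)
q = rng.standard_normal((16, 8))
k = rng.standard_normal((16, 8))
A = softmax(q @ k.T / np.sqrt(8))                     # toy single-head attention
L, S = low_rank_plus_sparse(A)
```

Adding the sparse residual on top of the low-rank approximation tightens the reconstruction, which is the property a unified low-rank/sparse datapath can exploit.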
2 code implementations • 24 Oct 2022 • Huihong Shi, Haoran You, Yang Zhao, Zhongfeng Wang, Yingyan Lin
Multiplication is arguably the most cost-dominant operation in modern deep neural networks (DNNs), limiting their achievable efficiency and thus more extensive deployment in resource-constrained applications.
1 code implementation • 18 Oct 2022 • Haoran You, Zhanyi Sun, Huihong Shi, Zhongzhi Yu, Yang Zhao, Yongan Zhang, Chaojian Li, Baopu Li, Yingyan Lin
Specifically, on the algorithm level, ViTCoD prunes and polarizes the attention maps into either denser or sparser fixed patterns, regularizing two levels of workloads without hurting accuracy; this largely reduces the attention computations while leaving room to alleviate the remaining dominant data movements. On top of that, we integrate a lightweight, learnable auto-encoder module that trades the dominant high-cost data movements for lower-cost computations.
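The prune-and-polarize step can be sketched in a few lines (a simplified illustration, not ViTCoD's algorithm; the threshold and grouping rule are assumptions): prune attention scores below a global threshold, then partition rows into a denser and a sparser group so each group presents a more regular workload.

```python
import numpy as np

def polarize(attn, keep=0.3):
    """Prune to ~`keep` density, then split rows into denser/sparser groups."""
    thresh = np.quantile(attn, 1 - keep)          # global pruning threshold
    mask = attn >= thresh                         # pruned binary attention pattern
    density = mask.mean(axis=-1)                  # per-row (per-query) density
    med = np.median(density)
    denser = np.where(density >= med)[0]          # rows handled as dense workload
    sparser = np.where(density < med)[0]          # rows handled as sparse workload
    return mask, denser, sparser

rng = np.random.default_rng(0)
attn = rng.random((8, 8))                         # toy attention scores
mask, denser, sparser = polarize(attn)
```

In the paper this polarization is done during training so accuracy is preserved; the post-hoc thresholding here only shows the resulting two-workload structure.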
1 code implementation • 17 May 2022 • Haoran You, Baopu Li, Huihong Shi, Yonggan Fu, Yingyan Lin
To this end, this work advocates hybrid NNs that combine powerful yet costly multiplications with efficient yet less powerful operators to marry the best of both worlds, and proposes ShiftAddNAS, which can automatically search for more accurate and more efficient NNs.
no code implementations • 7 Jan 2021 • Haoran You, Randall Balestriero, Zhihan Lu, Yutong Kou, Huihong Shi, Shunyao Zhang, Shang Wu, Yingyan Lin, Richard Baraniuk
In this paper, we study the importance of pruning in Deep Networks (DNs) and the yin & yang relationship between (1) pruning highly overparametrized DNs that have been trained from random initialization and (2) training small DNs that have been "cleverly" initialized.
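The two regimes contrasted above can be made concrete with a minimal sketch (purely illustrative; the sizes and sparsity level are arbitrary assumptions): regime (1) magnitude-prunes a large trained weight matrix down to a small fraction of nonzeros, while regime (2) starts from a small dense matrix with roughly the same parameter budget.

```python
import numpy as np

def magnitude_prune(w, sparsity=0.9):
    """Zero out the `sparsity` fraction of weights with smallest magnitude."""
    thresh = np.quantile(np.abs(w), sparsity)
    return np.where(np.abs(w) > thresh, w, 0.0)

rng = np.random.default_rng(0)
big = rng.standard_normal((64, 64))     # stand-in for overparametrized weights
pruned = magnitude_prune(big, 0.9)      # regime (1): prune after training
small = rng.standard_normal((64, 6))    # regime (2): ~10% of params, kept dense
```

The paper's question is when the pruned large network (1) outperforms a cleverly initialized small network (2) of comparable size; this sketch only sets up the comparison, not the training experiments.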