1 code implementation • ECCV 2020 • Chaojian Li, Tianlong Chen, Haoran You, Zhangyang Wang, Yingyan Lin
There has been an explosive demand for bringing machine learning (ML) powered intelligence into numerous Internet-of-Things (IoT) devices.
no code implementations • 2 Jan 2024 • Zishen Wan, Che-Kai Liu, Hanchen Yang, Chaojian Li, Haoran You, Yonggan Fu, Cheng Wan, Tushar Krishna, Yingyan Lin, Arijit Raychowdhury
The remarkable advancements in artificial intelligence (AI), primarily driven by deep neural networks, have significantly impacted various aspects of our lives.
no code implementations • 24 Oct 2023 • Shunyao Zhang, Yonggan Fu, Shang Wu, Jyotikrishna Dass, Haoran You, Yingyan Lin
To this end, we propose a framework called NetDistiller to boost the achievable accuracy of TNNs by treating them as sub-networks of a weight-sharing teacher constructed by expanding the number of channels of the TNN.
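The core idea admits a compact PyTorch sketch: the TNN's layer is a slice of the teacher's channel-expanded weight tensor, so teacher and student gradients flow into the same shared parameters. The module name and expansion factor below are illustrative assumptions, not NetDistiller's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightSharingConv(nn.Module):
    """Teacher conv whose first `student_ch` output channels double as the
    TNN's conv; both modes update the same underlying weight tensor."""
    def __init__(self, in_ch, student_ch, expand=4, k=3):
        super().__init__()
        self.student_ch = student_ch
        # The teacher owns the full (channel-expanded) weight; the TNN is a slice.
        self.weight = nn.Parameter(0.02 * torch.randn(student_ch * expand, in_ch, k, k))

    def forward(self, x, mode="student"):
        w = self.weight[: self.student_ch] if mode == "student" else self.weight
        return F.conv2d(x, w, padding=1)

layer = WeightSharingConv(in_ch=8, student_ch=16)
x = torch.randn(1, 8, 32, 32)
print(layer(x, "student").shape)  # torch.Size([1, 16, 32, 32])
print(layer(x, "teacher").shape)  # torch.Size([1, 64, 32, 32])
```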
no code implementations • 23 Jun 2023 • Zhongzhi Yu, Yonggan Fu, Jiayi Yuan, Haoran You, Yingyan Lin
Tiny deep learning has attracted increasing attention driven by the substantial demand for deploying deep learning on numerous intelligent Internet-of-Things devices.
1 code implementation • NeurIPS 2023 • Haoran You, Huihong Shi, Yipin Guo, Yingyan Lin
To marry the best of both worlds, we further propose a new mixture-of-experts (MoE) framework that reparameterizes MLPs by taking multiplication or its primitives, e.g., multiplication and shift, as experts, and designs a new latency-aware load-balancing loss.
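A toy sketch of the mixture idea, assuming a two-expert router, a power-of-two "shift" stand-in, and a made-up latency-weighted load-balancing penalty (the real ShiftAddViT kernels and loss differ):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultShiftMoE(nn.Module):
    """Toy mixture of a multiplication expert and a shift expert."""
    def __init__(self, dim, latencies=(1.0, 0.4)):  # assumed relative latencies
        super().__init__()
        self.mult = nn.Linear(dim, dim)    # costly multiplication expert
        self.shift = nn.Linear(dim, dim)   # stand-in for a shift expert
        self.router = nn.Linear(dim, 2)
        self.latencies = torch.tensor(latencies)

    def forward(self, x):
        gate = F.softmax(self.router(x), dim=-1)  # per-token dispatch weights
        # Round the shift expert's weights to signed powers of two, so its
        # matmul could be realized with bit-shifts in hardware.
        w = self.shift.weight
        w_pow2 = torch.sign(w) * 2.0 ** torch.round(torch.log2(w.abs() + 1e-8))
        out = gate[..., :1] * self.mult(x) + gate[..., 1:] * F.linear(x, w_pow2, self.shift.bias)
        # Latency-aware load balancing: penalize the average routing mass per
        # expert, weighted by that expert's latency.
        load = gate.mean(dim=tuple(range(gate.dim() - 1)))
        return out, (load * self.latencies).sum()

y, lb_loss = MultShiftMoE(dim=16)(torch.randn(4, 10, 16))
print(y.shape, lb_loss.item())
```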
no code implementations • 24 Apr 2023 • Yonggan Fu, Zhifan Ye, Jiayi Yuan, Shunyao Zhang, Sixu Li, Haoran You, Yingyan Lin
Novel view synthesis is an essential functionality for enabling immersive experiences in various Augmented- and Virtual-Reality (AR/VR) applications, for which generalizable Neural Radiance Fields (NeRFs) have gained increasing popularity thanks to their cross-scene generalization capability.
1 code implementation • CVPR 2023 • Haoran You, Yunyang Xiong, Xiaoliang Dai, Bichen Wu, Peizhao Zhang, Haoqi Fan, Peter Vajda, Yingyan Lin
Vision Transformers (ViTs) have shown impressive performance but still incur a high computation cost compared to convolutional neural networks (CNNs); one reason is that ViTs' attention measures global similarities and thus has quadratic complexity in the number of input tokens.
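The complexity gap is easiest to see in code. Below, standard softmax attention builds an N x N score matrix, while a generic kernelized linear attention reorders the matmuls to avoid it; the ELU+1 feature map is a common stand-in used here for illustration, not Castling-ViT's linear-angular attention.

```python
import torch

def softmax_attention(q, k, v):
    """Standard attention: O(N^2 * d) because of the N x N score matrix."""
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5   # (N, N)
    return torch.softmax(scores, dim=-1) @ v

def linear_attention(q, k, v, eps=1e-6):
    """Kernelized linear attention: O(N * d^2), no N x N matrix is formed."""
    phi = lambda t: torch.nn.functional.elu(t) + 1.0
    q, k = phi(q), phi(k)
    kv = k.transpose(-2, -1) @ v                 # (d, d): compute K^T V first
    z = q @ k.sum(dim=-2, keepdim=True).transpose(-2, -1) + eps  # normalizer
    return (q @ kv) / z

q = k = v = torch.randn(128, 32)                 # 128 tokens, 32-dim
print(softmax_attention(q, k, v).shape, linear_attention(q, k, v).shape)
```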
2 code implementations • 24 Oct 2022 • Huihong Shi, Haoran You, Yang Zhao, Zhongfeng Wang, Yingyan Lin
Multiplication is arguably the most cost-dominant operation in modern deep neural networks (DNNs), limiting their achievable efficiency and thus their more extensive deployment in resource-constrained applications.
1 code implementation • 18 Oct 2022 • Haoran You, Zhanyi Sun, Huihong Shi, Zhongzhi Yu, Yang Zhao, Yongan Zhang, Chaojian Li, Baopu Li, Yingyan Lin
Specifically, on the algorithm level, ViTCoD prunes and polarizes the attention maps to have either denser or sparser fixed patterns, regularizing the two levels of workloads without hurting accuracy; this largely reduces the attention computations while leaving room for alleviating the remaining dominant data movements. On top of that, we further integrate a lightweight and learnable auto-encoder module that trades the dominant high-cost data movements for lower-cost computations.
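A minimal sketch of the polarization idea, with made-up thresholds: the most-attended columns are kept dense, and the remaining entries are pruned to a fixed per-row top-k pattern. This illustrates the dense/sparse split only; ViTCoD's actual pruning and reordering differ.

```python
import torch

def polarize_attention(attn, dense_frac=0.25, keep=4):
    """Keep the most-attended columns dense; prune the rest to a fixed
    per-row top-`keep` sparse pattern. Returns the masked map and the mask."""
    n = attn.shape[-1]
    col_score = attn.mean(dim=0)                      # importance per key/column
    dense_cols = col_score.topk(max(1, int(n * dense_frac))).indices
    mask = torch.zeros_like(attn, dtype=torch.bool)
    mask[:, dense_cols] = True                        # denser fixed pattern
    rest = attn.masked_fill(mask, float("-inf"))      # exclude dense columns
    mask.scatter_(1, rest.topk(keep, dim=-1).indices, True)  # sparser pattern
    return attn * mask, mask

attn = torch.rand(16, 16).softmax(dim=-1)
pruned, mask = polarize_attention(attn)
print(f"kept {mask.float().mean().item():.0%} of attention entries")
```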
1 code implementation • 8 Jul 2022 • Haoran You, Baopu Li, Zhanyi Sun, Xu Ouyang, Yingyan Lin
In this paper, we discover for the first time that both efficient DNNs and their lottery subnetworks (i.e., lottery tickets) can be directly identified from a supernet, which we term SuperTickets, via a two-in-one training scheme that jointly performs architecture search and parameter pruning.
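One hypothetical "two-in-one" update step, sketched below: an architecture-search step and global magnitude pruning happen in the same training pass, so tickets emerge during (not after) the search. The pruning ratio and the placeholder search step are illustrative assumptions.

```python
import torch
import torch.nn as nn

def two_in_one_step(supernet, prune_ratio=0.5):
    """One combined update: an architecture-search step would sample and
    score sub-networks here; we then prune the globally smallest weights."""
    flat = torch.cat([p.abs().flatten() for p in supernet.parameters()])
    threshold = flat.quantile(prune_ratio)
    with torch.no_grad():
        for p in supernet.parameters():
            p.mul_((p.abs() >= threshold).float())

net = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
two_in_one_step(net)
nonzero = sum(int((p != 0).sum()) for p in net.parameters())
total = sum(p.numel() for p in net.parameters())
print(f"remaining density: {nonzero / total:.0%}")
```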
1 code implementation • 17 May 2022 • Haoran You, Baopu Li, Huihong Shi, Yonggan Fu, Yingyan Lin
To this end, this work advocates hybrid NNs that consist of both powerful yet costly multiplications and efficient yet less powerful operators for marrying the best of both worlds, and proposes ShiftAddNAS, which can automatically search for more accurate and more efficient NNs.
no code implementations • 15 Mar 2022 • Zhongzhi Yu, Yonggan Fu, Shang Wu, Mengquan Li, Haoran You, Yingyan Lin
While existing works mostly fix the model precision during the whole training process, a few pioneering works have shown that dynamic precision schedules help DNNs converge to better accuracy at a lower training cost than their static-precision counterparts.
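A minimal sketch of a dynamic precision schedule, assuming a simple linear bit-width ramp and uniform symmetric fake quantization (the schedules studied in the paper are more sophisticated):

```python
import torch

def precision_schedule(epoch, total_epochs, low=4, high=8):
    """Hypothetical linear bit-width ramp: low precision early, high late."""
    frac = epoch / max(1, total_epochs - 1)
    return round(low + frac * (high - low))

def fake_quantize(x, bits):
    """Uniform symmetric fake quantization to `bits` bits."""
    scale = x.abs().max() / (2 ** (bits - 1) - 1)
    return torch.round(x / scale) * scale

w = torch.randn(64, 64)
for epoch in (0, 5, 9):
    bits = precision_schedule(epoch, total_epochs=10)
    print(f"epoch {epoch}: {bits}-bit, {fake_quantize(w, bits).unique().numel()} levels used")
```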
no code implementations • 7 Mar 2022 • Tong Geng, Chunshu Wu, Yongan Zhang, Cheng Tan, Chenhao Xie, Haoran You, Martin C. Herbordt, Yingyan Lin, Ang Li
In this paper, we propose a novel hardware accelerator for GCN inference, called I-GCN, which significantly improves data locality and reduces unnecessary computation.
1 code implementation • 22 Dec 2021 • Haoran You, Tong Geng, Yongan Zhang, Ang Li, Yingyan Lin
This is because real-world graphs can be extremely large and sparse.
no code implementations • 29 Sep 2021 • Chaojian Li, Xu Ouyang, Yang Zhao, Haoran You, Yonggan Fu, Yuchen Gu, Haonan Liu, Siyuan Miao, Yingyan Lin
Graph Convolutional Networks (GCNs) have gained increasing attention thanks to their state-of-the-art (SOTA) performance in graph-based learning tasks.
no code implementations • 18 Sep 2021 • Yongan Zhang, Haoran You, Yonggan Fu, Tong Geng, Ang Li, Yingyan Lin
While end-to-end jointly optimizing GNNs and their accelerators is promising in boosting GNNs' inference efficiency and expediting the design process, it is still underexplored due to the vast and distinct design spaces of GNNs and their accelerators.
1 code implementation • 19 Mar 2021 • Chaojian Li, Zhongzhi Yu, Yonggan Fu, Yongan Zhang, Yang Zhao, Haoran You, Qixuan Yu, Yue Wang, Yingyan Lin
To design HW-NAS-Bench, we carefully collected the measured/estimated hardware performance of all the networks in the search spaces of both NAS-Bench-201 and FBNet, on six hardware devices that fall into three categories (i.e., commercial edge devices, FPGA, and ASIC).
Tasks: Hardware-Aware Neural Architecture Search · Neural Architecture Search
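In use, HW-NAS-Bench acts as a lookup table from an architecture index to its hardware costs. The API names below follow the project's README at the time of writing; treat them as an assumption and verify against the repository.

```python
# Hedged sketch: verify names against https://github.com/GATECH-EIC/HW-NAS-Bench.
from hw_nas_bench_api import HWNASBenchAPI as HWAPI

hw_api = HWAPI("HW-NAS-Bench-v1_0.pickle", search_space="nasbench201")
# Hardware cost metrics (latency, energy, ...) of architecture #100 on CIFAR-10.
metrics = hw_api.query_by_index(100, "cifar10")
print(metrics)
```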
2 code implementations • 1 Mar 2021 • Haoran You, Zhihan Lu, Zijian Zhou, Yonggan Fu, Yingyan Lin
Experiments on various GCN models and datasets consistently validate our GEB finding and the effectiveness of our GEBT; e.g., our GEBT achieves up to 80.2%~85.6% and 84.6%~87.5% savings of GCN training and inference costs, respectively, while offering comparable or even better accuracy compared to state-of-the-art methods.
no code implementations • 7 Jan 2021 • Haoran You, Randall Balestriero, Zhihan Lu, Yutong Kou, Huihong Shi, Shunyao Zhang, Shang Wu, Yingyan Lin, Richard Baraniuk
In this paper, we study the importance of pruning in Deep Networks (DNs) and the yin & yang relationship between (1) pruning highly overparametrized DNs that have been trained from random initialization and (2) training small DNs that have been "cleverly" initialized.
1 code implementation • 4 Jan 2021 • Xiaohan Chen, Yang Zhao, Yue Wang, Pengfei Xu, Haoran You, Chaojian Li, Yonggan Fu, Yingyan Lin, Zhangyang Wang
Results show that: 1) applied to inference, SD achieves up to 2.44x energy efficiency, as evaluated via real hardware implementations; 2) applied to training, SD leads to 10.56x and 4.48x reductions in storage and training energy, respectively, with negligible accuracy loss compared to state-of-the-art training baselines.
no code implementations • ICLR 2021 • Chaojian Li, Zhongzhi Yu, Yonggan Fu, Yongan Zhang, Yang Zhao, Haoran You, Qixuan Yu, Yue Wang, Cong Hao, Yingyan Lin
To design HW-NAS-Bench, we carefully collected the measured/estimated hardware performance (e.g., energy cost and latency) of all the networks in the search space of both NAS-Bench-201 and FBNet, considering six hardware devices that fall into three categories (i.e., commercial edge devices, FPGA, and ASIC).
Tasks: Hardware-Aware Neural Architecture Search · Neural Architecture Search
no code implementations • 1 Jan 2021 • Yonggan Fu, Yongan Zhang, Haoran You, Yingyan Lin
First, jointly searching for a network and its precision via differentiable search poses a dilemma: either the memory consumption explodes or the resulting designs are sub-optimal.
1 code implementation • NeurIPS 2020 • Yonggan Fu, Haoran You, Yang Zhao, Yue Wang, Chaojian Li, Kailash Gopalakrishnan, Zhangyang Wang, Yingyan Lin
Recent breakthroughs in deep neural networks (DNNs) have fueled a tremendous demand for intelligent edge devices featuring on-site learning, while the practical realization of such systems remains a challenge due to the limited resources available at the edge and the required massive training costs for state-of-the-art (SOTA) DNNs.
no code implementations • 28 Oct 2020 • Yongan Zhang, Yonggan Fu, Weiwen Jiang, Chaojian Li, Haoran You, Meng Li, Vikas Chandra, Yingyan Lin
Powerful yet complex deep neural networks (DNNs) have fueled a booming demand for efficient DNN solutions to bring DNN-powered intelligence into numerous applications.
1 code implementation • NeurIPS 2020 • Haoran You, Xiaohan Chen, Yongan Zhang, Chaojian Li, Sicheng Li, Zihao Liu, Zhangyang Wang, Yingyan Lin
Multiplication (e.g., convolution) is arguably a cornerstone of modern deep neural networks (DNNs).
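ShiftAddNet replaces multiplications with bit-shifts and additions. A minimal sketch of the two operator types, assuming power-of-two weight rounding for the shift layer and an adder-net-style negative L1 distance for the add layer (the idea only, not the paper's CUDA kernels):

```python
import torch

def shift_layer(x, w):
    """'Shift' op: round weights to signed powers of two, so each multiply
    becomes a bit-shift in fixed-point hardware."""
    exp = torch.round(torch.log2(w.abs().clamp_min(1e-8)))
    return x @ (torch.sign(w) * 2.0 ** exp)

def add_layer(x, w):
    """'Add' op (adder-net style): negative L1 distance replaces the dot product."""
    return -(x.unsqueeze(-1) - w.unsqueeze(0)).abs().sum(dim=1)

x, w = torch.randn(5, 8), torch.randn(8, 3)
print(shift_layer(x, w).shape, add_layer(x, w).shape)  # both (5, 3)
```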
no code implementations • 7 May 2020 • Yang Zhao, Xiaohan Chen, Yue Wang, Chaojian Li, Haoran You, Yonggan Fu, Yuan Xie, Zhangyang Wang, Yingyan Lin
We present SmartExchange, an algorithm-hardware co-design framework to trade higher-cost memory storage/access for lower-cost computation, for energy-efficient inference of deep neural networks (DNNs).
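A toy sketch of the storage-for-computation trade: weights are stored as a tiny dense basis plus a sparse power-of-two coefficient matrix and rebuilt on the fly with shifts and adds. The shapes and sparsity below are illustrative, not the paper's exact decomposition.

```python
import torch

basis = torch.randn(4, 16)                 # tiny dense basis (stored)
coeff = torch.zeros(64, 4)                 # sparse power-of-two coeffs (stored)
rows = torch.arange(64)
coeff[rows, torch.randint(0, 4, (64,))] = 2.0 ** torch.randint(-3, 3, (64,)).float()

w = coeff @ basis                          # 64 x 16 weights rebuilt on the fly;
                                           # power-of-two coeffs -> shifts + adds
dense = w.numel()
stored = basis.numel() + 2 * int((coeff != 0).sum())   # value + index per nonzero
print(f"dense params: {dense}, stored values (approx): {stored}")
```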
1 code implementation • ICLR 2020 • Haoran You, Chaojian Li, Pengfei Xu, Yonggan Fu, Yue Wang, Xiaohan Chen, Richard G. Baraniuk, Zhangyang Wang, Yingyan Lin
Finally, we leverage the existence of EB tickets and the proposed mask distance to develop efficient training methods: we first identify EB tickets via low-cost schemes, and then continue to train only the EB tickets toward the target accuracy.
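The mask-distance check admits a short sketch: compute a global magnitude-pruning mask each epoch and measure the Hamming distance between consecutive masks; once it stabilizes below a small threshold, the EB ticket has emerged. The thresholds below are illustrative.

```python
import torch

def pruning_mask(weights, ratio):
    """Global magnitude-pruning mask at a given pruning ratio."""
    flat = torch.cat([w.abs().flatten() for w in weights])
    thresh = flat.quantile(ratio)
    return torch.cat([(w.abs() >= thresh).flatten() for w in weights])

def mask_distance(m1, m2):
    """Normalized Hamming distance between two pruning masks."""
    return (m1 != m2).float().mean().item()

# Hypothetical usage: when the distance between consecutive epochs' masks
# stays below a small threshold (e.g., 0.1), stop and train only the ticket.
w_t = [torch.randn(64, 64)]
w_t1 = [w + 0.01 * torch.randn_like(w) for w in w_t]
print("mask distance:", mask_distance(pruning_mask(w_t, 0.7), pruning_mask(w_t1, 0.7)))
```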
2 code implementations • 26 Sep 2019 • Haoran You, Chaojian Li, Pengfei Xu, Yonggan Fu, Yue Wang, Xiaohan Chen, Richard G. Baraniuk, Zhangyang Wang, Yingyan Lin
In this paper, we discover for the first time that the winning tickets can be identified at the very early training stage, which we term early-bird (EB) tickets, via low-cost training schemes (e.g., early stopping and low-precision training) at large learning rates.
1 code implementation • 19 Nov 2018 • Haoran You, Yu Cheng, Tianheng Cheng, Chunliang Li, Pan Zhou
We evaluate the proposed Bayesian CycleGAN on multiple benchmark datasets, including Cityscapes, Maps, and Monet2photo.