Search Results for author: Haoran You

Found 29 papers, 16 papers with code

HALO: Hardware-Aware Learning to Optimize

1 code implementation ECCV 2020 Chaojian Li, Tianlong Chen, Haoran You, Zhangyang Wang, Yingyan Lin

There has been an explosive demand for bringing machine learning (ML) powered intelligence into numerous Internet-of-Things (IoT) devices.

Towards Cognitive AI Systems: a Survey and Prospective on Neuro-Symbolic AI

no code implementations 2 Jan 2024 Zishen Wan, Che-Kai Liu, Hanchen Yang, Chaojian Li, Haoran You, Yonggan Fu, Cheng Wan, Tushar Krishna, Yingyan Lin, Arijit Raychowdhury

The remarkable advancements in artificial intelligence (AI), primarily driven by deep neural networks, have significantly impacted various aspects of our lives.

NetDistiller: Empowering Tiny Deep Learning via In-Situ Distillation

no code implementations 24 Oct 2023 Shunyao Zhang, Yonggan Fu, Shang Wu, Jyotikrishna Dass, Haoran You, Yingyan Lin

To this end, we propose a framework called NetDistiller to boost the achievable accuracy of TNNs by treating them as sub-networks of a weight-sharing teacher constructed by expanding the number of channels of the TNN.
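The in-situ distillation idea above can be pictured as one layer whose tiny student slice shares weights with an expanded teacher. Below is a minimal PyTorch sketch under assumed channel counts and an illustrative feature-matching loss; it is not the paper's exact construction.

```python
# Minimal sketch of weight-sharing between a tiny student and an expanded
# teacher, in the spirit of NetDistiller's in-situ distillation. Channel
# counts, layer shapes, and the loss are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedConv(nn.Module):
    """One conv layer: the teacher uses all channels, while the student
    reuses (shares) the first `student_out` output channels."""
    def __init__(self, in_ch, teacher_out, student_out):
        super().__init__()
        self.teacher = nn.Conv2d(in_ch, teacher_out, 3, padding=1)
        self.student_out = student_out  # student is a slice of the teacher

    def forward(self, x, mode="student"):
        if mode == "teacher":
            return self.teacher(x)
        # Student path: slice the shared weights instead of owning its own.
        w = self.teacher.weight[: self.student_out]
        b = self.teacher.bias[: self.student_out]
        return F.conv2d(x, w, b, padding=1)

layer = SharedConv(in_ch=16, teacher_out=64, student_out=16)
x = torch.randn(2, 16, 8, 8)
t_feat = layer(x, mode="teacher")   # expanded-channel teacher view
s_feat = layer(x, mode="student")   # tiny-network view, shares weights
# A distillation-style loss can align the student features with the
# teacher's corresponding channels (illustrative, not the paper's loss).
distill_loss = F.mse_loss(s_feat, t_feat[:, :16].detach())
```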

NetBooster: Empowering Tiny Deep Learning By Standing on the Shoulders of Deep Giants

no code implementations 23 Jun 2023 Zhongzhi Yu, Yonggan Fu, Jiayi Yuan, Haoran You, Yingyan Lin

Tiny deep learning has attracted increasing attention driven by the substantial demand for deploying deep learning on numerous intelligent Internet-of-Things devices.

ShiftAddViT: Mixture of Multiplication Primitives Towards Efficient Vision Transformer

1 code implementation NeurIPS 2023 Haoran You, Huihong Shi, Yipin Guo, Yingyan Lin

To marry the best of both worlds, we further propose a new mixture of experts (MoE) framework to reparameterize MLPs by taking multiplication or its primitives as experts, e.g., multiplication and shift, and designing a new latency-aware load-balancing loss.

Efficient ViTs
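The MoE reparameterization described in the ShiftAddViT entry above can be illustrated with a toy two-expert layer: one ordinary multiplication expert and one cheaper expert whose weights are rounded to signed powers of two (simulating shifts), plus a latency-weighted balancing term. The routing, latency costs, and loss weighting here are placeholders, not the paper's design.

```python
# Toy sketch of routing tokens to a "multiplication" expert or a cheaper
# "shift" expert. The shift expert only *simulates* shifts by rounding
# weights to signed powers of two; latency numbers are made-up placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ShiftLinear(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(dim, dim) * 0.02)

    def forward(self, x):
        w = self.weight
        # Round each weight to the nearest signed power of two (simulated shift).
        pow2 = torch.sign(w) * 2.0 ** torch.round(torch.log2(w.abs().clamp_min(1e-8)))
        return x @ pow2.t()

class TwoExpertMLP(nn.Module):
    def __init__(self, dim, mult_latency=1.0, shift_latency=0.3):
        super().__init__()
        self.router = nn.Linear(dim, 2)            # per-token gate
        self.mult_expert = nn.Linear(dim, dim)     # ordinary multiplication
        self.shift_expert = ShiftLinear(dim)       # cheaper shift primitive
        self.latency = torch.tensor([mult_latency, shift_latency])

    def forward(self, x):                          # x: (tokens, dim)
        gate = F.softmax(self.router(x), dim=-1)   # (tokens, 2)
        out = gate[:, :1] * self.mult_expert(x) + gate[:, 1:] * self.shift_expert(x)
        # Latency-aware balancing: penalize routing mass weighted by expert cost.
        balance_loss = (gate.mean(0) * self.latency).sum()
        return out, balance_loss

x = torch.randn(8, 64)
y, aux = TwoExpertMLP(64)(x)
```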

Gen-NeRF: Efficient and Generalizable Neural Radiance Fields via Algorithm-Hardware Co-Design

no code implementations 24 Apr 2023 Yonggan Fu, Zhifan Ye, Jiayi Yuan, Shunyao Zhang, Sixu Li, Haoran You, Yingyan Lin

Novel view synthesis is an essential functionality for enabling immersive experiences in various Augmented- and Virtual-Reality (AR/VR) applications, for which generalizable Neural Radiance Fields (NeRFs) have gained increasing popularity thanks to their cross-scene generalization capability.

Generalizable Novel View Synthesis, Novel View Synthesis

Castling-ViT: Compressing Self-Attention via Switching Towards Linear-Angular Attention at Vision Transformer Inference

1 code implementation CVPR 2023 Haoran You, Yunyang Xiong, Xiaoliang Dai, Bichen Wu, Peizhao Zhang, Haoqi Fan, Peter Vajda, Yingyan Lin

Vision Transformers (ViTs) have shown impressive performance but still require a high computation cost compared to convolutional neural networks (CNNs); one reason is that ViTs' attention measures global similarities and thus has quadratic complexity in the number of input tokens.

Efficient ViTs
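To make the complexity point above concrete, the sketch below contrasts standard softmax attention, which materializes an N x N map, with a generic kernelized linear attention that avoids it. It uses a common ELU+1 feature map purely for illustration; it is not the paper's linear-angular attention.

```python
# Why attention is quadratic in the token count, and how a linearized
# variant sidesteps it. Generic kernelized linear attention for illustration.
import torch
import torch.nn.functional as F

def softmax_attention(q, k, v):
    # (N, d) x (d, N) -> (N, N): cost grows quadratically with tokens N.
    attn = F.softmax(q @ k.t() / q.shape[-1] ** 0.5, dim=-1)
    return attn @ v

def linear_attention(q, k, v, feature=lambda t: F.elu(t) + 1):
    # Associate the other way: phi(K)^T V is (d, d), so cost is linear in N.
    q, k = feature(q), feature(k)
    kv = k.t() @ v                                   # (d, d)
    normalizer = q @ k.sum(dim=0, keepdim=True).t()  # (N, 1)
    return (q @ kv) / (normalizer + 1e-6)

n, d = 1024, 64
q, k, v = (torch.randn(n, d) for _ in range(3))
out_quadratic = softmax_attention(q, k, v)   # materializes an N x N map
out_linear = linear_attention(q, k, v)       # never forms the N x N map
```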

NASA: Neural Architecture Search and Acceleration for Hardware Inspired Hybrid Networks

2 code implementations 24 Oct 2022 Huihong Shi, Haoran You, Yang Zhao, Zhongfeng Wang, Yingyan Lin

Multiplication is arguably the most cost-dominant operation in modern deep neural networks (DNNs), limiting their achievable efficiency and thus more extensive deployment in resource-constrained applications.

Neural Architecture Search

ViTCoD: Vision Transformer Acceleration via Dedicated Algorithm and Accelerator Co-Design

1 code implementation 18 Oct 2022 Haoran You, Zhanyi Sun, Huihong Shi, Zhongzhi Yu, Yang Zhao, Yongan Zhang, Chaojian Li, Baopu Li, Yingyan Lin

Specifically, on the algorithm level, ViTCoD prunes and polarizes the attention maps to have either denser or sparser fixed patterns, regularizing two levels of workloads without hurting accuracy and largely reducing the attention computations while leaving room for alleviating the remaining dominant data movements. On top of that, we further integrate a lightweight and learnable auto-encoder module to enable trading the dominant high-cost data movements for lower-cost computations.
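A toy sketch of the pruning-and-polarizing step described above: a few heavily attended token columns are kept dense, and the rest are reduced to a fixed top-k sparse pattern. The split ratio and thresholds are arbitrary assumptions rather than the paper's procedure.

```python
# Toy illustration of "polarizing" an attention map into a dense region plus
# a fixed sparse region. Ratios and thresholds are illustrative only.
import torch

def polarize(attn, dense_ratio=0.2, keep_per_row=4):
    """attn: (N, N) attention map. Returns a mask keeping a few 'dense'
    columns entirely and only the top-k entries in the remaining columns."""
    n = attn.shape[0]
    col_mass = attn.sum(dim=0)
    n_dense = max(1, int(dense_ratio * n))
    dense_cols = col_mass.topk(n_dense).indices          # heavily attended tokens
    mask = torch.zeros_like(attn, dtype=torch.bool)
    mask[:, dense_cols] = True                           # denser fixed pattern
    sparse_part = attn.clone()
    sparse_part[:, dense_cols] = float("-inf")
    topk_idx = sparse_part.topk(keep_per_row, dim=-1).indices
    mask.scatter_(1, topk_idx, torch.ones_like(topk_idx, dtype=torch.bool))
    return mask                                          # sparser fixed pattern

attn = torch.rand(16, 16).softmax(dim=-1)
pattern = polarize(attn)
pruned_attn = attn * pattern   # regularized workload: dense + structured sparse
```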

SuperTickets: Drawing Task-Agnostic Lottery Tickets from Supernets via Jointly Architecture Searching and Parameter Pruning

1 code implementation 8 Jul 2022 Haoran You, Baopu Li, Zhanyi Sun, Xu Ouyang, Yingyan Lin

In this paper, we discover for the first time that both efficient DNNs and their lottery subnetworks (i.e., lottery tickets) can be directly identified from a supernet, which we term SuperTickets, via a two-in-one training scheme that jointly performs architecture searching and parameter pruning.

Neural Architecture Search
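The two-in-one scheme described in the SuperTickets entry above can be caricatured as alternating gradient updates of architecture logits with magnitude pruning of the shared weights, so that an architecture and its sparse ticket emerge from one supernet. The sketch below is deliberately tiny; the search space, pruning schedule, and ratios are placeholders, not the paper's settings.

```python
# Tiny sketch: two candidate ops weighted by architecture logits, with
# weights pruned in place during training. Schedule and ratios are made up.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinySupernet(nn.Module):
    def __init__(self, dim=32):
        super().__init__()
        self.ops = nn.ModuleList([nn.Linear(dim, dim) for _ in range(2)])
        self.alpha = nn.Parameter(torch.zeros(2))          # architecture logits
        self.register_buffer("masks", torch.ones(2, dim, dim))

    def forward(self, x):
        gate = F.softmax(self.alpha, dim=0)
        out = 0.0
        for g, op, m in zip(gate, self.ops, self.masks):
            out = out + g * F.linear(x, op.weight * m, op.bias)
        return out

    @torch.no_grad()
    def prune(self, ratio=0.1):
        # Remove the smallest-magnitude surviving weights of each op.
        for op, m in zip(self.ops, self.masks):
            alive = (op.weight.abs() * m).flatten()
            k = int(ratio * alive.numel())
            if k:
                thresh = alive[alive > 0].kthvalue(k).values
                m.mul_((op.weight.abs() > thresh).float())

net = TinySupernet()
opt = torch.optim.SGD(net.parameters(), lr=0.1)
for step in range(3):                        # alternate training and pruning
    x, y = torch.randn(16, 32), torch.randn(16, 32)
    loss = F.mse_loss(net(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    net.prune(ratio=0.1)
```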

ShiftAddNAS: Hardware-Inspired Search for More Accurate and Efficient Neural Networks

1 code implementation 17 May 2022 Haoran You, Baopu Li, Huihong Shi, Yonggan Fu, Yingyan Lin

To this end, this work advocates hybrid NNs that consist of both powerful yet costly multiplications and efficient yet less powerful operators for marrying the best of both worlds, and proposes ShiftAddNAS, which can automatically search for more accurate and more efficient NNs.
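Besides the shift primitive sketched under the ShiftAddViT entry above, such hybrid networks also use a multiplication-free "add" operator. The sketch below uses the common AdderNet-style negative-L1 formulation as an illustration; the paper's exact operator definitions may differ.

```python
# Sketch of a multiplication-free "add" primitive that hybrid shift/add
# networks pair with ordinary multiplications. AdderNet-style L1 form,
# shown for illustration only.
import torch

def add_linear(x, weight):
    """Adder-style layer: negative L1 distance between each input row and
    each weight row, replacing the usual dot product."""
    # x: (batch, in_dim), weight: (out_dim, in_dim) -> (batch, out_dim)
    return -(x.unsqueeze(1) - weight.unsqueeze(0)).abs().sum(dim=-1)

x = torch.randn(4, 16)
w = torch.randn(8, 16)
y = add_linear(x, w)   # built from additions/subtractions only
# A hardware-inspired search space can then pick, per layer, among
# {multiplication, shift, add} to trade accuracy for efficiency.
```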

LDP: Learnable Dynamic Precision for Efficient Deep Neural Network Training and Inference

no code implementations 15 Mar 2022 Zhongzhi Yu, Yonggan Fu, Shang Wu, Mengquan Li, Haoran You, Yingyan Lin

While existing works mostly fix the model precision during the whole training process, a few pioneering works have shown that dynamic precision schedules help DNNs converge to a better accuracy while leading to a lower training cost than their static precision training counterparts.
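A dynamic precision schedule, as discussed above, simply varies the quantization bit-width over training instead of fixing it. The sketch below uses an assumed linear 4-to-8-bit ramp and a generic uniform fake quantizer; it is not the paper's learnable policy.

```python
# Minimal sketch of training with a dynamic precision schedule rather than
# one fixed bit-width. Schedule shape and quantizer are assumptions.
import torch

def precision_at(step, total_steps, low_bits=4, high_bits=8):
    """Linearly ramp the training precision from low_bits to high_bits."""
    frac = step / max(1, total_steps - 1)
    return int(round(low_bits + frac * (high_bits - low_bits)))

def fake_quantize(x, bits):
    """Uniform symmetric fake quantization to `bits` bits."""
    qmax = 2 ** (bits - 1) - 1
    scale = x.abs().max().clamp_min(1e-8) / qmax
    return torch.round(x / scale).clamp(-qmax, qmax) * scale

w = torch.randn(64, 64)
for step in range(0, 100, 25):
    bits = precision_at(step, total_steps=100)
    w_q = fake_quantize(w, bits)           # forward pass would use w_q
    print(step, bits, (w - w_q).abs().mean().item())
```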

I-GCN: A Graph Convolutional Network Accelerator with Runtime Locality Enhancement through Islandization

no code implementations 7 Mar 2022 Tong Geng, Chunshu Wu, Yongan Zhang, Cheng Tan, Chenhao Xie, Haoran You, Martin C. Herbordt, Yingyan Lin, Ang Li

In this paper we propose a novel hardware accelerator for GCN inference, called I-GCN, that significantly improves data locality and reduces unnecessary computation.

D$^2$-GCN: Data-Dependent GCNs for Boosting Both Efficiency and Scalability

no code implementations 29 Sep 2021 Chaojian Li, Xu Ouyang, Yang Zhao, Haoran You, Yonggan Fu, Yuchen Gu, Haonan Liu, Siyuan Miao, Yingyan Lin

Graph Convolutional Networks (GCNs) have gained increasing attention thanks to their state-of-the-art (SOTA) performance in graph-based learning tasks.

G-CoS: GNN-Accelerator Co-Search Towards Both Better Accuracy and Efficiency

no code implementations 18 Sep 2021 Yongan Zhang, Haoran You, Yonggan Fu, Tong Geng, Ang Li, Yingyan Lin

While end-to-end jointly optimizing GNNs and their accelerators is promising in boosting GNNs' inference efficiency and expediting the design process, it is still underexplored due to the vast and distinct design spaces of GNNs and their accelerators.

HW-NAS-Bench: Hardware-Aware Neural Architecture Search Benchmark

1 code implementation 19 Mar 2021 Chaojian Li, Zhongzhi Yu, Yonggan Fu, Yongan Zhang, Yang Zhao, Haoran You, Qixuan Yu, Yue Wang, Yingyan Lin

To design HW-NAS-Bench, we carefully collected the measured/estimated hardware performance of all the networks in the search spaces of both NAS-Bench-201 and FBNet, on six hardware devices that fall into three categories (i.e., commercial edge devices, FPGA, and ASIC).

Hardware Aware Neural Architecture Search, Neural Architecture Search

Early-Bird GCNs: Graph-Network Co-Optimization Towards More Efficient GCN Training and Inference via Drawing Early-Bird Lottery Tickets

2 code implementations 1 Mar 2021 Haoran You, Zhihan Lu, Zijian Zhou, Yonggan Fu, Yingyan Lin

Experiments on various GCN models and datasets consistently validate our GEB finding and the effectiveness of our GEBT, e.g., our GEBT achieves up to 80.2% ~ 85.6% and 84.6% ~ 87.5% savings of GCN training and inference costs while offering a comparable or even better accuracy as compared to state-of-the-art methods.

Representation Learning

Max-Affine Spline Insights Into Deep Network Pruning

no code implementations 7 Jan 2021 Haoran You, Randall Balestriero, Zhihan Lu, Yutong Kou, Huihong Shi, Shunyao Zhang, Shang Wu, Yingyan Lin, Richard Baraniuk

In this paper, we study the importance of pruning in Deep Networks (DNs) and the yin & yang relationship between (1) pruning highly overparametrized DNs that have been trained from random initialization and (2) training small DNs that have been "cleverly" initialized.

Network Pruning

SmartDeal: Re-Modeling Deep Network Weights for Efficient Inference and Training

1 code implementation 4 Jan 2021 Xiaohan Chen, Yang Zhao, Yue Wang, Pengfei Xu, Haoran You, Chaojian Li, Yonggan Fu, Yingyan Lin, Zhangyang Wang

Results show that: 1) applied to inference, SD achieves up to 2.44x energy efficiency as evaluated via real hardware implementations; 2) applied to training, SD leads to 10.56x and 4.48x reduction in the storage and training energy, with negligible accuracy loss compared to state-of-the-art training baselines.

HW-NAS-Bench: Hardware-Aware Neural Architecture Search Benchmark

no code implementations ICLR 2021 Chaojian Li, Zhongzhi Yu, Yonggan Fu, Yongan Zhang, Yang Zhao, Haoran You, Qixuan Yu, Yue Wang, Cong Hao, Yingyan Lin

To design HW-NAS-Bench, we carefully collected the measured/estimated hardware performance (e.g., energy cost and latency) of all the networks in the search space of both NAS-Bench-201 and FBNet, considering six hardware devices that fall into three categories (i.e., commercial edge devices, FPGA, and ASIC).

Hardware Aware Neural Architecture Search, Neural Architecture Search

Triple-Search: Differentiable Joint-Search of Networks, Precision, and Accelerators

no code implementations 1 Jan 2021 Yonggan Fu, Yongan Zhang, Haoran You, Yingyan Lin

First, when jointly searching for a network and its precision via differentiable search, there is a dilemma: either the memory consumption explodes or the search settles for sub-optimal designs.

FracTrain: Fractionally Squeezing Bit Savings Both Temporally and Spatially for Efficient DNN Training

1 code implementation NeurIPS 2020 Yonggan Fu, Haoran You, Yang Zhao, Yue Wang, Chaojian Li, Kailash Gopalakrishnan, Zhangyang Wang, Yingyan Lin

Recent breakthroughs in deep neural networks (DNNs) have fueled a tremendous demand for intelligent edge devices featuring on-site learning, while the practical realization of such systems remains a challenge due to the limited resources available at the edge and the required massive training costs for state-of-the-art (SOTA) DNNs.

Quantization

DNA: Differentiable Network-Accelerator Co-Search

no code implementations 28 Oct 2020 Yongan Zhang, Yonggan Fu, Weiwen Jiang, Chaojian Li, Haoran You, Meng Li, Vikas Chandra, Yingyan Lin

Powerful yet complex deep neural networks (DNNs) have fueled a booming demand for efficient DNN solutions to bring DNN-powered intelligence into numerous applications.

SmartExchange: Trading Higher-cost Memory Storage/Access for Lower-cost Computation

no code implementations 7 May 2020 Yang Zhao, Xiaohan Chen, Yue Wang, Chaojian Li, Haoran You, Yonggan Fu, Yuan Xie, Zhangyang Wang, Yingyan Lin

We present SmartExchange, an algorithm-hardware co-design framework to trade higher-cost memory storage/access for lower-cost computation, for energy-efficient inference of deep neural networks (DNNs).

Model Compression, Quantization

Drawing Early-Bird Tickets: Toward More Efficient Training of Deep Networks

1 code implementation ICLR 2020 Haoran You, Chaojian Li, Pengfei Xu, Yonggan Fu, Yue Wang, Xiaohan Chen, Richard G. Baraniuk, Zhangyang Wang, Yingyan Lin

Finally, we leverage the existence of EB tickets and the proposed mask distance to develop efficient training methods: first identify EB tickets via low-cost schemes, then train only the EB tickets toward the target accuracy.
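The mask-distance idea referenced above can be sketched as follows: compute a pruning mask at successive epochs and measure how much it changes; once consecutive masks stop moving, the early-bird ticket is drawn and training can switch to the pruned network. The channel-wise magnitude criterion, the simulated weight snapshots, and the 0.1 threshold below are illustrative assumptions, not the paper's exact recipe.

```python
# Sketch of comparing pruning masks from consecutive epochs and stopping
# early once they stabilize. Criterion and threshold are illustrative.
import torch

def channel_mask(weight, keep_ratio=0.5):
    """Keep the top `keep_ratio` output channels by L1 norm."""
    scores = weight.abs().sum(dim=tuple(range(1, weight.dim())))
    k = max(1, int(keep_ratio * scores.numel()))
    mask = torch.zeros_like(scores, dtype=torch.bool)
    mask[scores.topk(k).indices] = True
    return mask

def mask_distance(m1, m2):
    """Normalized Hamming distance between two pruning masks."""
    return (m1 ^ m2).float().mean().item()

# Pretend these are one layer's weights snapshotted at successive epochs,
# converging toward a final solution (noise shrinks over time).
base = torch.randn(32, 16, 3, 3)
snapshots = [base + torch.randn_like(base) * (0.5 / (e + 1)) for e in range(4)]
masks = [channel_mask(w) for w in snapshots]
for e in range(1, len(masks)):
    d = mask_distance(masks[e - 1], masks[e])
    print(f"epoch {e}: mask distance {d:.3f}")
    if d < 0.1:      # ticket has stabilized -> draw the early-bird ticket
        break
```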

Drawing Early-Bird Tickets: Towards More Efficient Training of Deep Networks

2 code implementations 26 Sep 2019 Haoran You, Chaojian Li, Pengfei Xu, Yonggan Fu, Yue Wang, Xiaohan Chen, Richard G. Baraniuk, Zhangyang Wang, Yingyan Lin

In this paper, we discover for the first time that the winning tickets can be identified at a very early training stage, which we term early-bird (EB) tickets, via low-cost training schemes (e.g., early stopping and low-precision training) at large learning rates.

Bayesian Cycle-Consistent Generative Adversarial Networks via Marginalizing Latent Sampling

1 code implementation 19 Nov 2018 Haoran You, Yu Cheng, Tianheng Cheng, Chunliang Li, Pan Zhou

We evaluate the proposed Bayesian CycleGAN on multiple benchmark datasets, including Cityscapes, Maps, and Monet2photo.

Image-to-Image Translation, Semantic Segmentation +1
