Search Results for author: Yonggan Fu

Found 44 papers, 19 papers with code

Omni-Recon: Towards General-Purpose Neural Radiance Fields for Versatile 3D Applications

no code implementations 17 Mar 2024 Yonggan Fu, Huaizhi Qu, Zhifan Ye, Chaojian Li, Kevin Zhao, Yingyan Lin

Specifically, our Omni-Recon features a general-purpose NeRF model using image-based rendering with two decoupled branches: one complex transformer-based branch that progressively fuses geometry and appearance features for accurate geometry estimation, and one lightweight branch for predicting blending weights of source views.

3D Reconstruction · Scene Understanding +1
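Below is a rough PyTorch sketch of the two decoupled branches described above (a transformer-based geometry branch plus a lightweight blending-weight branch). All module sizes, names, and rendering details are illustrative assumptions, not the Omni-Recon implementation.

```python
import torch
import torch.nn as nn

class TwoBranchRenderer(nn.Module):
    """Toy image-based-rendering head: fuse per-view features for geometry,
    and separately predict blending weights over the source views."""
    def __init__(self, feat_dim=64, n_heads=4):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=feat_dim, nhead=n_heads, batch_first=True)
        self.geometry_branch = nn.TransformerEncoder(layer, num_layers=2)  # heavier branch
        self.density_head = nn.Linear(feat_dim, 1)
        self.blend_branch = nn.Sequential(  # lightweight branch
            nn.Linear(feat_dim, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, view_feats, view_colors):
        # view_feats: (B, V, C) features sampled from V source views for a 3D point
        # view_colors: (B, V, 3) the corresponding RGB samples
        fused = self.geometry_branch(view_feats)            # cross-view fusion
        density = self.density_head(fused.mean(dim=1))      # (B, 1) geometry estimate
        weights = torch.softmax(self.blend_branch(view_feats), dim=1)  # (B, V, 1)
        color = (weights * view_colors).sum(dim=1)          # (B, 3) blended color
        return density, color

sigma, rgb = TwoBranchRenderer()(torch.randn(8, 4, 64), torch.rand(8, 4, 3))
print(sigma.shape, rgb.shape)  # torch.Size([8, 1]) torch.Size([8, 3])
```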

Towards Cognitive AI Systems: a Survey and Prospective on Neuro-Symbolic AI

no code implementations 2 Jan 2024 Zishen Wan, Che-Kai Liu, Hanchen Yang, Chaojian Li, Haoran You, Yonggan Fu, Cheng Wan, Tushar Krishna, Yingyan Lin, Arijit Raychowdhury

The remarkable advancements in artificial intelligence (AI), primarily driven by deep neural networks, have significantly impacted various aspects of our lives.

NetDistiller: Empowering Tiny Deep Learning via In-Situ Distillation

no code implementations 24 Oct 2023 Shunyao Zhang, Yonggan Fu, Shang Wu, Jyotikrishna Dass, Haoran You, Yingyan Lin

To this end, we propose a framework called NetDistiller to boost the achievable accuracy of TNNs by treating them as sub-networks of a weight-sharing teacher constructed by expanding the number of channels of the TNN.
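As a toy sketch of that in-situ weight-sharing idea (my own simplification with made-up layer sizes, not the NetDistiller code), the tiny network below reuses the first half of a wider teacher's channels, and both are trained jointly with a distillation term:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedMLP(nn.Module):
    """Teacher and tiny student share the same weights; the student simply
    uses the first `width` fraction of the hidden channels."""
    def __init__(self, d_in=32, d_hidden=64, n_cls=10):
        super().__init__()
        self.fc1 = nn.Linear(d_in, d_hidden)
        self.fc2 = nn.Linear(d_hidden, n_cls)

    def forward(self, x, width=1.0):
        h = int(self.fc1.out_features * width)  # sub-network keeps the first h channels
        hid = F.relu(F.linear(x, self.fc1.weight[:h], self.fc1.bias[:h]))
        return F.linear(hid, self.fc2.weight[:, :h], self.fc2.bias)

net = SharedMLP()
x, y = torch.randn(16, 32), torch.randint(0, 10, (16,))
teacher_logits = net(x, width=1.0)   # full-width "teacher"
student_logits = net(x, width=0.5)   # tiny sub-network
loss = (F.cross_entropy(teacher_logits, y) + F.cross_entropy(student_logits, y)
        + F.kl_div(F.log_softmax(student_logits, dim=-1),
                   F.softmax(teacher_logits.detach(), dim=-1), reduction="batchmean"))
loss.backward()
```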

GPT4AIGChip: Towards Next-Generation AI Accelerator Design Automation via Large Language Models

no code implementations 19 Sep 2023 Yonggan Fu, Yongan Zhang, Zhongzhi Yu, Sixu Li, Zhifan Ye, Chaojian Li, Cheng Wan, Yingyan Lin

To our knowledge, this work is the first to demonstrate an effective pipeline for LLM-powered automated AI accelerator generation.

In-Context Learning

Master-ASR: Achieving Multilingual Scalability and Low-Resource Adaptation in ASR with Modular Learning

no code implementations 23 Jun 2023 Zhongzhi Yu, Yang Zhang, Kaizhi Qian, Yonggan Fu, Yingyan Lin

Despite the impressive performance recently achieved by automatic speech recognition (ASR), we observe two primary challenges that hinder its broader applications: (1) the difficulty of introducing scalability into the model to support more languages with limited training, inference, and storage overhead; and (2) the difficulty of achieving effective low-resource adaptation while avoiding over-fitting and catastrophic forgetting.

Automatic Speech Recognition · Automatic Speech Recognition (ASR) +2

NetBooster: Empowering Tiny Deep Learning By Standing on the Shoulders of Deep Giants

no code implementations 23 Jun 2023 Zhongzhi Yu, Yonggan Fu, Jiayi Yuan, Haoran You, Yingyan Lin

Tiny deep learning has attracted increasing attention driven by the substantial demand for deploying deep learning on numerous intelligent Internet-of-Things devices.

NeRFool: Uncovering the Vulnerability of Generalizable Neural Radiance Fields against Adversarial Perturbations

1 code implementation 10 Jun 2023 Yonggan Fu, Ye Yuan, Souvik Kundu, Shang Wu, Shunyao Zhang, Yingyan Lin

Generalizable Neural Radiance Fields (GNeRF) are one of the most promising real-world solutions for novel view synthesis, thanks to their cross-scene generalization capability and thus the possibility of instant rendering on new scenes.

Adversarial Robustness · Novel View Synthesis

Hint-Aug: Drawing Hints from Foundation Vision Transformers Towards Boosted Few-Shot Parameter-Efficient Tuning

1 code implementation CVPR 2023 Zhongzhi Yu, Shang Wu, Yonggan Fu, Shunyao Zhang, Yingyan Lin

To tackle this challenge, we first identify an opportunity for FViTs in few-shot tuning: pretrained FViTs themselves have already learned highly representative features from large-scale pretraining data, which are fully preserved during widely used parameter-efficient tuning.

Data Augmentation

Robust Tickets Can Transfer Better: Drawing More Transferable Subnetworks in Transfer Learning

no code implementations 24 Apr 2023 Yonggan Fu, Ye Yuan, Shang Wu, Jiayi Yuan, Yingyan Lin

Transfer learning leverages feature representations of deep neural networks (DNNs) pretrained on source tasks with rich data to empower effective finetuning on downstream tasks.

Adversarial Robustness · Transfer Learning

Auto-CARD: Efficient and Robust Codec Avatar Driving for Real-time Mobile Telepresence

no code implementations CVPR 2023 Yonggan Fu, Yuecheng Li, Chenghui Li, Jason Saragih, Peizhao Zhang, Xiaoliang Dai, Yingyan Lin

Real-time and robust photorealistic avatars have been highly desired for enabling immersive telepresence in AR/VR.

Neural Architecture Search

Gen-NeRF: Efficient and Generalizable Neural Radiance Fields via Algorithm-Hardware Co-Design

no code implementations 24 Apr 2023 Yonggan Fu, Zhifan Ye, Jiayi Yuan, Shunyao Zhang, Sixu Li, Haoran You, Yingyan Lin

Novel view synthesis is an essential functionality for enabling immersive experiences in various Augmented- and Virtual-Reality (AR/VR) applications, for which generalizable Neural Radiance Fields (NeRFs) have gained increasing popularity thanks to their cross-scene generalization capability.

Generalizable Novel View Synthesis · Novel View Synthesis

e-G2C: A 0.14-to-8.31 $\mu$J/Inference NN-based Processor with Continuous On-chip Adaptation for Anomaly Detection and ECG Conversion from EGM

no code implementations 24 Jul 2022 Yang Zhao, Yongan Zhang, Yonggan Fu, Xu Ouyang, Cheng Wan, Shang Wu, Anton Banta, Mathews M. John, Allison Post, Mehdi Razavi, Joseph Cavallaro, Behnaam Aazhang, Yingyan Lin

This work presents the first silicon-validated dedicated EGM-to-ECG (G2C) processor, dubbed e-G2C, featuring continuous lightweight anomaly detection, event-driven coarse/precise conversion, and on-chip adaptation.

Anomaly Detection

DepthShrinker: A New Compression Paradigm Towards Boosting Real-Hardware Efficiency of Compact Neural Networks

1 code implementation 2 Jun 2022 Yonggan Fu, Haichuan Yang, Jiayi Yuan, Meng Li, Cheng Wan, Raghuraman Krishnamoorthi, Vikas Chandra, Yingyan Lin

Efficient deep neural network (DNN) models equipped with compact operators (e.g., depthwise convolutions) have shown great potential in reducing DNNs' theoretical complexity (e.g., the total number of weights/operations) while maintaining a decent model accuracy.
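For intuition about the "theoretical complexity" reduction mentioned above, here is a back-of-the-envelope MAC count comparing a standard 3x3 convolution with a depthwise-separable one (illustrative arithmetic only; the numbers are not from the paper):

```python
def conv_macs(h, w, c_in, c_out, k):
    """Multiply-accumulates of a standard k x k convolution."""
    return h * w * c_in * c_out * k * k

def dws_conv_macs(h, w, c_in, c_out, k):
    """Depthwise (k x k per channel) followed by pointwise (1 x 1) convolution."""
    return h * w * c_in * k * k + h * w * c_in * c_out

std = conv_macs(56, 56, 128, 128, 3)
dws = dws_conv_macs(56, 56, 128, 128, 3)
print(f"standard: {std:,} MACs, depthwise-separable: {dws:,} MACs ({std / dws:.1f}x fewer)")
# standard: 462,422,016 MACs, depthwise-separable: 54,992,896 MACs (8.4x fewer)
```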

ShiftAddNAS: Hardware-Inspired Search for More Accurate and Efficient Neural Networks

1 code implementation 17 May 2022 Haoran You, Baopu Li, Huihong Shi, Yonggan Fu, Yingyan Lin

To this end, this work advocates hybrid NNs that consist of both powerful yet costly multiplications and efficient yet less powerful operators for marrying the best of both worlds, and proposes ShiftAddNAS, which can automatically search for more accurate and more efficient NNs.
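The toy layers below illustrate the kind of multiplication-light operators (bit-shift and add) that such hybrid networks mix with ordinary multiplication-based layers; they are a sketch under my own assumptions, not the ShiftAddNAS operators or search code.

```python
import torch
import torch.nn as nn

class ShiftLinear(nn.Module):
    """Weights rounded to signed powers of two, so each multiply becomes a bit-shift."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(d_out, d_in) * 0.1)

    def forward(self, x):
        sign = torch.sign(self.weight)
        exponent = torch.round(torch.log2(self.weight.abs().clamp(min=1e-8)))
        w_pow2 = sign * torch.pow(2.0, exponent)   # e.g. 0.13 -> 0.125
        return x @ w_pow2.t()

class AddLinear(nn.Module):
    """Adder-style layer: similarity via negative L1 distance instead of dot products."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(d_out, d_in) * 0.1)

    def forward(self, x):
        # (B, 1, d_in) vs (1, d_out, d_in) -> (B, d_out)
        return -(x.unsqueeze(1) - self.weight.unsqueeze(0)).abs().sum(dim=-1)

x = torch.randn(4, 16)
print(ShiftLinear(16, 8)(x).shape, AddLinear(16, 8)(x).shape)  # (4, 8) and (4, 8)
```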

Patch-Fool: Are Vision Transformers Always Robust Against Adversarial Perturbations?

1 code implementation ICLR 2022 Yonggan Fu, Shunyao Zhang, Shang Wu, Cheng Wan, Yingyan Lin

In particular, recent works show that ViTs are more robust against adversarial attacks as compared with convolutional neural networks (CNNs), and conjecture that this is because ViTs focus more on capturing global interactions among different input/feature patches, leading to their improved robustness to local perturbations imposed by adversarial attacks.

LDP: Learnable Dynamic Precision for Efficient Deep Neural Network Training and Inference

no code implementations 15 Mar 2022 Zhongzhi Yu, Yonggan Fu, Shang Wu, Mengquan Li, Haoran You, Yingyan Lin

While existing works mostly fix the model precision during the whole training process, a few pioneering works have shown that dynamic precision schedules help DNNs converge to a better accuracy while leading to a lower training cost than their static precision training counterparts.

MIA-Former: Efficient and Robust Vision Transformers via Multi-grained Input-Adaptation

no code implementations 21 Dec 2021 Zhongzhi Yu, Yonggan Fu, Sicheng Li, Chaojian Li, Yingyan Lin

ViTs are often too computationally expensive to fit onto real-world resource-constrained devices, due to (1) their complexity, which grows quadratically with the number of input tokens, and (2) their overparameterized self-attention heads and model depth.

RT-RCG: Neural Network and Accelerator Search Towards Effective and Real-time ECG Reconstruction from Intracardiac Electrograms

no code implementations 4 Nov 2021 Yongan Zhang, Anton Banta, Yonggan Fu, Mathews M. John, Allison Post, Mehdi Razavi, Joseph Cavallaro, Behnaam Aazhang, Yingyan Lin

To close this gap and make a heuristic step towards real-time critical intervention in instant response to irregular and infrequent ventricular rhythms, we propose a new framework dubbed RT-RCG to automatically search for (1) efficient Deep Neural Network (DNN) structures and then (2) corresponding accelerators, to enable Real-Time and high-quality Reconstruction of ECG signals from EGM signals.

Navigate · Neural Architecture Search

Drawing Robust Scratch Tickets: Subnetworks with Inborn Robustness Are Found within Randomly Initialized Networks

1 code implementation NeurIPS 2021 Yonggan Fu, Qixuan Yu, Yang Zhang, Shang Wu, Xu Ouyang, David Cox, Yingyan Lin

Deep Neural Networks (DNNs) are known to be vulnerable to adversarial attacks, i.e., an imperceptible perturbation to the input can mislead DNNs trained on clean images into making erroneous predictions.

Adversarial Robustness
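For readers unfamiliar with such perturbations, the standard one-step FGSM attack below generates one; it only illustrates the threat model and is unrelated to how the paper finds robust scratch tickets.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # stand-in classifier
x = torch.rand(1, 1, 28, 28, requires_grad=True)             # a "clean" input in [0, 1]
y = torch.tensor([3])                                         # its true label

loss = F.cross_entropy(model(x), y)
loss.backward()                                               # gradient w.r.t. the input

epsilon = 8 / 255                                             # small, visually imperceptible budget
x_adv = (x + epsilon * x.grad.sign()).clamp(0, 1)             # one-step FGSM perturbation
print(f"max pixel change: {(x_adv - x).abs().max().item():.4f}")  # bounded by epsilon
```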

D$^2$-GCN: Data-Dependent GCNs for Boosting Both Efficiency and Scalability

no code implementations 29 Sep 2021 Chaojian Li, Xu Ouyang, Yang Zhao, Haoran You, Yonggan Fu, Yuchen Gu, Haonan Liu, Siyuan Miao, Yingyan Lin

Graph Convolutional Networks (GCNs) have gained increasing attention thanks to their state-of-the-art (SOTA) performance in graph-based learning tasks.

Contrastive Quant: Quantization Makes Stronger Contrastive Learning

no code implementations 29 Sep 2021 Yonggan Fu, Qixuan Yu, Meng Li, Xu Ouyang, Vikas Chandra, Yingyan Lin

Contrastive learning, which learns visual representations by enforcing feature consistency under different augmented views, has emerged as one of the most effective unsupervised learning methods.

Contrastive Learning · Quantization

G-CoS: GNN-Accelerator Co-Search Towards Both Better Accuracy and Efficiency

no code implementations 18 Sep 2021 Yongan Zhang, Haoran You, Yonggan Fu, Tong Geng, Ang Li, Yingyan Lin

While end-to-end jointly optimizing GNNs and their accelerators is promising in boosting GNNs' inference efficiency and expediting the design process, it is still underexplored due to the vast and distinct design spaces of GNNs and their accelerators.

2-in-1 Accelerator: Enabling Random Precision Switch for Winning Both Adversarial Robustness and Efficiency

no code implementations 11 Sep 2021 Yonggan Fu, Yang Zhao, Qixuan Yu, Chaojian Li, Yingyan Lin

The recent breakthroughs of deep neural networks (DNNs) and the advent of billions of Internet of Things (IoT) devices have excited an explosive demand for intelligent IoT devices equipped with domain-specific DNN accelerators.

Adversarial Robustness · Quantization

O-HAS: Optical Hardware Accelerator Search for Boosting Both Acceleration Performance and Development Speed

no code implementations 17 Aug 2021 Mengquan Li, Zhongzhi Yu, Yongan Zhang, Yonggan Fu, Yingyan Lin

The recent breakthroughs and prohibitive complexities of Deep Neural Networks (DNNs) have excited extensive interest in domain-specific DNN accelerators, among which optical DNN accelerators are particularly promising thanks to their unprecedented potential of achieving superior performance-per-watt.

DANCE: DAta-Network Co-optimization for Efficient Segmentation Model Training and Inference

no code implementations 16 Jul 2021 Chaojian Li, Wuyang Chen, Yuchen Gu, Tianlong Chen, Yonggan Fu, Zhangyang Wang, Yingyan Lin

Semantic segmentation for scene understanding is nowadays widely demanded, raising significant challenges for algorithm efficiency, especially for applications on resource-limited platforms.

Scene Understanding · Segmentation +1

Auto-NBA: Efficient and Effective Search Over the Joint Space of Networks, Bitwidths, and Accelerators

1 code implementation 11 Jun 2021 Yonggan Fu, Yongan Zhang, Yang Zhang, David Cox, Yingyan Lin

The key challenges include (1) the dilemma of whether to explode the memory consumption due to the huge joint space or achieve sub-optimal designs, (2) the discrete nature of the accelerator design space that is coupled yet different from that of the networks and bitwidths, and (3) the chicken-and-egg problem associated with network-accelerator co-search, i.e., co-search requires operation-wise hardware cost, which is lacking during search since the optimal accelerator, which depends on the whole network, is still unknown.

A3C-S: Automated Agent Accelerator Co-Search towards Efficient Deep Reinforcement Learning

no code implementations 11 Jun 2021 Yonggan Fu, Yongan Zhang, Chaojian Li, Zhongzhi Yu, Yingyan Lin

Driven by the explosive interest in applying deep reinforcement learning (DRL) agents to numerous real-time control and decision-making applications, there has been a growing demand to deploy DRL agents to empower daily-life intelligent devices, while the prohibitive complexity of DRL stands at odds with limited on-device resources.

Decision Making · reinforcement-learning +1

InstantNet: Automated Generation and Deployment of Instantaneously Switchable-Precision Networks

1 code implementation 22 Apr 2021 Yonggan Fu, Zhongzhi Yu, Yongan Zhang, Yifan Jiang, Chaojian Li, Yongyuan Liang, Mingchao Jiang, Zhangyang Wang, Yingyan Lin

The promise of Deep Neural Network (DNN) powered Internet of Things (IoT) devices has motivated a tremendous demand for automated solutions to enable fast development and deployment of efficient (1) DNNs equipped with instantaneous accuracy-efficiency trade-off capability to accommodate the time-varying resources at IoT devices and (2) dataflows to optimize DNNs' execution efficiency on different devices.

HW-NAS-Bench: Hardware-Aware Neural Architecture Search Benchmark

1 code implementation 19 Mar 2021 Chaojian Li, Zhongzhi Yu, Yonggan Fu, Yongan Zhang, Yang Zhao, Haoran You, Qixuan Yu, Yue Wang, Yingyan Lin

To design HW-NAS-Bench, we carefully collected the measured/estimated hardware performance of all the networks in the search spaces of both NAS-Bench-201 and FBNet, on six hardware devices that fall into three categories (i.e., commercial edge devices, FPGA, and ASIC).

Hardware Aware Neural Architecture Search · Neural Architecture Search

Early-Bird GCNs: Graph-Network Co-Optimization Towards More Efficient GCN Training and Inference via Drawing Early-Bird Lottery Tickets

2 code implementations 1 Mar 2021 Haoran You, Zhihan Lu, Zijian Zhou, Yonggan Fu, Yingyan Lin

Experiments on various GCN models and datasets consistently validate our GEB finding and the effectiveness of our GEBT, e.g., our GEBT achieves up to 80.2%~85.6% and 84.6%~87.5% savings of GCN training and inference costs while offering a comparable or even better accuracy as compared to state-of-the-art methods.

Representation Learning

CPT: Efficient Deep Neural Network Training via Cyclic Precision

1 code implementation ICLR 2021 Yonggan Fu, Han Guo, Meng Li, Xin Yang, Yining Ding, Vikas Chandra, Yingyan Lin

In this paper, we attempt to explore low-precision training from a new perspective as inspired by recent findings in understanding DNN training: we conjecture that DNNs' precision might have a similar effect as the learning rate during DNN training, and advocate dynamic precision along the training trajectory for further boosting the time/energy efficiency of DNN training.

Language Modelling
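A minimal cyclic precision schedule in the spirit described above, analogous to a cyclic learning-rate schedule; the period, bit bounds, and cosine shape are my own assumptions rather than CPT's exact schedule.

```python
import math

def cyclic_precision(step, period=200, min_bits=4, max_bits=8):
    """Cosine-cycle the training bitwidth between min_bits and max_bits."""
    phase = (step % period) / period
    bits = min_bits + 0.5 * (max_bits - min_bits) * (1 - math.cos(2 * math.pi * phase))
    return int(round(bits))

# Low-precision phases perturb training much like a large learning rate would,
# while higher-precision phases refine the solution.
print([cyclic_precision(s) for s in range(0, 201, 25)])  # [4, 5, 6, 7, 8, 7, 6, 5, 4]
```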

SmartDeal: Re-Modeling Deep Network Weights for Efficient Inference and Training

1 code implementation 4 Jan 2021 Xiaohan Chen, Yang Zhao, Yue Wang, Pengfei Xu, Haoran You, Chaojian Li, Yonggan Fu, Yingyan Lin, Zhangyang Wang

Results show that: 1) applied to inference, SD achieves up to 2.44x energy efficiency as evaluated via real hardware implementations; 2) applied to training, SD leads to 10.56x and 4.48x reduction in the storage and training energy, with negligible accuracy loss compared to state-of-the-art training baselines.

HW-NAS-Bench: Hardware-Aware Neural Architecture Search Benchmark

no code implementations ICLR 2021 Chaojian Li, Zhongzhi Yu, Yonggan Fu, Yongan Zhang, Yang Zhao, Haoran You, Qixuan Yu, Yue Wang, Cong Hao, Yingyan Lin

To design HW-NAS-Bench, we carefully collected the measured/estimated hardware performance (e.g., energy cost and latency) of all the networks in the search space of both NAS-Bench-201 and FBNet, considering six hardware devices that fall into three categories (i.e., commercial edge devices, FPGA, and ASIC).

Hardware Aware Neural Architecture Search · Neural Architecture Search

SACoD: Sensor Algorithm Co-Design Towards Efficient CNN-powered Intelligent PhlatCam

1 code implementation ICCV 2021 Yonggan Fu, Yang Zhang, Yue Wang, Zhihan Lu, Vivek Boominathan, Ashok Veeraraghavan, Yingyan Lin

PhlatCam, with its form factor potentially reduced by orders of magnitude, has emerged as a promising solution to the first aforementioned challenge, while the second one remains a bottleneck.

Benchmarking · Model Compression +1

Triple-Search: Differentiable Joint-Search of Networks, Precision, and Accelerators

no code implementations 1 Jan 2021 Yonggan Fu, Yongan Zhang, Haoran You, Yingyan Lin

First, to jointly search for a network and its precision via differentiable search, there exists a dilemma of whether to explode the memory consumption or achieve sub-optimal designs.

FracTrain: Fractionally Squeezing Bit Savings Both Temporally and Spatially for Efficient DNN Training

1 code implementation NeurIPS 2020 Yonggan Fu, Haoran You, Yang Zhao, Yue Wang, Chaojian Li, Kailash Gopalakrishnan, Zhangyang Wang, Yingyan Lin

Recent breakthroughs in deep neural networks (DNNs) have fueled a tremendous demand for intelligent edge devices featuring on-site learning, while the practical realization of such systems remains a challenge due to the limited resources available at the edge and the required massive training costs for state-of-the-art (SOTA) DNNs.

Quantization

Auto-Agent-Distiller: Towards Efficient Deep Reinforcement Learning Agents via Neural Architecture Search

no code implementations 24 Dec 2020 Yonggan Fu, Zhongzhi Yu, Yongan Zhang, Yingyan Lin

We therefore propose an Auto-Agent-Distiller (A2D) framework, which to our best knowledge is the first neural architecture search (NAS) applied to DRL to automatically search for the optimal DRL agents for various tasks that optimize both the test scores and efficiency.

Neural Architecture Search · reinforcement-learning +1

DNA: Differentiable Network-Accelerator Co-Search

no code implementations 28 Oct 2020 Yongan Zhang, Yonggan Fu, Weiwen Jiang, Chaojian Li, Haoran You, Meng Li, Vikas Chandra, Yingyan Lin

Powerful yet complex deep neural networks (DNNs) have fueled a booming demand for efficient DNN solutions to bring DNN-powered intelligence into numerous applications.

AutoGAN-Distiller: Searching to Compress Generative Adversarial Networks

3 code implementations ICML 2020 Yonggan Fu, Wuyang Chen, Haotao Wang, Haoran Li, Yingyan Lin, Zhangyang Wang

Inspired by the recent success of AutoML in deep compression, we introduce AutoML to GAN compression and develop an AutoGAN-Distiller (AGD) framework.

AutoML · Knowledge Distillation +2

SmartExchange: Trading Higher-cost Memory Storage/Access for Lower-cost Computation

no code implementations 7 May 2020 Yang Zhao, Xiaohan Chen, Yue Wang, Chaojian Li, Haoran You, Yonggan Fu, Yuan Xie, Zhangyang Wang, Yingyan Lin

We present SmartExchange, an algorithm-hardware co-design framework to trade higher-cost memory storage/access for lower-cost computation, for energy-efficient inference of deep neural networks (DNNs).

Model Compression · Quantization

Drawing Early-Bird Tickets: Toward More Efficient Training of Deep Networks

1 code implementation ICLR 2020 Haoran You, Chaojian Li, Pengfei Xu, Yonggan Fu, Yue Wang, Xiaohan Chen, Richard G. Baraniuk, Zhangyang Wang, Yingyan Lin

Finally, we leverage the existence of EB tickets and the proposed mask distance to develop efficient training methods, which are achieved by first identifying EB tickets via low-cost schemes, and then continuing to train merely the EB tickets towards the target accuracy.
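A toy version of the mask-distance idea referenced above: recompute a magnitude-pruning mask each epoch and declare an early-bird ticket once consecutive masks stop changing. The pruning criterion, threshold, and stand-in training loop are illustrative assumptions, not the paper's code.

```python
import torch

def prune_mask(weight, keep_ratio=0.5):
    """Magnitude-based binary mask keeping the largest |w| entries."""
    k = int(weight.numel() * keep_ratio)
    threshold = weight.abs().flatten().kthvalue(weight.numel() - k + 1).values
    return (weight.abs() >= threshold).float()

def mask_distance(m1, m2):
    """Fraction of positions on which two pruning masks disagree."""
    return (m1 != m2).float().mean().item()

w, prev_mask = torch.randn(256, 256), None
for epoch in range(5):
    w = w + 0.01 * torch.randn_like(w)   # stand-in for one epoch of weight updates
    mask = prune_mask(w)
    if prev_mask is not None and mask_distance(mask, prev_mask) < 0.05:
        print(f"mask stabilized at epoch {epoch}: early-bird ticket drawn")
        break
    prev_mask = mask
```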

Fractional Skipping: Towards Finer-Grained Dynamic CNN Inference

1 code implementation 3 Jan 2020 Jianghao Shen, Yonggan Fu, Yue Wang, Pengfei Xu, Zhangyang Wang, Yingyan Lin

The core idea of DFS is to hypothesize layer-wise quantization (to different bitwidths) as intermediate "soft" choices to be made between fully utilizing and skipping a layer.

Quantization
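A toy gating layer illustrating the "soft" choice among skipping a layer and executing it at different bitwidths; the gating mechanism and fake quantizer are my own simplifications, not the DFS implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fake_quant(x, bits):
    """Uniform fake quantization; bitwidth 0 is treated as skipping the layer."""
    if bits == 0:
        return torch.zeros_like(x)
    scale = (2 ** bits - 1) / x.abs().max().clamp(min=1e-8)
    return torch.round(x * scale) / scale

class FractionalLayer(nn.Module):
    def __init__(self, dim=32, choices=(0, 4, 8)):
        super().__init__()
        self.fc = nn.Linear(dim, dim)
        self.gate = nn.Linear(dim, len(choices))   # input-dependent choice over bitwidths
        self.choices = choices

    def forward(self, x):
        probs = F.softmax(self.gate(x.mean(dim=0, keepdim=True)), dim=-1).squeeze(0)
        out = self.fc(x)
        mixed = sum(p * fake_quant(out, b) for p, b in zip(probs, self.choices))
        return x + mixed                            # residual keeps the skip path meaningful

print(FractionalLayer()(torch.randn(8, 32)).shape)  # torch.Size([8, 32])
```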

Drawing Early-Bird Tickets: Towards More Efficient Training of Deep Networks

2 code implementations 26 Sep 2019 Haoran You, Chaojian Li, Pengfei Xu, Yonggan Fu, Yue Wang, Xiaohan Chen, Richard G. Baraniuk, Zhangyang Wang, Yingyan Lin

In this paper, we discover for the first time that the winning tickets can be identified at the very early training stage, which we term early-bird (EB) tickets, via low-cost training schemes (e.g., early stopping and low-precision training) at large learning rates.
