no code implementations • 12 Nov 2021 • Yuhong Song, Edwin Hsing-Mean Sha, Qingfeng Zhuge, Rui Xu, Yongzhuo Zhang, Bingzhe Li, Lei Yang
Unlike ML on the edge, TinyML operates under a limited energy supply and therefore places higher demands on low-power execution.
no code implementations • 19 Oct 2021 • Panjie Qi, Edwin Hsing-Mean Sha, Qingfeng Zhuge, Hongwu Peng, Shaoyi Huang, Zhenglun Kong, Yuhong Song, Bingbing Li
Our HP can achieve a higher sparsity ratio and is more flexible than other sparsity patterns.
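To make the notion of a sparsity pattern concrete, here is a minimal sketch of block-wise magnitude pruning; the block size, the masking rule, and the `block_prune_mask` helper are hypothetical illustrations, not the paper's HP scheme:

```python
import numpy as np

def block_prune_mask(weights, block_size=4, keep_ratio=0.5):
    """Hypothetical block-wise sparsity mask: within each row-block of
    `block_size` columns, keep the top `keep_ratio` fraction of weights
    by magnitude and zero the rest."""
    rows, cols = weights.shape
    mask = np.zeros_like(weights, dtype=bool)
    for r in range(rows):
        for c in range(0, cols, block_size):
            block = np.abs(weights[r, c:c + block_size])
            k = max(1, int(keep_ratio * block.size))
            keep = np.argsort(block)[-k:]       # indices of the k largest magnitudes
            mask[r, c + keep] = True
    return mask

w = np.random.randn(8, 16)
m = block_prune_mask(w)
print("sparsity ratio:", 1 - m.mean())          # fraction of zeroed weights
```

A more flexible pattern, in this framing, is one whose masking rule admits more shapes of retained weights per block while still keeping enough regularity for hardware-friendly indexing.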
no code implementations • 12 Feb 2021 • Yuhong Song, Weiwen Jiang, Bingbing Li, Panjie Qi, Qingfeng Zhuge, Edwin Hsing-Mean Sha, Sakyasingha Dasgupta, Yiyu Shi, Caiwen Ding
Specifically, RT3 integrates two levels of optimization. First, it applies an efficient BP as the first-step compression for resource-constrained mobile devices; then, RT3 heuristically generates a shrunken search space based on the first-level optimization and uses reinforcement learning to search multiple pattern sets with diverse sparsity for PP, supporting lightweight software reconfiguration that corresponds to the available frequency levels of DVFS (i.e., hardware reconfiguration). A sketch of this runtime pairing appears below.
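The sketch below illustrates the runtime side of that pairing, assuming each DVFS frequency level maps to one searched PP sparsity; the level names, the sparsity values, and the `apply_pattern_pruning` stand-in (simple magnitude thresholding in place of RT3's searched pattern sets) are all assumptions for illustration:

```python
import numpy as np

# Hypothetical mapping: each DVFS frequency level pairs with a PP sparsity
# searched offline; higher sparsity compensates for a lower clock frequency.
dvfs_pattern_sets = {
    "high_freq":   0.50,
    "medium_freq": 0.70,
    "low_freq":    0.85,
}

def apply_pattern_pruning(weights, sparsity):
    """Zero the smallest-magnitude weights to reach `sparsity`
    (a stand-in for swapping in a searched PP pattern set)."""
    flat = np.sort(np.abs(weights).ravel())
    threshold = flat[int(sparsity * flat.size)]
    return np.where(np.abs(weights) >= threshold, weights, 0.0)

def on_frequency_change(block_pruned_weights, freq_level):
    # A DVFS change (hardware reconfiguration) triggers the lightweight
    # software reconfiguration: load the PP set matched to this frequency.
    return apply_pattern_pruning(block_pruned_weights,
                                 dvfs_pattern_sets[freq_level])

w = np.random.randn(64, 64)                     # notionally already BP-compressed
w_low_power = on_frequency_change(w, "low_freq")
print("achieved sparsity:", np.mean(w_low_power == 0))
```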
1 code implementation • 6 Jul 2019 • Weiwen Jiang, Lei Yang, Edwin Sha, Qingfeng Zhuge, Shouzhen Gu, Sakyasingha Dasgupta, Yiyu Shi, Jingtong Hu
We propose a novel hardware and software co-exploration framework for efficient neural architecture search (NAS).
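One iteration of such a co-exploration loop could look like the following sketch, where random search stands in for the paper's RL-based controller; the toy search space, the accuracy proxy, the latency model, and the reward weighting are all illustrative assumptions, not the framework's published interface:

```python
import random

# Hypothetical toy co-design space: (filter count, hardware parallelism factor).
SEARCH_SPACE = [(f, p) for f in (16, 32, 64) for p in (2, 4, 8)]

def accuracy_proxy(filters):
    """Stand-in for training and validating a sampled architecture."""
    return min(1.0, 0.6 + 0.005 * filters)

def latency_model(filters, parallelism):
    """Stand-in analytical hardware model: work divided by parallelism."""
    return filters * filters / parallelism

def co_explore(latency_budget, iters=100, alpha=0.5):
    """Jointly score software (accuracy) and hardware (latency) choices,
    rejecting co-designs that violate the latency budget."""
    best, best_reward = None, float("-inf")
    for _ in range(iters):
        filters, parallelism = random.choice(SEARCH_SPACE)
        lat = latency_model(filters, parallelism)
        if lat > latency_budget:
            continue                    # prune infeasible co-designs early
        reward = accuracy_proxy(filters) - alpha * lat / latency_budget
        if reward > best_reward:
            best, best_reward = (filters, parallelism), reward
    return best

print(co_explore(latency_budget=1024))
```

The point of exploring both dimensions jointly, rather than fixing the hardware first, is that a design that looks suboptimal in accuracy alone can win once its hardware cost is scored in the same reward.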
no code implementations • 31 Jan 2019 • Weiwen Jiang, Xinyi Zhang, Edwin H.-M. Sha, Lei Yang, Qingfeng Zhuge, Yiyu Shi, Jingtong Hu
In addition, with a performance abstraction model that analyzes the latency of neural architectures without training them, our framework can quickly prune architectures that do not satisfy the specification, leading to higher search efficiency.
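Pruning before training might be sketched as follows; the linear MAC-based latency formula, the assumed throughput constant, and the candidate encoding are illustrative assumptions, not the paper's actual abstraction model:

```python
# Hypothetical sketch: estimate latency analytically from architecture
# parameters so infeasible candidates are discarded before any training.

def estimate_latency(layers):
    """Sum per-layer latency estimates (toy model: MAC count / throughput)."""
    THROUGHPUT = 1e9                    # assumed MACs per second
    macs = sum(c_in * c_out * k * k * h * w
               for (c_in, c_out, k, h, w) in layers)
    return macs / THROUGHPUT            # seconds

def prune_candidates(candidates, latency_spec):
    """Keep only architectures whose estimated latency meets the spec,
    so training effort is spent exclusively on feasible designs."""
    return [arch for arch in candidates
            if estimate_latency(arch) <= latency_spec]

# Each architecture: a list of (c_in, c_out, kernel, feat_h, feat_w) layers.
cands = [
    [(3, 32, 3, 224, 224), (32, 64, 3, 112, 112)],
    [(3, 64, 5, 224, 224), (64, 128, 5, 112, 112)],
]
print(len(prune_candidates(cands, latency_spec=0.5)), "candidate(s) survive")
```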