no code implementations • 11 Dec 2023 • Zheyu Yan, Xiaobo Sharon Hu, Yiyu Shi
In this study, we define the problem of pinpointing the worst-case performance of CiM DNN accelerators affected by device variations.
no code implementations • 11 Dec 2023 • Zheyu Yan, Xiaobo Sharon Hu, Yiyu Shi
In our research, we show that only a small fraction of the weights need to be programmed with write-verify while the DNN accuracy is preserved, yielding a notable speedup in programming.
no code implementations • 29 Jul 2023 • Zheyu Yan, Yifan Qin, Wujie Wen, Xiaobo Sharon Hu, Yiyu Shi
In this work, we propose to use the k-th percentile performance (KPP) to capture the realistic worst-case performance of DNN models executing on CiM accelerators.
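A minimal sketch of how such a percentile-based metric could be estimated via Monte Carlo sampling of device variations is given below; the Gaussian perturbation model and the `evaluate_accuracy` callback are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

def kth_percentile_performance(weights, evaluate_accuracy, k=1.0,
                               sigma=0.02, num_samples=1000, seed=0):
    """Estimate the k-th percentile of accuracy under random device variations.

    `weights` is a dict of numpy arrays; `evaluate_accuracy` maps perturbed
    weights to an accuracy. Both, and the additive Gaussian variation model,
    are placeholders for the paper's actual pipeline.
    """
    rng = np.random.default_rng(seed)
    accuracies = []
    for _ in range(num_samples):
        # Perturb every weight independently to mimic device variation.
        noisy = {name: w + rng.normal(0.0, sigma, size=w.shape)
                 for name, w in weights.items()}
        accuracies.append(evaluate_accuracy(noisy))
    # KPP: the accuracy that all but the worst k% of variation samples exceed.
    return np.percentile(accuracies, k)
```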
no code implementations • 12 Jun 2023 • Zheyu Yan, Yifan Qin, Xiaobo Sharon Hu, Yiyu Shi
In this study, we present a novel approach that leverages Large Language Models (LLMs) to address this issue.
1 code implementation • 23 May 2023 • Yifan Qin, Zheyu Yan, Wujie Wen, Xiaobo Sharon Hu, Yiyu Shi
However, the stochastic nature and intrinsic variations of NVM devices often result in performance degradation in DNN inference.
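To make this degradation concrete, the sketch below perturbs a trained PyTorch model's weights with an assumed relative Gaussian variation model; the actual NVM variation statistics in the paper may differ.

```python
import copy
import torch

def simulate_nvm_variation(model, rel_sigma=0.05, seed=0):
    """Return a copy of `model` whose weights are perturbed to mimic
    NVM conductance variation (additive Gaussian noise scaled by |w|).

    The variation model here is an illustrative assumption, not the
    one used in the paper.
    """
    torch.manual_seed(seed)
    noisy_model = copy.deepcopy(model)
    with torch.no_grad():
        for param in noisy_model.parameters():
            param.add_(torch.randn_like(param) * rel_sigma * param.abs())
    return noisy_model

# Usage: compare clean vs. perturbed accuracy on a validation set to
# quantify the degradation caused by device variation.
```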
2 code implementations • 9 Sep 2022 • Jing Gong, Hassaan Saadat, Hasindu Gamaarachchi, Haris Javaid, Xiaobo Sharon Hu, Sri Parameswaran
Compared to CPU-based approximate multiplier simulations in training and inference, the GPU-accelerated ApproxTrain is more than 2500x faster.
no code implementations • 15 Jul 2022 • Zheyu Yan, Xiaobo Sharon Hu, Yiyu Shi
In this work, we formulate the problem of determining the worst-case performance of CiM DNN accelerators under the impact of device variations.
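One heuristic way to approach such a worst case is a gradient-ascent search over bounded weight perturbations, sketched below in PyTorch; the perturbation bound, sign-ascent update, and single-batch loss are assumptions for illustration rather than the paper's formulation.

```python
import torch

def approx_worst_case_perturbation(model, loss_fn, inputs, targets,
                                   bound=0.03, steps=20, step_size=0.01):
    """Heuristically search for a per-weight perturbation within
    [-bound, +bound] that maximizes the loss, as a proxy for the
    worst-case impact of device variations (illustrative only)."""
    clean = [p.detach().clone() for p in model.parameters()]
    deltas = [torch.zeros_like(p) for p in model.parameters()]
    for _ in range(steps):
        with torch.no_grad():
            for p, p0, d in zip(model.parameters(), clean, deltas):
                p.copy_(p0 + d)                    # apply the candidate perturbation
        model.zero_grad()
        loss = loss_fn(model(inputs), targets)
        loss.backward()
        with torch.no_grad():
            for p, d in zip(model.parameters(), deltas):
                d.add_(step_size * p.grad.sign())  # ascend the loss
                d.clamp_(-bound, bound)            # stay within the variation bound
    with torch.no_grad():                          # restore the clean weights
        for p, p0 in zip(model.parameters(), clean):
            p.copy_(p0)
    return deltas
```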
1 code implementation • 17 Feb 2022 • Zheyu Yan, Xiaobo Sharon Hu, Yiyu Shi
In this work, we show that it is only necessary to select a small portion of the weights for write-verify to maintain the DNN accuracy, thus achieving significant speedup.
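As a rough illustration, the snippet below selects the most sensitive weights for the slow write-verify scheme and leaves the rest to a single fast write; the sensitivity proxy and the selection fraction are assumptions, not the paper's actual criterion.

```python
import numpy as np

def select_weights_for_write_verify(sensitivity, fraction=0.01):
    """Pick the indices of the most variation-sensitive weights.

    `sensitivity` is a flat array of per-weight importance scores
    (e.g., a gradient- or Hessian-based proxy); the scoring metric
    used in the paper may differ. Only the top `fraction` of weights
    are programmed with write-verify; the rest get a single fast write.
    """
    num_selected = max(1, int(fraction * sensitivity.size))
    # argsort ascending, then take the largest-scoring indices.
    return np.argsort(sensitivity)[-num_selected:]
```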
no code implementations • 13 Sep 2021 • Zheyu Yan, Weiwen Jiang, Xiaobo Sharon Hu, Yiyu Shi
To the best of the authors' knowledge, this is the first DNAS framework that can handle large search spaces with bounded memory usage.
no code implementations • 6 Jul 2021 • Zheyu Yan, Da-Cheng Juan, Xiaobo Sharon Hu, Yiyu Shi
Emerging-device-based computing-in-memory (CiM) has been shown to be a promising candidate for highly energy-efficient deep neural network (DNN) computation.
no code implementations • 31 Oct 2019 • Weiwen Jiang, Qiuwen Lou, Zheyu Yan, Lei Yang, Jingtong Hu, Xiaobo Sharon Hu, Yiyu Shi
In this paper, we are the first to bring the computing-in-memory architecture, which can easily transcend the memory wall, into the neural architecture search loop, aiming to find the most efficient neural architectures with high network accuracy and maximized hardware efficiency.
no code implementations • 29 May 2017 • Xiaoming Chen, Jianxu Chen, Danny Z. Chen, Xiaobo Sharon Hu
The high computation throughput and memory bandwidth of graphics processing units (GPUs) make them a natural choice for accelerating convolution operations.