Search Results for author: Runsheng Wang

Found 12 papers, 1 paper with code

EasyACIM: An End-to-End Automated Analog CIM with Synthesizable Architecture and Agile Design Space Exploration

no code implementations • 12 Apr 2024 • Haoyi Zhang, Jiahao Song, Xiaohan Gao, Xiyuan Tang, Yibo Lin, Runsheng Wang, Ru Huang

Leveraging the multi-objective genetic algorithm (MOGA)-based design space explorer, EasyACIM can obtain high-quality ACIM solutions based on the proposed synthesizable architecture, targeting versatile application scenarios.

Edge-computing
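
As background on the MOGA-based exploration mentioned above, the following is a minimal Python sketch of a multi-objective genetic loop over a toy ACIM design space. The design encoding, cost models, and selection scheme are illustrative assumptions only, not EasyACIM's actual explorer.

```python
import random

# Hypothetical ACIM design point: (rows, cols, adc_bits). The two cost models
# below are toy stand-ins for energy and latency, NOT EasyACIM's evaluators.
def objectives(design):
    rows, cols, adc_bits = design
    energy = rows * cols * (2 ** adc_bits) * 1e-6    # toy energy model
    latency = 1e4 / (rows * cols) + 0.5 * adc_bits   # toy latency model
    return (energy, latency)

def dominates(a, b):
    """True if objective vector a Pareto-dominates b (both minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def mutate(design):
    rows, cols, adc_bits = design
    return (random.choice([64, 128, 256]),
            random.choice([64, 128, 256]),
            min(8, max(2, adc_bits + random.choice([-1, 0, 1]))))

def moga(pop_size=32, generations=50):
    pop = [mutate((128, 128, 4)) for _ in range(pop_size)]
    for _ in range(generations):
        union = pop + [mutate(random.choice(pop)) for _ in range(pop_size)]
        fits = {d: objectives(d) for d in set(union)}
        # Survivors: the non-dominated front, padded with random designs.
        front = [d for d in fits
                 if not any(dominates(f, fits[d])
                            for f in fits.values() if f != fits[d])]
        pop = (front + random.sample(union, pop_size))[:pop_size]
    return sorted(set(pop))

print(moga()[:5])  # a slice of the evolved (near-Pareto) design points
```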

PDNNet: PDN-Aware GNN-CNN Heterogeneous Network for Dynamic IR Drop Prediction

no code implementations • 27 Mar 2024 • Yuxiang Zhao, Zhuomin Chai, Xun Jiang, Yibo Lin, Runsheng Wang, Ru Huang

Ours is the first work to apply a graph structure to deep-learning-based dynamic IR drop prediction.
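
Below is a minimal PyTorch sketch of the general GNN-CNN fusion idea: one round of mean message passing over the PDN graph, fused with a small CNN branch over a power map. All dimensions, the scatter scheme, and the module layout are assumptions for illustration, not PDNNet's architecture.

```python
import torch
import torch.nn as nn

class GNNCNNSketch(nn.Module):
    """Toy GNN-CNN heterogeneous model (illustrative layout, not PDNNet)."""
    def __init__(self, node_dim=8, hidden=32):
        super().__init__()
        self.gnn_lin = nn.Linear(node_dim, hidden)
        self.cnn = nn.Sequential(
            nn.Conv2d(1, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU())
        self.head = nn.Conv2d(2 * hidden, 1, 1)  # per-pixel IR drop estimate

    def forward(self, x, adj, power_map, node_to_pixel):
        # Mean-aggregate neighbor features, then transform: relu(D^-1 A X W).
        deg = adj.sum(-1, keepdim=True).clamp(min=1)
        h = torch.relu(self.gnn_lin(adj @ x / deg))
        # Scatter node embeddings onto the 2-D grid the CNN branch sees.
        H, W = power_map.shape[-2:]
        grid = torch.zeros(h.shape[1], H * W)
        grid[:, node_to_pixel] = h.t()
        fused = torch.cat([grid.view(1, -1, H, W),
                           self.cnn(power_map.view(1, 1, H, W))], dim=1)
        return self.head(fused)

# Toy usage: 5 PDN nodes mapped onto a 4x4 grid.
model = GNNCNNSketch()
out = model(torch.randn(5, 8), (torch.rand(5, 5) > 0.5).float(),
            torch.randn(4, 4), torch.tensor([0, 3, 5, 9, 15]))
print(out.shape)  # torch.Size([1, 1, 4, 4])
```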

ProPD: Dynamic Token Tree Pruning and Generation for LLM Parallel Decoding

no code implementations • 21 Feb 2024 • Shuzhang Zhong, Zebin Yang, Meng Li, Ruihao Gong, Runsheng Wang, Ru Huang

Additionally, it introduces a dynamic token tree generation algorithm to balance the computation and parallelism of the verification phase in real time, maximizing overall efficiency across different batch sizes, sequence lengths, and tasks.
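
To make the token-tree verification step concrete, here is a toy Python sketch that accepts the longest drafted path a target model agrees with. The tree encoding and greedy-acceptance rule are simplifying assumptions; ProPD's dynamic pruning and generation algorithms are more involved.

```python
def verify_token_tree(tree, target_next_token, prefix):
    """tree: {node_id: (parent_id or None, token)} of drafted candidates.
    target_next_token(tokens) -> token the target model would emit next.
    Returns the longest drafted path the target model accepts."""
    children = {}
    for nid, (pid, tok) in tree.items():
        children.setdefault(pid, []).append((nid, tok))

    best = []
    def walk(node_id, accepted):
        nonlocal best
        if len(accepted) > len(best):
            best = list(accepted)
        expected = target_next_token(prefix + accepted)
        for nid, tok in children.get(node_id, []):
            if tok == expected:  # only the matching branch survives
                walk(nid, accepted + [tok])
    walk(None, [])
    return best

# Toy target model that always continues an arithmetic sequence by +1.
tree = {0: (None, 4), 1: (0, 5), 2: (0, 7), 3: (1, 6)}
print(verify_token_tree(tree, lambda t: t[-1] + 1, prefix=[3]))  # [4, 5, 6]
```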

ASCEND: Accurate yet Efficient End-to-End Stochastic Computing Acceleration of Vision Transformer

no code implementations • 20 Feb 2024 • Tong Xie, Yixuan Hu, Renjie Wei, Meng Li, YuAn Wang, Runsheng Wang, Ru Huang

To overcome the compatibility challenges, ASCEND proposes a novel deterministic SC block for GELU and leverages an SC-friendly iterative approximation algorithm to design an accurate and efficient softmax circuit.
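
As background on the stochastic computing (SC) primitives involved, the sketch below shows unipolar SC encoding, where a bitwise AND of independent bitstreams implements multiplication. It illustrates the basic SC principle only, not ASCEND's deterministic GELU block or its iterative softmax circuit.

```python
import random

def to_bitstream(x, n=4096):
    """Unipolar SC encoding: each bit is 1 with probability x, for x in [0, 1]."""
    return [1 if random.random() < x else 0 for _ in range(n)]

def sc_multiply(xs, ys):
    """With independent unipolar streams, bitwise AND multiplies the values."""
    return [a & b for a, b in zip(xs, ys)]

def decode(bits):
    return sum(bits) / len(bits)

x, y = 0.6, 0.5
est = decode(sc_multiply(to_bitstream(x), to_bitstream(y)))
print(f"SC estimate of {x} * {y}: {est:.3f}")  # ~0.300 up to sampling noise
```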

HEQuant: Marrying Homomorphic Encryption and Quantization for Communication-Efficient Private Inference

no code implementations • 29 Jan 2024 • Tianshi Xu, Meng Li, Runsheng Wang

Compared with prior-art HE-based protocols, e.g., CrypTFlow2, Cheetah, and Iron, HEQuant achieves $3.5\sim 23.4\times$ communication reduction and $3.0\sim 9.3\times$ latency reduction.

Quantization

Memory-aware Scheduling for Complex Wired Networks with Iterative Graph Optimization

no code implementations • 26 Aug 2023 • Shuzhang Zhong, Meng Li, Yun Liang, Runsheng Wang, Ru Huang

Memory-aware network scheduling is becoming increasingly important for deep neural network (DNN) inference on resource-constrained devices.

Scheduling
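
For intuition on what memory-aware scheduling optimizes, here is a greedy Python baseline that orders a toy op DAG to limit peak live memory under an assumed one-output-per-op cost model; the paper's iterative graph-optimization algorithm is substantially more sophisticated.

```python
def schedule(graph, out_bytes):
    """graph: {op: [consumer ops]}; out_bytes: {op: output size in bytes}.
    Returns (execution order, peak memory of that order)."""
    preds = {op: [] for op in graph}
    for op, consumers in graph.items():
        for c in consumers:
            preds[c].append(op)

    unscheduled, live = set(graph), {}
    order, mem, peak = [], 0, 0
    while unscheduled:
        ready = [op for op in unscheduled
                 if all(p not in unscheduled for p in preds[op])]
        # Pick the ready op whose execution changes live memory the least.
        def delta(op):
            freed = sum(out_bytes[p] for p in preds[op] if live.get(p) == 1)
            return out_bytes[op] - freed
        op = min(ready, key=delta)
        order.append(op)
        unscheduled.discard(op)
        mem += out_bytes[op]
        peak = max(peak, mem)
        for p in preds[op]:          # release inputs with no remaining consumers
            live[p] -= 1
            if live[p] == 0:
                mem -= out_bytes[p]
        live[op] = len(graph[op])
        if live[op] == 0:            # outputs of sink ops are freed immediately
            mem -= out_bytes[op]
    return order, peak

# Diamond-shaped toy graph: a feeds b and c, which both feed d.
g = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(schedule(g, {"a": 4, "b": 1, "c": 8, "d": 2}))
```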

Falcon: Accelerating Homomorphically Encrypted Convolutions for Efficient Private Mobile Network Inference

no code implementations • 25 Aug 2023 • Tianshi Xu, Meng Li, Runsheng Wang, Ru Huang

Efficient networks, e.g., MobileNetV2 and EfficientNet, achieve state-of-the-art (SOTA) accuracy with lightweight computation.

HybridNet: Dual-Branch Fusion of Geometrical and Topological Views for VLSI Congestion Prediction

no code implementations • 7 May 2023 • Yuxiang Zhao, Zhuomin Chai, Yibo Lin, Runsheng Wang, Ru Huang

Accurate early congestion prediction can prevent unpleasant surprises at the routing stage, playing a crucial role in helping designers iterate faster in VLSI design cycles.

EBSR: Enhanced Binary Neural Network for Image Super-Resolution

no code implementations • 22 Mar 2023 • Renjie Wei, Shuwen Zhang, Zechun Liu, Meng Li, Yuchen Fan, Runsheng Wang, Ru Huang

While the performance of deep convolutional neural networks for image super-resolution (SR) has improved significantly, the rapid increase of memory and computation requirements hinders their deployment on resource-constrained devices.

Binarization • Image Super-Resolution +1
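
For context on the binarization being enhanced, below is a generic sign-binarization layer with a straight-through estimator (STE), the standard BNN building block. It illustrates binarization in general, not EBSR's specific enhanced design.

```python
import torch

class BinarizeSTE(torch.autograd.Function):
    """Sign binarization with a straight-through estimator: forward uses
    sign(x); backward passes gradients only where |x| <= 1."""
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        return grad_out * (x.abs() <= 1).float()

x = torch.randn(4, requires_grad=True)
y = BinarizeSTE.apply(x)  # binarized activations, still trainable via STE
y.sum().backward()
print(x.grad)  # 1 where |x| <= 1, else 0
```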

CircuitNet: An Open-Source Dataset for Machine Learning Applications in Electronic Design Automation (EDA)

no code implementations • 1 Aug 2022 • Zhuomin Chai, Yuxiang Zhao, Yibo Lin, Wei Liu, Runsheng Wang, Ru Huang

The electronic design automation (EDA) community has been actively exploring machine learning (ML) for very large-scale integrated computer-aided design (VLSI CAD).

BIG-bench Machine Learning

DaSGD: Squeezing SGD Parallelization Performance in Distributed Training Using Delayed Averaging

no code implementations • 31 May 2020 • Qinggang Zhou, Yawen Zhang, Pengcheng Li, Xiaoyong Liu, Jun Yang, Runsheng Wang, Ru Huang

The state-of-the-art deep learning algorithms rely on distributed training systems to tackle the increasing sizes of models and training data sets.
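
To illustrate the delayed-averaging idea in the title, here is a toy single-process simulation: workers take local SGD steps while the average of an older snapshot "arrives", then apply that stale average plus their local progress since the snapshot. The update rule and constants are illustrative assumptions, not DaSGD's exact specification.

```python
import numpy as np

rng = np.random.default_rng(0)
workers, dim, delay, lr = 4, 16, 2, 0.05
w = [rng.normal(size=dim) for _ in range(workers)]
history = []  # per-step parameter snapshots awaiting averaging

for step in range(200):
    grads = [2.0 * wi for wi in w]                 # gradient of f(w) = ||w||^2
    w = [wi - lr * g for wi, g in zip(w, grads)]   # local SGD step
    history.append([wi.copy() for wi in w])
    if len(history) > delay:
        stale = history.pop(0)                     # snapshot from `delay` steps ago
        avg = sum(stale) / workers
        # Delayed averaging: stale average + local updates made since then.
        w = [avg + (wi - si) for wi, si in zip(w, stale)]

print(max(np.linalg.norm(wi) for wi in w))  # workers converge toward 0 together
```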
