Search Results for author: Runsheng Wang

Found 15 papers, 1 paper with code

ChatEMG: Synthetic Data Generation to Control a Robotic Hand Orthosis for Stroke

no code implementations • 17 Jun 2024 • Jingxi Xu, Runsheng Wang, Siqi Shang, Ava Chen, Lauren Winterbottom, To-Liang Hsu, Wenxi Chen, Khondoker Ahmed, Pedro Leandro La Rotta, Xinyue Zhu, Dawn M. Nilsen, Joel Stein, Matei Ciocarlie

In this paper, we propose ChatEMG, an autoregressive generative model that can generate synthetic EMG signals conditioned on prompts (i.e., a given sequence of EMG signals).

Synthetic Data Generation

FastQuery: Communication-efficient Embedding Table Query for Private LLM Inference

no code implementations • 25 May 2024 • Chenqi Lin, Tianshi Xu, Zebin Yang, Runsheng Wang, Ru Huang, Meng Li

We observe the overhead mainly comes from the neglect of 1) the one-hot nature of user queries and 2) the robustness of the embedding table to low bit-width quantization noise.

Quantization

PrivCirNet: Efficient Private Inference via Block Circulant Transformation

no code implementations • 23 May 2024 • Tianshi Xu, Lemeng Wu, Runsheng Wang, Meng Li

Homomorphic encryption (HE)-based deep neural network (DNN) inference protects data and model privacy but suffers from significant computation overhead.

EasyACIM: An End-to-End Automated Analog CIM with Synthesizable Architecture and Agile Design Space Exploration

no code implementations • 12 Apr 2024 • Haoyi Zhang, Jiahao Song, Xiaohan Gao, Xiyuan Tang, Yibo Lin, Runsheng Wang, Ru Huang

Leveraging the multi-objective genetic algorithm (MOGA)-based design space explorer, EasyACIM can obtain high-quality ACIM solutions based on the proposed synthesizable architecture, targeting versatile application scenarios.

Edge-computing

PDNNet: PDN-Aware GNN-CNN Heterogeneous Network for Dynamic IR Drop Prediction

no code implementations • 27 Mar 2024 • Yuxiang Zhao, Zhuomin Chai, Xun Jiang, Yibo Lin, Runsheng Wang, Ru Huang

This is the first work to apply graph structure to deep-learning-based dynamic IR drop prediction.

ProPD: Dynamic Token Tree Pruning and Generation for LLM Parallel Decoding

no code implementations • 21 Feb 2024 • Shuzhang Zhong, Zebin Yang, Meng Li, Ruihao Gong, Runsheng Wang, Ru Huang

Additionally, it introduces a dynamic token tree generation algorithm that balances the computation and parallelism of the verification phase in real time and maximizes overall efficiency across different batch sizes, sequence lengths, and tasks.

ASCEND: Accurate yet Efficient End-to-End Stochastic Computing Acceleration of Vision Transformer

no code implementations • 20 Feb 2024 • Tong Xie, Yixuan Hu, Renjie Wei, Meng Li, YuAn Wang, Runsheng Wang, Ru Huang

To overcome the compatibility challenges, ASCEND proposes a novel deterministic SC block for GELU and leverages an SC-friendly iterative approximate algorithm to design an accurate and efficient softmax circuit.

HEQuant: Marrying Homomorphic Encryption and Quantization for Communication-Efficient Private Inference

no code implementations • 29 Jan 2024 • Tianshi Xu, Meng Li, Runsheng Wang

Compared with prior-art HE-based protocols, e.g., CrypTFlow2, Cheetah, and Iron, HEQuant achieves $3.5\sim 23.4\times$ communication reduction and $3.0\sim 9.3\times$ latency reduction.

Quantization

Memory-aware Scheduling for Complex Wired Networks with Iterative Graph Optimization

no code implementations • 26 Aug 2023 • Shuzhang Zhong, Meng Li, Yun Liang, Runsheng Wang, Ru Huang

Memory-aware network scheduling is becoming increasingly important for deep neural network (DNN) inference on resource-constrained devices.

Scheduling

Falcon: Accelerating Homomorphically Encrypted Convolutions for Efficient Private Mobile Network Inference

no code implementations • 25 Aug 2023 • Tianshi Xu, Meng Li, Runsheng Wang, Ru Huang

Efficient networks, e.g., MobileNetV2 and EfficientNet, achieve state-of-the-art (SOTA) accuracy with lightweight computation.

HybridNet: Dual-Branch Fusion of Geometrical and Topological Views for VLSI Congestion Prediction

no code implementations • 7 May 2023 • Yuxiang Zhao, Zhuomin Chai, Yibo Lin, Runsheng Wang, Ru Huang

Accurate early congestion prediction can prevent unpleasant surprises at the routing stage, playing a crucial role in helping designers iterate faster in VLSI design cycles.

EBSR: Enhanced Binary Neural Network for Image Super-Resolution

no code implementations • 22 Mar 2023 • Renjie Wei, Shuwen Zhang, Zechun Liu, Meng Li, Yuchen Fan, Runsheng Wang, Ru Huang

While the performance of deep convolutional neural networks for image super-resolution (SR) has improved significantly, the rapid increase of memory and computation requirements hinders their deployment on resource-constrained devices.

Binarization, Image Super-Resolution

CircuitNet: An Open-Source Dataset for Machine Learning Applications in Electronic Design Automation (EDA)

no code implementations • 1 Aug 2022 • Zhuomin Chai, Yuxiang Zhao, Yibo Lin, Wei Liu, Runsheng Wang, Ru Huang

The electronic design automation (EDA) community has been actively exploring machine learning (ML) for very large-scale integrated computer-aided design (VLSI CAD).

BIG-bench Machine Learning

DaSGD: Squeezing SGD Parallelization Performance in Distributed Training Using Delayed Averaging

no code implementations • 31 May 2020 • Qinggang Zhou, Yawen Zhang, Pengcheng Li, Xiaoyong Liu, Jun Yang, Runsheng Wang, Ru Huang

The state-of-the-art deep learning algorithms rely on distributed training systems to tackle the increasing sizes of models and training data sets.
