A Transferable Approach for Partitioning Machine Learning Models on Multi-Chip-Modules

no code implementations • 7 Dec 2021 • Xinfeng Xie, Prakash Prabhu, Ulysse Beaugnon, Phitchaya Mangpo Phothilimthana, Sudip Roy, Azalia Mirhoseini, Eugene Brevdo, James Laudon, Yanqi Zhou

Partitioning ML graphs for MCMs is particularly hard as the search space grows exponentially with the number of chiplets available and the number of nodes in the neural network.

BIG-bench Machine Learning, Reinforcement Learning (RL)

MPU: Towards Bandwidth-abundant SIMT Processor via Near-bank Computing

no code implementations • 11 Mar 2021 • Xinfeng Xie, Peng Gu, Yufei Ding, Dimin Niu, Hongzhong Zheng, Yuan Xie

For general purpose scenarios, lightweight hardware designs for diverse data paths, architectural supports for the SIMT programming model, and end-to-end software optimizations remain challenging.

Hardware Architecture

Rubik: A Hierarchical Architecture for Efficient Graph Learning

no code implementations • 26 Sep 2020 • Xiaobing Chen, Yuke Wang, Xinfeng Xie, Xing Hu, Abanti Basak, Ling Liang, Mingyu Yan, Lei Deng, Yufei Ding, Zidong Du, Yunji Chen, Yuan Xie

Graph convolutional networks (GCNs) have emerged as a promising direction for learning inductive representations of graph data, which is commonly used in widespread applications such as e-commerce, social networks, and knowledge graphs.

Hardware Architecture

QGAN: Quantize Generative Adversarial Networks to Extreme low-bits

no code implementations • 25 Sep 2019 • Peiqi Wang, Yu Ji, Xinfeng Xie, Yongqiang Lyu, Dongsheng Wang, Yuan Xie

Despite the success of model reduction for convolutional neural networks (CNNs), neural network quantization methods have not yet been studied on GANs, which face two main issues: the effectiveness of quantization algorithms and the instability of training GAN models.


Neural Network Model Extraction Attacks in Edge Devices by Hearing Architectural Hints

no code implementations • 10 Mar 2019 • Xing Hu, Ling Liang, Lei Deng, Shuangchen Li, Xinfeng Xie, Yu Ji, Yufei Ding, Chang Liu, Timothy Sherwood, Yuan Xie

As neural networks continue their reach into nearly every aspect of software operations, the details of those networks become an increasingly sensitive subject.

Cryptography and Security, Hardware Architecture

FPSA: A Full System Stack Solution for Reconfigurable ReRAM-based NN Accelerator Architecture

no code implementations • 28 Jan 2019 • Yu Ji, Youyang Zhang, Xinfeng Xie, Shuangchen Li, Peiqi Wang, Xing Hu, Youhui Zhang, Yuan Xie

In this paper, we propose a full system stack solution composed of a reconfigurable architecture design, the Field Programmable Synapse Array (FPSA), and its software system, which includes a neural synthesizer, a temporal-to-spatial mapper, and placement & routing.

QGAN: Quantized Generative Adversarial Networks

no code implementations • 24 Jan 2019 • Peiqi Wang, Dongsheng Wang, Yu Ji, Xinfeng Xie, Haoxuan Song, XuXin Liu, Yongqiang Lyu, Yuan Xie

The intensive computation and memory requirements of generative adversarial neural networks (GANs) hinder their real-world deployment on edge devices such as smartphones.


HitNet: Hybrid Ternary Recurrent Neural Network

no code implementations • NeurIPS 2018 • Peiqi Wang, Xinfeng Xie, Lei Deng, Guoqi Li, Dongsheng Wang, Yuan Xie

For example, we improve the perplexity per word (PPW) of a ternary LSTM on the Penn Tree Bank (PTB) corpus from 126 (the state-of-the-art result to the best of our knowledge) to 110.3, against a full-precision model at 97.2, and of a ternary GRU from 142 to 113.5, against a full-precision model at 102.7.


