Search Results for author: Yanzhi Wang

Found 159 papers, 36 papers with code

Efficient Pruning of Large Language Model with Adaptive Estimation Fusion

no code implementations16 Mar 2024 Jun Liu, Chao Wu, Changdi Yang, Hao Tang, Haoye Dong, Zhenglun Kong, Geng Yuan, Wei Niu, Dong Huang, Yanzhi Wang

Large language models (LLMs) have become crucial for many generative downstream tasks, making their efficient deployment on resource-constrained devices an inevitable trend and a significant challenge.

DiffClass: Diffusion-Based Class Incremental Learning

no code implementations8 Mar 2024 Zichong Meng, Jie Zhang, Changdi Yang, Zheng Zhan, Pu Zhao, Yanzhi Wang

On top of that, Exemplar-free Class Incremental Learning is even more challenging due to forbidden access to previous task data.

Class Incremental Learning Domain Adaptation +2

InstructGIE: Towards Generalizable Image Editing

no code implementations8 Mar 2024 Zichong Meng, Changdi Yang, Jun Liu, Hao Tang, Pu Zhao, Yanzhi Wang

In response to this challenge, our study introduces a novel image editing framework with enhanced generalization robustness by boosting in-context learning capability and unifying language instruction.

Denoising In-Context Learning

Dynamic Gaussian Graph Operator: Learning parametric partial differential equations in arbitrary discrete mechanics problems

no code implementations5 Mar 2024 Chu Wang, Jinhong Wu, Yanzhi Wang, Zhijian Zha, Qi Zhou

Metric vectors are regarded as lying on a latent uniform domain, where spatial and spectral transformations impose highly regular constraints on the solution space.

Operator learning

EdgeQAT: Entropy and Distribution Guided Quantization-Aware Training for the Acceleration of Lightweight LLMs on the Edge

1 code implementation16 Feb 2024 Xuan Shen, Zhenglun Kong, Changdi Yang, Zhaoyang Han, Lei Lu, Peiyan Dong, Cheng Lyu, Chih-hsiang Li, Xuehang Guo, Zhihao Shu, Wei Niu, Miriam Leeser, Pu Zhao, Yanzhi Wang

In this paper, we propose EdgeQAT, the Entropy and Distribution Guided QAT for the optimization of lightweight LLMs to achieve inference acceleration on Edge devices.

Quantization

EdgeOL: Efficient in-situ Online Learning on Edge Devices

no code implementations30 Jan 2024 Sheng Li, Geng Yuan, Yawen Wu, Yue Dai, Chao Wu, Alex K. Jones, Jingtong Hu, Yanzhi Wang, Xulong Tang

Emerging applications, such as robot-assisted eldercare and object recognition, generally employ deep neural networks (DNNs) and naturally require: i) handling streaming-in inference requests and ii) adapting to possible deployment scenario changes.

Object Recognition

SmartFRZ: An Efficient Training Framework using Attention-Based Layer Freezing

no code implementations30 Jan 2024 Sheng Li, Geng Yuan, Yue Dai, Youtao Zhang, Yanzhi Wang, Xulong Tang

Therefore, there is no generic and smart layer freezing method that can automatically perform "in-situation" layer freezing for different networks during training.
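
As a rough illustration of the freezing mechanism itself (not SmartFRZ's attention-based predictor), freezing a layer amounts to disabling its gradients once it stabilizes; a minimal PyTorch sketch, with the relative-change criterion below chosen as a stand-in:

    import torch

    def freeze_stable_layers(model, prev_weights, tol=1e-3):
        """Freeze parameters whose weights changed little since the last check.

        prev_weights: dict mapping parameter name -> tensor snapshot. The
        relative-change test is a placeholder; SmartFRZ instead decides
        what to freeze with an attention-based predictor.
        """
        for name, param in model.named_parameters():
            if not param.requires_grad:
                continue  # already frozen
            if name in prev_weights:
                change = (param.detach() - prev_weights[name]).norm()
                if change / (prev_weights[name].norm() + 1e-12) < tol:
                    param.requires_grad = False  # stop computing its gradients
            prev_weights[name] = param.detach().clone()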

E²GAN: Efficient Training of Efficient GANs for Image-to-Image Translation

no code implementations11 Jan 2024 Yifan Gong, Zheng Zhan, Qing Jin, Yanyu Li, Yerlan Idelbayev, Xian Liu, Andrey Zharkov, Kfir Aberman, Sergey Tulyakov, Yanzhi Wang, Jian Ren

One highly promising direction for enabling flexible real-time on-device image editing is utilizing data distillation by leveraging large-scale text-to-image diffusion models, such as Stable Diffusion, to generate paired datasets used for training generative adversarial networks (GANs).

Image-to-Image Translation

Agile-Quant: Activation-Guided Quantization for Faster Inference of LLMs on the Edge

no code implementations9 Dec 2023 Xuan Shen, Peiyan Dong, Lei Lu, Zhenglun Kong, Zhengang Li, Ming Lin, Chao Wu, Yanzhi Wang

Recent works show that 8-bit or lower weight quantization is feasible with minimal impact on end-to-end task performance, while the activation is still not quantized.
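
The weight-only setting described here can be illustrated with a generic symmetric per-channel int8 weight quantizer; this is a sketch of the common recipe, not Agile-Quant's activation-guided method:

    import numpy as np

    def quantize_weights_int8(w):
        """Symmetric per-output-channel int8 quantization of a weight matrix.

        Activations stay in floating point, matching the weight-only
        setting. Returns int8 weights plus one scale per row.
        """
        scale = np.maximum(np.abs(w).max(axis=1, keepdims=True) / 127.0, 1e-12)
        q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
        return q, scale

    w = np.random.randn(4, 16).astype(np.float32)
    q, s = quantize_weights_int8(w)
    print(np.abs(w - q.astype(np.float32) * s).max())  # small quantization error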

Language Modelling Quantization

SupeRBNN: Randomized Binary Neural Network Using Adiabatic Superconductor Josephson Devices

no code implementations21 Sep 2023 Zhengang Li, Geng Yuan, Tomoharu Yamauchi, Zabihi Masoud, Yanyue Xie, Peiyan Dong, Xulong Tang, Nobuyuki Yoshikawa, Devesh Tiwari, Yanzhi Wang, Olivia Chen

Specifically, we investigate the randomized behavior of the AQFP devices and analyze the impact of crossbar size on current attenuation, subsequently formulating the current amplitude into the values suitable for use in BNN computation.

Can Adversarial Examples Be Parsed to Reveal Victim Model Information?

1 code implementation13 Mar 2023 Yuguang Yao, Jiancheng Liu, Yifan Gong, Xiaoming Liu, Yanzhi Wang, Xue Lin, Sijia Liu

We call this 'model parsing of adversarial attacks' - a task to uncover 'arcana' in terms of the concealed VM information in attacks.

Adversarial Attack

DeepMAD: Mathematical Architecture Design for Deep Convolutional Neural Network

1 code implementation CVPR 2023 Xuan Shen, Yaohua Wang, Ming Lin, Yilun Huang, Hao Tang, Xiuyu Sun, Yanzhi Wang

To this end, a novel framework termed Mathematical Architecture Design for Deep CNN (DeepMAD) is proposed to design high-performance CNN models in a principled way.

Image Classification Neural Architecture Search

Pruning Parameterization With Bi-Level Optimization for Efficient Semantic Segmentation on the Edge

no code implementations CVPR 2023 Changdi Yang, Pu Zhao, Yanyu Li, Wei Niu, Jiexiong Guan, Hao Tang, Minghai Qin, Bin Ren, Xue Lin, Yanzhi Wang

With the ever-increasing popularity of edge devices, it is necessary to implement real-time segmentation on the edge for autonomous driving and many other applications.

Autonomous Driving Segmentation +1

Rethinking Vision Transformers for MobileNet Size and Speed

6 code implementations ICCV 2023 Yanyu Li, Ju Hu, Yang Wen, Georgios Evangelidis, Kamyar Salahi, Yanzhi Wang, Sergey Tulyakov, Jian Ren

With the success of Vision Transformers (ViTs) in computer vision tasks, recent works try to optimize the performance and complexity of ViTs to enable efficient deployment on mobile devices.

All-in-One: A Highly Representative DNN Pruning Framework for Edge Devices with Dynamic Power Management

no code implementations9 Dec 2022 Yifan Gong, Zheng Zhan, Pu Zhao, Yushu Wu, Chao Wu, Caiwen Ding, Weiwen Jiang, Minghai Qin, Yanzhi Wang

By re-configuring the model to the corresponding pruning ratio for a specific execution frequency (and voltage), we are able to achieve stable inference speed, i.e., keeping the difference in speed performance under various execution frequencies as small as possible.

Management

Self-Ensemble Protection: Training Checkpoints Are Good Data Protectors

1 code implementation22 Nov 2022 Sizhe Chen, Geng Yuan, Xinwen Cheng, Yifan Gong, Minghai Qin, Yanzhi Wang, Xiaolin Huang

In this paper, we uncover them by model checkpoints' gradients, forming the proposed self-ensemble protection (SEP), which is very effective because (1) learning on examples ignored during normal training tends to yield DNNs ignoring normal examples; (2) checkpoints' cross-model gradients are close to orthogonal, meaning that they are as diverse as DNNs with different architectures.
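
A minimal sketch of the cross-checkpoint gradient idea, assuming a list of state_dicts saved during normal training (the full SEP recipe adds more machinery on top):

    import torch

    def self_ensemble_grad(checkpoints, model, x, y, loss_fn):
        """Average input gradients across training checkpoints.

        Since checkpoints' cross-model gradients are close to orthogonal
        (per the paper), averaging them behaves like an ensemble of
        architecturally diverse DNNs.
        """
        grads = []
        for state in checkpoints:
            model.load_state_dict(state)
            x_adv = x.clone().detach().requires_grad_(True)
            loss_fn(model(x_adv), y).backward()
            grads.append(x_adv.grad.detach())
        return torch.stack(grads).mean(dim=0)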

You Need Multiple Exiting: Dynamic Early Exiting for Accelerating Unified Vision Language Model

1 code implementation CVPR 2023 Shengkun Tang, Yaqing Wang, Zhenglun Kong, Tianchi Zhang, Yao Li, Caiwen Ding, Yanzhi Wang, Yi Liang, Dongkuan Xu

To handle this challenge, we propose a novel early exiting strategy for unified visual language models, named MuE, which allows dynamically skipping layers in the encoder and decoder simultaneously, based on input layer-wise similarities, with multiple early exits.
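
A sketch of the similarity-driven exit rule on an encoder stack (the threshold and the per-layer exit heads are simplified away here; the actual MuE policy follows the paper):

    import torch
    import torch.nn.functional as F

    def forward_with_early_exit(layers, hidden, threshold=0.99):
        """Run transformer layers, exiting once representations saturate.

        If the cosine similarity between consecutive layers' hidden states
        exceeds `threshold`, later layers would add little, so we stop.
        """
        for depth, layer in enumerate(layers):
            new_hidden = layer(hidden)
            sim = F.cosine_similarity(
                new_hidden.flatten(1), hidden.flatten(1), dim=1).mean()
            hidden = new_hidden
            if sim > threshold:
                return hidden, depth + 1  # early exit after depth+1 layers
        return hidden, len(layers)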

Language Modelling

Peeling the Onion: Hierarchical Reduction of Data Redundancy for Efficient Vision Transformer Training

1 code implementation19 Nov 2022 Zhenglun Kong, Haoyu Ma, Geng Yuan, Mengshu Sun, Yanyue Xie, Peiyan Dong, Xin Meng, Xuan Shen, Hao Tang, Minghai Qin, Tianlong Chen, Xiaolong Ma, Xiaohui Xie, Zhangyang Wang, Yanzhi Wang

Vision transformers (ViTs) have recently obtained success in many applications, but their intensive computation and heavy memory usage at both training and inference time limit their generalization.

HeatViT: Hardware-Efficient Adaptive Token Pruning for Vision Transformers

no code implementations15 Nov 2022 Peiyan Dong, Mengshu Sun, Alec Lu, Yanyue Xie, Kenneth Liu, Zhenglun Kong, Xin Meng, Zhengang Li, Xue Lin, Zhenman Fang, Yanzhi Wang

While vision transformers (ViTs) have continuously achieved new milestones in the field of computer vision, their sophisticated network architectures with high computation and memory costs have impeded their deployment on resource-limited edge devices.

Quantization

Data Level Lottery Ticket Hypothesis for Vision Transformers

1 code implementation2 Nov 2022 Xuan Shen, Zhenglun Kong, Minghai Qin, Peiyan Dong, Geng Yuan, Xin Meng, Hao Tang, Xiaolong Ma, Yanzhi Wang

That is, there exists a subset of input image patches such that a ViT can be trained from scratch by using only this subset of patches and achieve similar accuracy to the ViTs trained by using all image patches.
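
A toy illustration of training on a subset of patches, with patch variance as a stand-in score (the paper identifies the winning subset with a ticket-finding procedure, not this heuristic):

    import torch

    def select_patches(images, patch=16, keep_ratio=0.5):
        """Split images into patches and keep the highest-scoring ones.

        images: (B, C, H, W). Returns all patches plus the indices of the
        kept subset, which would then be fed to the ViT during training.
        """
        B, C, H, W = images.shape
        patches = images.unfold(2, patch, patch).unfold(3, patch, patch)
        patches = patches.contiguous().view(B, C, -1, patch, patch)
        flat = patches.reshape(B, C, -1, patch * patch)
        scores = flat.var(dim=3).mean(dim=1)     # (B, num_patches) placeholder score
        k = int(scores.shape[1] * keep_ratio)
        idx = scores.topk(k, dim=1).indices      # winning subset of patch indices
        return patches, idx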

Analogical Similarity Informativeness

Advancing Model Pruning via Bi-level Optimization

1 code implementation8 Oct 2022 Yihua Zhang, Yuguang Yao, Parikshit Ram, Pu Zhao, Tianlong Chen, Mingyi Hong, Yanzhi Wang, Sijia Liu

To reduce the computation overhead, various efficient 'one-shot' pruning methods have been developed, but these schemes are usually unable to find winning tickets as good as IMP.

Layer Freezing & Data Sieving: Missing Pieces of a Generic Framework for Sparse Training

1 code implementation22 Sep 2022 Geng Yuan, Yanyu Li, Sheng Li, Zhenglun Kong, Sergey Tulyakov, Xulong Tang, Yanzhi Wang, Jian Ren

Therefore, we analyze the feasibility and potential of using the layer freezing technique in sparse training and find that it can save considerable training costs.

SparCL: Sparse Continual Learning on the Edge

1 code implementation20 Sep 2022 Zifeng Wang, Zheng Zhan, Yifan Gong, Geng Yuan, Wei Niu, Tong Jian, Bin Ren, Stratis Ioannidis, Yanzhi Wang, Jennifer Dy

SparCL achieves both training acceleration and accuracy preservation through the synergy of three aspects: weight sparsity, data efficiency, and gradient sparsity.

Continual Learning

PIM-QAT: Neural Network Quantization for Processing-In-Memory (PIM) Systems

no code implementations18 Sep 2022 Qing Jin, Zhiyu Chen, Jian Ren, Yanyu Li, Yanzhi Wang, Kaiyuan Yang

In this paper, we propose a method for training quantized networks to incorporate PIM quantization, which is ubiquitous to all PIM systems.

Quantization

Understanding Time Variations of DNN Inference in Autonomous Driving

no code implementations12 Sep 2022 Liangkai Liu, Yanzhi Wang, Weisong Shi

Understanding the time variations of the DNN inference becomes a fundamental challenge in real-time scheduling for autonomous driving.

Autonomous Driving Scheduling

Survey: Exploiting Data Redundancy for Optimization of Deep Learning

no code implementations29 Aug 2022 Jou-An Chen, Wei Niu, Bin Ren, Yanzhi Wang, Xipeng Shen

It surveys hundreds of recent papers on the topic, introduces a novel taxonomy to put the various techniques into a single categorization framework, offers a comprehensive description of the main methods used for exploiting data redundancy in improving multiple kinds of DNNs on data, and points out a set of research opportunities for future exploration.

Compiler-Aware Neural Architecture Search for On-Mobile Real-time Super-Resolution

1 code implementation25 Jul 2022 Yushu Wu, Yifan Gong, Pu Zhao, Yanyu Li, Zheng Zhan, Wei Niu, Hao Tang, Minghai Qin, Bin Ren, Yanzhi Wang

Instead of measuring the speed on mobile devices at each iteration during the search process, a speed model incorporated with compiler optimizations is leveraged to predict the inference latency of the SR block with various width configurations for faster convergence.
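
The speed-model idea can be sketched as a tiny predictor fitted once from a handful of on-device measurements and then queried during the search; the numbers and the linear form below are illustrative assumptions, not the paper's compiler-aware model:

    import numpy as np

    # Hypothetical measurements: width configurations -> measured latency (ms).
    widths = np.array([[16, 32, 64], [32, 32, 64], [32, 64, 64], [64, 64, 128]])
    latency_ms = np.array([3.1, 4.0, 5.2, 8.9])

    # Fit a linear speed model once; query it at every search iteration
    # instead of measuring on the phone each time.
    X = np.hstack([widths, np.ones((len(widths), 1))])
    coef, *_ = np.linalg.lstsq(X, latency_ms, rcond=None)

    def predict_latency(width_config):
        return np.append(width_config, 1.0) @ coef

    print(predict_latency([32, 48, 64]))  # fast estimate, no device in the loop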

Neural Architecture Search SSIM +1

Continual Few-Shot Learning with Adversarial Class Storage

no code implementations10 Jul 2022 Kun Wu, Chengxiang Yin, Jian Tang, Zhiyuan Xu, Yanzhi Wang, Dejun Yang

In this paper, we define a new problem called continual few-shot learning, in which tasks arrive sequentially and each task is associated with a few training samples.

continual few-shot learning Few-Shot Learning +1

Quantum Neural Network Compression

no code implementations4 Jul 2022 Zhirui Hu, Peiyan Dong, Zhepeng Wang, Youzuo Lin, Yanzhi Wang, Weiwen Jiang

Model compression, such as pruning and quantization, has been widely applied to optimize neural networks on resource-limited classical devices.

Neural Network Compression Quantization

CoCoPIE XGen: A Full-Stack AI-Oriented Optimizing Framework

no code implementations21 Jun 2022 Xiaofeng Li, Bin Ren, Xipeng Shen, Yanzhi Wang

There is a growing demand for shifting the delivery of AI capability from data centers on the cloud to edge or end devices, exemplified by the fast emerging real-time AI-based apps running on smartphones, AR/VR devices, autonomous vehicles, and various IoT devices.

Autonomous Vehicles

Understanding EFL Student Idea Generation Strategies for Creative Writing with NLG Tools

no code implementations4 Jun 2022 David James Woo, Yanzhi Wang, Hengky Susanto, Kai Guo

This study explores strategies adopted by EFL students when searching for ideas using NLG tools, evaluating ideas generated by NLG tools, and selecting NLG tools for idea generation.

Text Generation

Real-Time Portrait Stylization on the Edge

no code implementations2 Jun 2022 Yanyu Li, Xuan Shen, Geng Yuan, Jiexiong Guan, Wei Niu, Hao Tang, Bin Ren, Yanzhi Wang

In this work we demonstrate real-time portrait stylization, specifically, translating self-portrait into cartoon or anime style on mobile devices.

EfficientFormer: Vision Transformers at MobileNet Speed

9 code implementations2 Jun 2022 Yanyu Li, Geng Yuan, Yang Wen, Ju Hu, Georgios Evangelidis, Sergey Tulyakov, Yanzhi Wang, Jian Ren

Our work proves that properly designed transformers can reach extremely low latency on mobile devices while maintaining high performance.

Pruning-as-Search: Efficient Neural Architecture Search via Channel Pruning and Structural Reparameterization

1 code implementation2 Jun 2022 Yanyu Li, Pu Zhao, Geng Yuan, Xue Lin, Yanzhi Wang, Xin Chen

By combining the structural reparameterization and PaS, we successfully searched out a new family of VGG-like and lightweight networks, which enable the flexibility of arbitrary width with respect to each layer instead of each stage.

Instance Segmentation Network Pruning +2

Neural Network-based OFDM Receiver for Resource Constrained IoT Devices

no code implementations12 May 2022 Nasim Soltani, Hai Cheng, Mauro Belgiovine, Yanyu Li, Haoqing Li, Bahar Azari, Salvatore D'Oro, Tales Imbiriba, Tommaso Melodia, Pau Closas, Yanzhi Wang, Deniz Erdogmus, Kaushik Chowdhury

Here, ML blocks replace the individual processing blocks of an OFDM receiver, and we specifically describe this swapping for the legacy channel estimation, symbol demapping, and decoding blocks with Neural Networks (NNs).

Quantization

Deep neural network goes lighter: A case study of deep compression techniques on automatic RF modulation recognition for Beyond 5G networks

no code implementations9 Apr 2022 Anu Jagannath, Jithin Jagannath, Yanzhi Wang, Tommaso Melodia

Automatic RF modulation recognition is a primary signal intelligence (SIGINT) technique that serves as a physical layer authentication enabler and automated signal processing scheme for the beyond 5G and military networks.

F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization

1 code implementation ICLR 2022 Qing Jin, Jian Ren, Richard Zhuang, Sumant Hanumante, Zhengang Li, Zhiyu Chen, Yanzhi Wang, Kaiyuan Yang, Sergey Tulyakov

Our approach achieves comparable and better performance, when compared not only to existing quantization techniques with INT32 multiplication or floating-point arithmetic, but also to the full-precision counterparts, achieving state-of-the-art performance.
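
The fixed-point arithmetic itself can be sketched in a few lines: with a fractional length per tensor, every multiply becomes a plain integer product followed by a shift (a generic fixed-point illustration; F8Net's fractional-length selection follows the paper):

    import numpy as np

    def to_fixed(x, frac_bits):
        """Quantize to 8-bit fixed point with `frac_bits` fractional bits."""
        return np.clip(np.round(x * (1 << frac_bits)), -128, 127).astype(np.int32)

    a, b = 0.75, -1.25
    fa = fb = 5                                # fractional lengths (assumed here)
    prod = to_fixed(a, fa) * to_fixed(b, fb)   # integer-only multiplication
    result = prod / float(1 << (fa + fb))      # rescale: just a right shift
    print(result)                              # equals a * b = -0.9375 here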

Quantization

Coarsening the Granularity: Towards Structurally Sparse Lottery Tickets

1 code implementation9 Feb 2022 Tianlong Chen, Xuxi Chen, Xiaolong Ma, Yanzhi Wang, Zhangyang Wang

The lottery ticket hypothesis (LTH) has shown that dense models contain highly sparse subnetworks (i.e., winning tickets) that can be trained in isolation to match full accuracy.

AirNN: Neural Networks with Over-the-Air Convolution via Reconfigurable Intelligent Surfaces

no code implementations7 Feb 2022 Sara Garcia Sanchez, Guillem Reus Muns, Carlos Bocanegra, Yanyu Li, Ufuk Muncuk, Yousof Naderi, Yanzhi Wang, Stratis Ioannidis, Kaushik R. Chowdhury

In this paper, we design and implement the first-of-its-kind over-the-air convolution and demonstrate it for inference tasks in a convolutional neural network (CNN).

VAQF: Fully Automatic Software-Hardware Co-Design Framework for Low-Bit Vision Transformer

no code implementations17 Jan 2022 Mengshu Sun, Haoyu Ma, Guoliang Kang, Yifan Jiang, Tianlong Chen, Xiaolong Ma, Zhangyang Wang, Yanzhi Wang

To the best of our knowledge, this is the first time quantization has been incorporated into ViT acceleration on FPGAs with the help of a fully automatic framework to guide the quantization strategy on the software side and the accelerator implementations on the hardware side given the target frame rate.

Quantization

SPViT: Enabling Faster Vision Transformers via Soft Token Pruning

1 code implementation27 Dec 2021 Zhenglun Kong, Peiyan Dong, Xiaolong Ma, Xin Meng, Mengshu Sun, Wei Niu, Xuan Shen, Geng Yuan, Bin Ren, Minghai Qin, Hao Tang, Yanzhi Wang

Moreover, our framework can guarantee the identified model to meet resource specifications of mobile devices and FPGA, and even achieve the real-time execution of DeiT-T on mobile platforms.

Image Classification Model Compression

Compact Multi-level Sparse Neural Networks with Input Independent Dynamic Rerouting

no code implementations21 Dec 2021 Minghai Qin, Tianyun Zhang, Fei Sun, Yen-Kuang Chen, Makan Fardad, Yanzhi Wang, Yuan Xie

Deep neural networks (DNNs) have been shown to provide superb performance in many real-life applications, but their large computation cost and storage requirements have prevented them from being deployed to many edge and internet-of-things (IoT) devices.

Graph Attention

Automatic Mapping of the Best-Suited DNN Pruning Schemes for Real-Time Mobile Acceleration

no code implementations22 Nov 2021 Yifan Gong, Geng Yuan, Zheng Zhan, Wei Niu, Zhengang Li, Pu Zhao, Yuxuan Cai, Sijia Liu, Bin Ren, Xue Lin, Xulong Tang, Yanzhi Wang

Weight pruning is an effective model compression technique to tackle the challenges of achieving real-time deep neural network (DNN) inference on mobile devices.

Model Compression

RMSMP: A Novel Deep Neural Network Quantization Framework with Row-wise Mixed Schemes and Multiple Precisions

no code implementations ICCV 2021 Sung-En Chang, Yanyu Li, Mengshu Sun, Weiwen Jiang, Sijia Liu, Yanzhi Wang, Xue Lin

Specifically, this is the first effort to assign mixed quantization schemes and multiple precisions within layers -- among rows of the DNN weight matrix, for simplified operations in hardware inference, while preserving accuracy.
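
A sketch of the row-wise assignment, using each row's dynamic range as a hypothetical sensitivity proxy (the paper's assignment rule is more involved):

    import numpy as np

    def rowwise_mixed_quant(w, high_bits=8, low_bits=4, frac_high=0.25):
        """Quantize each row of a weight matrix at its own precision.

        Rows with the largest dynamic range (top `frac_high` fraction)
        get `high_bits`; the rest get `low_bits`. Returns dequantized
        weights so the error can be inspected.
        """
        n_high = max(1, int(len(w) * frac_high))
        high_rows = set(np.argsort(-np.ptp(w, axis=1))[:n_high])
        out = np.empty_like(w)
        for r, row in enumerate(w):
            qmax = 2 ** ((high_bits if r in high_rows else low_bits) - 1) - 1
            scale = np.abs(row).max() / qmax + 1e-12
            out[r] = np.clip(np.round(row / scale), -qmax, qmax) * scale
        return out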

Image Classification Quantization

ScaleCert: Scalable Certified Defense against Adversarial Patches with Sparse Superficial Layers

no code implementations NeurIPS 2021 Husheng Han, Kaidi Xu, Xing Hu, Xiaobing Chen, Ling Liang, Zidong Du, Qi Guo, Yanzhi Wang, Yunji Chen

Our experimental results show that the certified accuracy is increased from 36.3% (the state-of-the-art certified detection) to 60.4% on the ImageNet dataset, largely pushing the certified defenses for practical use.

MEST: Accurate and Fast Memory-Economic Sparse Training Framework on the Edge

1 code implementation NeurIPS 2021 Geng Yuan, Xiaolong Ma, Wei Niu, Zhengang Li, Zhenglun Kong, Ning Liu, Yifan Gong, Zheng Zhan, Chaoyang He, Qing Jin, Siyue Wang, Minghai Qin, Bin Ren, Yanzhi Wang, Sijia Liu, Xue Lin

Systematic evaluations of accuracy, training speed, and memory footprint are conducted, where the proposed MEST framework consistently outperforms representative SOTA works.

HFSP: A Hardware-friendly Soft Pruning Framework for Vision Transformers

no code implementations29 Sep 2021 Zhenglun Kong, Peiyan Dong, Xiaolong Ma, Xin Meng, Mengshu Sun, Wei Niu, Bin Ren, Minghai Qin, Hao Tang, Yanzhi Wang

Recently, the Vision Transformer (ViT) has continuously established new milestones in the computer vision field, while its high computation and memory cost make its deployment in industrial production difficult.

Image Classification Model Compression

Lottery Tickets can have Structural Sparsity

no code implementations29 Sep 2021 Tianlong Chen, Xuxi Chen, Xiaolong Ma, Yanzhi Wang, Zhangyang Wang

The lottery ticket hypothesis (LTH) has shown that dense models contain highly sparse subnetworks (i.e., winning tickets) that can be trained in isolation to match full accuracy.

DNNFusion: Accelerating Deep Neural Networks Execution with Advanced Operator Fusion

no code implementations30 Aug 2021 Wei Niu, Jiexiong Guan, Yanzhi Wang, Gagan Agrawal, Bin Ren

Deep Neural Networks (DNNs) have emerged as the core enabler of many major applications on mobile devices.

Code Generation

GRIM: A General, Real-Time Deep Learning Inference Framework for Mobile Devices based on Fine-Grained Structured Weight Sparsity

no code implementations25 Aug 2021 Wei Niu, Zhengang Li, Xiaolong Ma, Peiyan Dong, Gang Zhou, Xuehai Qian, Xue Lin, Yanzhi Wang, Bin Ren

It necessitates sparse model inference via weight pruning, i.e., DNN weight sparsity, and it is desirable to design a new DNN weight sparsity scheme that can facilitate real-time inference on mobile devices while preserving a high sparse model accuracy.

Code Generation Compiler Optimization

Achieving on-Mobile Real-Time Super-Resolution with Neural Architecture and Pruning Search

no code implementations ICCV 2021 Zheng Zhan, Yifan Gong, Pu Zhao, Geng Yuan, Wei Niu, Yushu Wu, Tianyun Zhang, Malith Jayaweera, David Kaeli, Bin Ren, Xue Lin, Yanzhi Wang

Though recent years have witnessed remarkable progress in single image super-resolution (SISR) tasks with the prosperous development of deep neural networks (DNNs), the deep learning methods are confronted with the computation and memory consumption issues in practice, especially for resource-limited platforms such as mobile devices.

Image Super-Resolution Neural Architecture Search +1

CAP-RAM: A Charge-Domain In-Memory Computing 6T-SRAM for Accurate and Precision-Programmable CNN Inference

no code implementations6 Jul 2021 Zhiyu Chen, Zhanghao Yu, Qing Jin, Yan He, Jingyu Wang, Sheng Lin, Dai Li, Yanzhi Wang, Kaiyuan Yang

A compact, accurate, and bitwidth-programmable in-memory computing (IMC) static random-access memory (SRAM) macro, named CAP-RAM, is presented for energy-efficient convolutional neural network (CNN) inference.

Sanity Checks for Lottery Tickets: Does Your Winning Ticket Really Win the Jackpot?

2 code implementations NeurIPS 2021 Xiaolong Ma, Geng Yuan, Xuan Shen, Tianlong Chen, Xuxi Chen, Xiaohan Chen, Ning Liu, Minghai Qin, Sijia Liu, Zhangyang Wang, Yanzhi Wang

Based on our analysis, we summarize a guideline for parameter settings in regard to specific architecture characteristics, which we hope will catalyze research progress on the topic of the lottery ticket hypothesis.

Effective Model Sparsification by Scheduled Grow-and-Prune Methods

1 code implementation ICLR 2022 Xiaolong Ma, Minghai Qin, Fei Sun, Zejiang Hou, Kun Yuan, Yi Xu, Yanzhi Wang, Yen-Kuang Chen, Rong Jin, Yuan Xie

It addresses the shortcomings of the previous works by repeatedly growing a subset of layers to dense and then pruning them back to sparse after some training.
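
One grow-and-prune cycle on a single weight tensor might look like the following sketch, with magnitude pruning as the prune step (the layer-selection schedule follows the paper):

    import numpy as np

    def prune_by_magnitude(w, sparsity):
        """Zero out the smallest-magnitude entries to reach `sparsity`."""
        k = int(w.size * sparsity)
        thresh = np.partition(np.abs(w).ravel(), k)[k]
        return np.where(np.abs(w) < thresh, 0.0, w)

    def grow_and_prune_cycle(w, sparsity, train_dense_step):
        """Grow the layer back to dense, train it, then prune back to sparse.

        train_dense_step is any callable that updates the dense weights;
        repeating this cycle over subsets of layers is the scheduled
        grow-and-prune idea described above.
        """
        w = train_dense_step(w)       # all weights free to move again
        return prune_by_magnitude(w, sparsity)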

Image Classification

FORMS: Fine-grained Polarized ReRAM-based In-situ Computation for Mixed-signal DNN Accelerator

no code implementations16 Jun 2021 Geng Yuan, Payman Behnam, Zhengang Li, Ali Shafiee, Sheng Lin, Xiaolong Ma, Hang Liu, Xuehai Qian, Mahdi Nazm Bojnordi, Yanzhi Wang, Caiwen Ding

With weights stored in the ReRAM crossbar cells as conductance, when the input vector is applied to word lines, the matrix-vector multiplication results can be generated as the current in bit lines.
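
In equation form this is Ohm's and Kirchhoff's laws: each bit-line current is I_j = sum_i G_ij V_i. A toy simulation, assuming the usual differential-pair mapping since physical conductances cannot be negative:

    import numpy as np

    def crossbar_matvec(weights, v_in, g_max=1e-4):
        """Simulate an analog matrix-vector multiply on a ReRAM crossbar.

        Signed weights are mapped onto two non-negative conductance arrays
        (positive/negative columns); v_in holds the word-line voltages, and
        bit-line currents accumulate as I = G^T @ V.
        """
        scale = g_max / np.abs(weights).max()
        g_pos = np.maximum(weights, 0) * scale    # cells for positive weights
        g_neg = np.maximum(-weights, 0) * scale   # cells for negative weights
        return (g_pos.T @ v_in - g_neg.T @ v_in) / scale

    W, x = np.random.randn(8, 4), np.random.randn(8)
    print(np.allclose(crossbar_matvec(W, x), W.T @ x))  # True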

Efficient Micro-Structured Weight Unification and Pruning for Neural Network Compression

no code implementations15 Jun 2021 Sheng Lin, Wei Jiang, Wei Wang, Kaidi Xu, Yanzhi Wang, Shan Liu, Songnan Li

Compressing Deep Neural Network (DNN) models to alleviate the storage and computation requirements is essential for practical applications, especially for resource limited devices.

Neural Network Compression

Towards Fast and Accurate Multi-Person Pose Estimation on Mobile Devices

no code implementations6 Jun 2021 Xuan Shen, Geng Yuan, Wei Niu, Xiaolong Ma, Jiexiong Guan, Zhengang Li, Bin Ren, Yanzhi Wang

The rapid development of autonomous driving, abnormal behavior detection, and behavior recognition makes an increasing demand for multi-person pose estimation-based applications, especially on mobile platforms.

Autonomous Driving Multi-Person Pose Estimation

A Compression-Compilation Framework for On-mobile Real-time BERT Applications

no code implementations30 May 2021 Wei Niu, Zhenglun Kong, Geng Yuan, Weiwen Jiang, Jiexiong Guan, Caiwen Ding, Pu Zhao, Sijia Liu, Bin Ren, Yanzhi Wang

In this paper, we propose a compression-compilation co-design framework that can guarantee the identified model to meet both resource and real-time specifications of mobile devices.

Question Answering Text Generation

Teachers Do More Than Teach: Compressing Image-to-Image Models

1 code implementation CVPR 2021 Qing Jin, Jian Ren, Oliver J. Woodford, Jiazhuo Wang, Geng Yuan, Yanzhi Wang, Sergey Tulyakov

In this work, we aim to address these issues by introducing a teacher network that provides a search space in which efficient network architectures can be found, in addition to performing knowledge distillation.

Knowledge Distillation

Lottery Ticket Preserves Weight Correlation: Is It Desirable or Not?

no code implementations19 Feb 2021 Ning Liu, Geng Yuan, Zhengping Che, Xuan Shen, Xiaolong Ma, Qing Jin, Jian Ren, Jian Tang, Sijia Liu, Yanzhi Wang

In deep model compression, the recent finding "Lottery Ticket Hypothesis" (LTH) (Frankle & Carbin, 2018) pointed out that there could exist a winning ticket (i.e., a properly pruned sub-network together with the original weight initialization) that can achieve performance competitive with the original dense network.

Model Compression

Learn-Prune-Share for Lifelong Learning

1 code implementation13 Dec 2020 Zifeng Wang, Tong Jian, Kaushik Chowdhury, Yanzhi Wang, Jennifer Dy, Stratis Ioannidis

In lifelong learning, we wish to maintain and update a model (e.g., a neural network classifier) in the presence of new classification tasks that arrive sequentially.

Mix and Match: A Novel FPGA-Centric Deep Neural Network Quantization Framework

no code implementations8 Dec 2020 Sung-En Chang, Yanyu Li, Mengshu Sun, Runbin Shi, Hayden K. -H. So, Xuehai Qian, Yanzhi Wang, Xue Lin

Unlike existing methods that use the same quantization scheme for all weights, we propose the first solution that applies different quantization schemes for different rows of the weight matrix.

Edge-computing Model Compression +1

ClickTrain: Efficient and Accurate End-to-End Deep Learning Training via Fine-Grained Architecture-Preserving Pruning

no code implementations20 Nov 2020 Chengming Zhang, Geng Yuan, Wei Niu, Jiannan Tian, Sian Jin, Donglin Zhuang, Zhe Jiang, Yanzhi Wang, Bin Ren, Shuaiwen Leon Song, Dingwen Tao

Moreover, compared with the state-of-the-art pruning-during-training approach, ClickTrain provides significant improvements in both accuracy and compression ratio on the tested CNN models and datasets, under similar limited training time.

DAIS: Automatic Channel Pruning via Differentiable Annealing Indicator Search

no code implementations4 Nov 2020 Yushuo Guan, Ning Liu, Pengyu Zhao, Zhengping Che, Kaigui Bian, Yanzhi Wang, Jian Tang

The convolutional neural network has achieved great success in fulfilling computer vision tasks, despite a large computation overhead that hinders efficient deployment.

Neural Architecture Search

Simultaneous Relevance and Diversity: A New Recommendation Inference Approach

no code implementations27 Sep 2020 Yifang Liu, Zhentao Xu, Qiyuan An, Yang Yi, Yanzhi Wang, Trevor Hastie

Heterogeneous inference achieves divergent relevance, where relevance and diversity support each other as two collaborating objectives in one recommendation model, and where recommendation diversity is an inherent outcome of the relevance inference process.

Collaborative Filtering Recommendation Systems

MSP: An FPGA-Specific Mixed-Scheme, Multi-Precision Deep Neural Network Quantization Framework

no code implementations16 Sep 2020 Sung-En Chang, Yanyu Li, Mengshu Sun, Weiwen Jiang, Runbin Shi, Xue Lin, Yanzhi Wang

To tackle the limited computing and storage resources in edge devices, model compression techniques have been widely used to trim deep neural network (DNN) models for on-device inference execution.

Edge-computing Image Denoising +2

Real-Time Execution of Large-scale Language Models on Mobile

no code implementations15 Sep 2020 Wei Niu, Zhenglun Kong, Geng Yuan, Weiwen Jiang, Jiexiong Guan, Caiwen Ding, Pu Zhao, Sijia Liu, Bin Ren, Yanzhi Wang

Our framework can guarantee the identified model to meet both resource and real-time specifications of mobile devices, thus achieving real-time execution of large transformer-based models like BERT variants.

Edge-computing

YOLObile: Real-Time Object Detection on Mobile Devices via Compression-Compilation Co-Design

3 code implementations12 Sep 2020 Yuxuan Cai, Hongjia Li, Geng Yuan, Wei Niu, Yanyu Li, Xulong Tang, Bin Ren, Yanzhi Wang

In this work, we propose the YOLObile framework for real-time object detection on mobile devices via compression-compilation co-design.

Computational Efficiency Object +2

AntiDote: Attention-based Dynamic Optimization for Neural Network Runtime Efficiency

no code implementations14 Aug 2020 Fuxun Yu, ChenChen Liu, Di Wang, Yanzhi Wang, Xiang Chen

Based on the neural network attention mechanism, we propose a comprehensive dynamic optimization framework including (1) testing-phase channel and column feature map pruning, as well as (2) training-phase optimization by targeted dropout.

One for Many: Transfer Learning for Building HVAC Control

no code implementations9 Aug 2020 Shichao Xu, Yi-Xuan Wang, Yanzhi Wang, Zheng O'Neill, Qi Zhu

Traditional HVAC control methods are typically based on creating explicit physical models of building thermal dynamics, which often require significant effort to develop and make it difficult to achieve sufficient accuracy and efficiency for runtime building control, as well as scalability for field implementations.

Transfer Learning

RT3D: Achieving Real-Time Execution of 3D Convolutional Neural Networks on Mobile Devices

no code implementations20 Jul 2020 Wei Niu, Mengshu Sun, Zhengang Li, Jou-An Chen, Jiexiong Guan, Xipeng Shen, Yanzhi Wang, Sijia Liu, Xue Lin, Bin Ren

The vanilla sparsity removes whole kernel groups, while KGS sparsity is a more fine-grained structured sparsity that enjoys higher flexibility while exploiting full on-device parallelism.

Code Generation Model Compression

Towards Real-Time DNN Inference on Mobile Platforms with Model Pruning and Compiler Optimization

no code implementations22 Apr 2020 Wei Niu, Pu Zhao, Zheng Zhan, Xue Lin, Yanzhi Wang, Bin Ren

High-end mobile platforms rapidly serve as primary computing devices for a wide range of Deep Neural Network (DNN) applications.

Compiler Optimization Style Transfer +1

A Unified DNN Weight Compression Framework Using Reweighted Optimization Methods

no code implementations12 Apr 2020 Tianyun Zhang, Xiaolong Ma, Zheng Zhan, Shanglin Zhou, Minghai Qin, Fei Sun, Yen-Kuang Chen, Caiwen Ding, Makan Fardad, Yanzhi Wang

To address the large model size and intensive computation requirement of deep neural networks (DNNs), weight pruning techniques have been proposed and generally fall into two categories, i.e., static regularization-based pruning and dynamic regularization-based pruning.

CoCoPIE: Making Mobile AI Sweet As PIE --Compression-Compilation Co-Design Goes a Long Way

no code implementations14 Mar 2020 Shaoshan Liu, Bin Ren, Xipeng Shen, Yanzhi Wang

Assuming hardware is the major constraint for enabling real-time mobile intelligence, the industry has mainly dedicated their efforts to developing specialized hardware accelerators for machine learning and inference.

A Privacy-Preserving-Oriented DNN Pruning and Mobile Acceleration Framework

no code implementations13 Mar 2020 Yifan Gong, Zheng Zhan, Zhengang Li, Wei Niu, Xiaolong Ma, Wenhao Wang, Bin Ren, Caiwen Ding, Xue Lin, Xiao-Lin Xu, Yanzhi Wang

Weight pruning of deep neural networks (DNNs) has been proposed to satisfy the limited storage and computing capability of mobile edge devices.

Model Compression Privacy Preserving

Efficient Training of Deep Convolutional Neural Networks by Augmentation in Embedding Space

no code implementations12 Feb 2020 Mohammad Saeed Abrishami, Amir Erfan Eshratifar, David Eigen, Yanzhi Wang, Shahin Nazarian, Massoud Pedram

However, fine-tuning a transfer model with data augmentation in the raw input space has a high computational cost to run the full network for every augmented input.

Data Augmentation Transfer Learning

SS-Auto: A Single-Shot, Automatic Structured Weight Pruning Framework of DNNs with Ultra-High Efficiency

no code implementations23 Jan 2020 Zhengang Li, Yifan Gong, Xiaolong Ma, Sijia Liu, Mengshu Sun, Zheng Zhan, Zhenglun Kong, Geng Yuan, Yanzhi Wang

Structured weight pruning is a representative model compression technique of DNNs for hardware efficiency and inference accelerations.

Model Compression

An Image Enhancing Pattern-based Sparsity for Real-time Inference on Mobile Devices

no code implementations ECCV 2020 Xiaolong Ma, Wei Niu, Tianyun Zhang, Sijia Liu, Sheng Lin, Hongjia Li, Xiang Chen, Jian Tang, Kaisheng Ma, Bin Ren, Yanzhi Wang

Weight pruning has been widely acknowledged as a straightforward and effective method to eliminate redundancy in Deep Neural Networks (DNN), thereby achieving acceleration on various platforms.

Code Generation Compiler Optimization

Embedding Compression with Isotropic Iterative Quantization

no code implementations11 Jan 2020 Siyu Liao, Jie Chen, Yanzhi Wang, Qinru Qiu, Bo Yuan

Continuous representation of words is a standard component in deep learning-based NLP models.

Image Retrieval Quantization +1

PatDNN: Achieving Real-Time DNN Execution on Mobile Devices with Pattern-based Weight Pruning

no code implementations1 Jan 2020 Wei Niu, Xiaolong Ma, Sheng Lin, Shihao Wang, Xuehai Qian, Xue Lin, Yanzhi Wang, Bin Ren

Weight pruning of DNNs is proposed, but existing schemes represent two extremes in the design space: non-structured pruning is fine-grained, accurate, but not hardware friendly; structured pruning is coarse-grained, hardware-efficient, but with higher accuracy loss.

Code Generation Model Compression

DARB: A Density-Aware Regular-Block Pruning for Deep Neural Networks

no code implementations19 Nov 2019 Ao Ren, Tao Zhang, Yuhao Wang, Sheng Lin, Peiyan Dong, Yen-Kuang Chen, Yuan Xie, Yanzhi Wang

As a further optimization, we propose a density-adaptive regular-block (DARB) pruning that outperforms prior structured pruning work with high pruning ratio and decoding efficiency.

Model Compression Network Pruning

Deep Compressed Pneumonia Detection for Low-Power Embedded Devices

no code implementations4 Nov 2019 Hongjia Li, Sheng Lin, Ning Liu, Caiwen Ding, Yanzhi Wang

Deep neural networks (DNNs) have been expanded into medical fields and have triggered a revolution in some medical applications by extracting complex features and achieving high accuracy and performance.

Pneumonia Detection

Adversarial T-shirt! Evading Person Detectors in A Physical World

1 code implementation ECCV 2020 Kaidi Xu, Gaoyuan Zhang, Sijia Liu, Quanfu Fan, Mengshu Sun, Hongge Chen, Pin-Yu Chen, Yanzhi Wang, Xue Lin

To the best of our knowledge, this is the first work that models the effect of deformation for designing physical adversarial examples with respect to non-rigid objects such as T-shirts.

SPEC2: SPECtral SParsE CNN Accelerator on FPGAs

no code implementations16 Oct 2019 Yue Niu, Hanqing Zeng, Ajitesh Srivastava, Kartik Lakhotia, Rajgopal Kannan, Yanzhi Wang, Viktor Prasanna

On the other hand, weight pruning techniques address the redundancy in model parameters by converting dense convolutional kernels into sparse ones.

REQ-YOLO: A Resource-Aware, Efficient Quantization Framework for Object Detection on FPGAs

no code implementations29 Sep 2019 Caiwen Ding, Shuo Wang, Ning Liu, Kaidi Xu, Yanzhi Wang, Yun Liang

To achieve real-time, highly-efficient implementations on FPGA, we present the detailed hardware implementation of block circulant matrices on CONV layers and develop an efficient processing element (PE) structure supporting the heterogeneous weight quantization, CONV dataflow and pipelining techniques, design optimization, and a template-based automatic synthesis framework to optimally exploit hardware resource.
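
The computational saving behind block circulant weights is that a circulant block's matrix-vector product is a circular convolution, computable with FFTs in O(n log n) instead of O(n^2); a minimal NumPy sketch:

    import numpy as np

    def circulant_matvec(c, x):
        """Multiply the circulant matrix with first column c by x via FFT."""
        return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

    # Check against the explicitly built circulant matrix.
    n = 8
    c, x = np.random.randn(n), np.random.randn(n)
    C = np.stack([np.roll(c, j) for j in range(n)], axis=1)  # column j = roll(c, j)
    print(np.allclose(C @ x, circulant_matvec(c, x)))        # True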

Model Compression object-detection +2

PCONV: The Missing but Desirable Sparsity in DNN Weight Pruning for Real-time Execution on Mobile Devices

no code implementations6 Sep 2019 Xiaolong Ma, Fu-Ming Guo, Wei Niu, Xue Lin, Jian Tang, Kaisheng Ma, Bin Ren, Yanzhi Wang

Model compression techniques on Deep Neural Network (DNN) have been widely acknowledged as an effective way to achieve acceleration on a variety of platforms, and DNN weight pruning is a straightforward and effective method.

Model Compression

An Ultra-Efficient Memristor-Based DNN Framework with Structured Weight Pruning and Quantization Using ADMM

no code implementations29 Aug 2019 Geng Yuan, Xiaolong Ma, Caiwen Ding, Sheng Lin, Tianyun Zhang, Zeinab S. Jalali, Yilong Zhao, Li Jiang, Sucheta Soundarajan, Yanzhi Wang

Memristor-based weight pruning and weight quantization have been separately investigated and proven effective in reducing area and power consumption compared to the original DNN model.

Quantization

Tiny but Accurate: A Pruned, Quantized and Optimized Memristor Crossbar Framework for Ultra Efficient DNN Implementation

no code implementations27 Aug 2019 Xiaolong Ma, Geng Yuan, Sheng Lin, Caiwen Ding, Fuxun Yu, Tao Liu, Wujie Wen, Xiang Chen, Yanzhi Wang

To mitigate the challenges, the memristor crossbar array has emerged as an intrinsically suitable matrix computation and low-power acceleration framework for DNN applications.

Model Compression Quantization

Protecting Neural Networks with Hierarchical Random Switching: Towards Better Robustness-Accuracy Trade-off for Stochastic Defenses

1 code implementation20 Aug 2019 Xiao Wang, Siyue Wang, Pin-Yu Chen, Yanzhi Wang, Brian Kulis, Xue Lin, Peter Chin

However, one critical drawback of current defenses is that the robustness enhancement is at the cost of noticeable performance degradation on legitimate data, e.g., large drop in test accuracy.

Adversarial Robustness

A Stochastic-Computing based Deep Learning Framework using Adiabatic Quantum-Flux-Parametron SuperconductingTechnology

no code implementations22 Jul 2019 Ruizhe Cai, Ao Ren, Olivia Chen, Ning Liu, Caiwen Ding, Xuehai Qian, Jie Han, Wenhui Luo, Nobuyuki Yoshikawa, Yanzhi Wang

Further, the application of SC to DNNs has been investigated in prior work, and its suitability has been illustrated, as SC is more compatible with approximate computation.

AutoCompress: An Automatic DNN Structured Pruning Framework for Ultra-High Compression Rates

no code implementations6 Jul 2019 Ning Liu, Xiaolong Ma, Zhiyuan Xu, Yanzhi Wang, Jian Tang, Jieping Ye

This work proposes AutoCompress, an automatic structured pruning framework with the following key performance improvements: (i) effectively incorporating the combination of structured pruning schemes in the automatic process; (ii) adopting state-of-the-art ADMM-based structured weight pruning as the core algorithm, with an innovative additional purification step for further weight reduction without accuracy loss; and (iii) developing an effective heuristic search method enhanced by experience-based guided search, replacing the prior deep reinforcement learning technique, which has an underlying incompatibility with the target pruning problem.

Model Compression

Non-Structured DNN Weight Pruning -- Is It Beneficial in Any Platform?

no code implementations3 Jul 2019 Xiaolong Ma, Sheng Lin, Shaokai Ye, Zhezhi He, Linfeng Zhang, Geng Yuan, Sia Huat Tan, Zhengang Li, Deliang Fan, Xuehai Qian, Xue Lin, Kaisheng Ma, Yanzhi Wang

Based on the proposed comparison framework, with the same accuracy and quantization, the results show that non-structured pruning is not competitive in terms of both storage and computation efficiency.

Model Compression Quantization

Robust Sparse Regularization: Simultaneously Optimizing Neural Network Robustness and Compactness

no code implementations30 May 2019 Adnan Siraj Rakin, Zhezhi He, Li Yang, Yanzhi Wang, Liqiang Wang, Deliang Fan

In this work, we show that shrinking the model size through proper weight pruning can even be helpful to improve the DNN robustness under adversarial attack.

Adversarial Attack

Fault Sneaking Attack: a Stealthy Framework for Misleading Deep Neural Networks

no code implementations28 May 2019 Pu Zhao, Siyue Wang, Cheng Gongye, Yanzhi Wang, Yunsi Fei, Xue Lin

Despite the great achievements of deep neural networks (DNNs), the vulnerability of state-of-the-art DNNs raises security concerns of DNNs in many application domains requiring high reliability. We propose the fault sneaking attack on DNNs, where the adversary aims to misclassify certain input images into any target labels by modifying the DNN parameters.

Brain-inspired reverse adversarial examples

no code implementations28 May 2019 Shaokai Ye, Sia Huat Tan, Kaidi Xu, Yanzhi Wang, Chenglong Bao, Kaisheng Ma

In contrast, current state-of-the-art deep learning approaches heavily depend on the variety of training samples and the capacity of the network.

Quantization

Interpreting and Evaluating Neural Network Robustness

no code implementations10 May 2019 Fuxun Yu, Zhuwei Qin, Chenchen Liu, Liang Zhao, Yanzhi Wang, Xiang Chen

Recently, adversarial deception has become one of the most serious threats to deep neural networks.

Adversarial Attack

26ms Inference Time for ResNet-50: Towards Real-Time Execution of all DNNs on Smartphone

no code implementations2 May 2019 Wei Niu, Xiaolong Ma, Yanzhi Wang, Bin Ren

With the rapid emergence of a spectrum of high-end mobile devices, many applications that formerly required desktop-level computation capability can now run on these devices without any problem.

Model Compression

Toward Extremely Low Bit and Lossless Accuracy in DNNs with Progressive ADMM

no code implementations2 May 2019 Sheng Lin, Xiaolong Ma, Shaokai Ye, Geng Yuan, Kaisheng Ma, Yanzhi Wang

Weight quantization is one of the most important techniques for Deep Neural Network (DNN) model compression.

Model Compression Quantization

ResNet Can Be Pruned 60x: Introducing Network Purification and Unused Path Removal (P-RM) after Weight Pruning

no code implementations30 Apr 2019 Xiaolong Ma, Geng Yuan, Sheng Lin, Zhengang Li, Hao Sun, Yanzhi Wang

State-of-the-art DNN structures involve high computation and a great demand for memory storage, which pose an intensive challenge on DNN framework resources.

Multi-Channel Attention Selection GAN with Cascaded Semantic Guidance for Cross-View Image Translation

3 code implementations CVPR 2019 Hao Tang, Dan Xu, Nicu Sebe, Yanzhi Wang, Jason J. Corso, Yan Yan

In this paper, we propose a novel approach named Multi-Channel Attention SelectionGAN (SelectionGAN) that makes it possible to generate images of natural scenes in arbitrary viewpoints, based on an image of the scene and a novel semantic map.

Bird View Synthesis Cross-View Image-to-Image Translation +1

Machine Vision Guided 3D Medical Image Compression for Efficient Transmission and Accurate Segmentation in the Clouds

no code implementations CVPR 2019 Zihao Liu, Xiaowei Xu, Tao Liu, Qi Liu, Yanzhi Wang, Yiyu Shi, Wujie Wen, Meiping Huang, Haiyun Yuan, Jian Zhuang

In this paper we will use deep learning based medical image segmentation as a vehicle and demonstrate that interestingly, machine and human view the compression quality differently.

Image Compression Image Segmentation +3

Adversarial Robustness vs Model Compression, or Both?

1 code implementation29 Mar 2019 Shaokai Ye, Kaidi Xu, Sijia Liu, Jan-Henrik Lambrechts, huan zhang, Aojun Zhou, Kaisheng Ma, Yanzhi Wang, Xue Lin

Furthermore, this work studies two hypotheses about weight pruning in the conventional setting and finds that weight pruning is essential for reducing the network model size in the adversarial setting; training a small model from scratch, even with initialization inherited from the large model, cannot achieve both adversarial robustness and high standard accuracy.

Adversarial Robustness Model Compression +1

Progressive DNN Compression: A Key to Achieve Ultra-High Weight Pruning and Quantization Rates using ADMM

2 code implementations23 Mar 2019 Shaokai Ye, Xiaoyu Feng, Tianyun Zhang, Xiaolong Ma, Sheng Lin, Zhengang Li, Kaidi Xu, Wujie Wen, Sijia Liu, Jian Tang, Makan Fardad, Xue Lin, Yongpan Liu, Yanzhi Wang

A recent work developed a systematic framework for DNN weight pruning using the advanced optimization technique ADMM (Alternating Direction Method of Multipliers), achieving some of the state-of-the-art weight pruning results.

Model Compression Quantization

CircConv: A Structured Convolution with Low Complexity

no code implementations28 Feb 2019 Siyu Liao, Zhe Li, Liang Zhao, Qinru Qiu, Yanzhi Wang, Bo Yuan

Deep neural networks (DNNs), especially deep convolutional neural networks (CNNs), have emerged as the powerful technique in various machine learning applications.

E-RNN: Design Optimization for Efficient Recurrent Neural Networks in FPGAs

no code implementations12 Dec 2018 Zhe Li, Caiwen Ding, Siyue Wang, Wujie Wen, Youwei Zhuo, Chang Liu, Qinru Qiu, Wenyao Xu, Xue Lin, Xuehai Qian, Yanzhi Wang

It is a challenging task to have real-time, efficient, and accurate hardware RNN implementations because of the high sensitivity to imprecision accumulation and the requirement of special activation function implementations.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Interpreting Adversarial Robustness: A View from Decision Surface in Input Space

no code implementations ICLR 2019 Fuxun Yu, ChenChen Liu, Yanzhi Wang, Liang Zhao, Xiang Chen

One popular hypothesis of neural network generalization is that the flat local minima of loss surface in parameter space leads to good generalization.

Adversarial Robustness

Structured Adversarial Attack: Towards General Implementation and Better Interpretability

1 code implementation ICLR 2019 Kaidi Xu, Sijia Liu, Pu Zhao, Pin-Yu Chen, huan zhang, Quanfu Fan, Deniz Erdogmus, Yanzhi Wang, Xue Lin

When generating adversarial examples to attack deep neural networks (DNNs), Lp norm of the added perturbation is usually used to measure the similarity between original image and adversarial example.

Adversarial Attack

StructADMM: A Systematic, High-Efficiency Framework of Structured Weight Pruning for DNNs

1 code implementation29 Jul 2018 Tianyun Zhang, Shaokai Ye, Kaiqi Zhang, Xiaolong Ma, Ning Liu, Linfeng Zhang, Jian Tang, Kaisheng Ma, Xue Lin, Makan Fardad, Yanzhi Wang

Without loss of accuracy on the AlexNet model, we achieve 2.58X and 3.65X average measured speedup on two GPUs, clearly outperforming the prior work.

Model Compression

Adversarial Meta-Learning

no code implementations8 Jun 2018 Chengxiang Yin, Jian Tang, Zhiyuan Xu, Yanzhi Wang

Meta-learning enables a model to learn from very limited data to undertake a new task.

Meta-Learning

Towards Robust Training of Neural Networks by Regularizing Adversarial Gradients

no code implementations23 May 2018 Fuxun Yu, Zirui Xu, Yanzhi Wang, ChenChen Liu, Xiang Chen

In recent years, neural networks have demonstrated outstanding effectiveness in a large number of applications. However, recent works have shown that neural networks are susceptible to adversarial examples, indicating possible flaws intrinsic to the network structures.

Towards Budget-Driven Hardware Optimization for Deep Convolutional Neural Networks using Stochastic Computing

no code implementations10 May 2018 Zhe Li, Ji Li, Ao Ren, Caiwen Ding, Jeffrey Draper, Qinru Qiu, Bo Yuan, Yanzhi Wang

Recently, Deep Convolutional Neural Network (DCNN) has achieved tremendous success in many machine learning applications.

Learning Topics using Semantic Locality

no code implementations11 Apr 2018 Ziyi Zhao, Krittaphat Pugdeethosapol, Sheng Lin, Zhe Li, Caiwen Ding, Yanzhi Wang, Qinru Qiu

Topic modeling discovers the latent topic probabilities of given text documents.

Topic Models

A Systematic DNN Weight Pruning Framework using Alternating Direction Method of Multipliers

3 code implementations ECCV 2018 Tianyun Zhang, Shaokai Ye, Kaiqi Zhang, Jian Tang, Wujie Wen, Makan Fardad, Yanzhi Wang

We first formulate the weight pruning problem of DNNs as a nonconvex optimization problem with combinatorial constraints specifying the sparsity requirements, and then adopt the ADMM framework for systematic weight pruning.
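
A minimal sketch of the resulting ADMM loop on one weight vector, with a toy quadratic loss (in the full framework the W-step is ordinary SGD training with the added penalty):

    import numpy as np

    def project_sparse(z, k):
        """Euclidean projection onto {at most k nonzeros}: keep top-k magnitudes."""
        out = np.zeros_like(z)
        idx = np.argsort(-np.abs(z))[:k]
        out[idx] = z[idx]
        return out

    def admm_prune(w, grad_fn, k, rho=1e-2, lr=1e-2, steps=500):
        """ADMM weight pruning: minimize f(W) s.t. W has at most k nonzeros."""
        z = project_sparse(w, k)            # auxiliary variable in the sparse set
        u = np.zeros_like(w)                # scaled dual variable
        for _ in range(steps):
            # W-step: descend on f(W) + (rho/2)||W - Z + U||^2
            w = w - lr * (grad_fn(w) + rho * (w - z + u))
            z = project_sparse(w + u, k)    # Z-step: the combinatorial projection
            u = u + w - z                   # dual update
        return project_sparse(w, k)         # final hard projection

    # Toy use: f(W) = ||W - target||^2.
    target = np.random.randn(50)
    w_sparse = admm_prune(np.random.randn(50), lambda w: 2 * (w - target), k=10)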

Image Classification Network Pruning

An ADMM-Based Universal Framework for Adversarial Attacks on Deep Neural Networks

no code implementations9 Apr 2018 Pu Zhao, Sijia Liu, Yanzhi Wang, Xue Lin

In the literature, the added distortions are usually measured by L0, L1, L2, and L infinity norms, namely, L0, L1, L2, and L infinity attacks, respectively.
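
For concreteness, the four distortion measures on a perturbation delta reduce to one line each (plain NumPy illustration):

    import numpy as np

    delta = np.array([0.0, 0.3, -0.1, 0.0, 0.2])   # example added perturbation

    l0   = np.count_nonzero(delta)                 # how many pixels changed
    l1   = np.abs(delta).sum()                     # total absolute change
    l2   = np.sqrt((delta ** 2).sum())             # Euclidean length
    linf = np.abs(delta).max()                     # largest single change
    print(l0, l1, l2, linf)                        # 3, 0.6, ~0.374, 0.3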

Adversarial Attack

Structured Weight Matrices-Based Hardware Accelerators in Deep Neural Networks: FPGAs and ASICs

no code implementations28 Mar 2018 Caiwen Ding, Ao Ren, Geng Yuan, Xiaolong Ma, Jiayu Li, Ning Liu, Bo Yuan, Yanzhi Wang

For FPGA implementations on deep convolutional neural networks (DCNNs), we achieve at least 152X and 72X improvement in performance and energy efficiency, respectively using the SWM-based framework, compared with the baseline of IBM TrueNorth processor under same accuracy constraints using the data set of MNIST, SVHN, and CIFAR-10.

Efficient Recurrent Neural Networks using Structured Matrices in FPGAs

no code implementations20 Mar 2018 Zhe Li, Shuo Wang, Caiwen Ding, Qinru Qiu, Yanzhi Wang, Yun Liang

Recurrent Neural Networks (RNNs) are becoming increasingly important for time series-related applications which require efficient and real-time implementations.

Model Compression Time Series +1

C3PO: Database and Benchmark for Early-stage Malicious Activity Detection in 3D Printing

no code implementations20 Mar 2018 Zhe Li, Xiaolong Ma, Hongjia Li, Qiyuan An, Aditya Singh Rathore, Qinru Qiu, Wenyao Xu, Yanzhi Wang

It is of vital importance to enable 3D printers to identify the objects to be printed, so that the manufacturing procedure of an illegal weapon can be terminated at the early stage.

Action Detection Activity Detection +1

Feature Distillation: DNN-Oriented JPEG Compression Against Adversarial Examples

2 code implementations CVPR 2019 Zihao Liu, Qi Liu, Tao Liu, Nuo Xu, Xue Lin, Yanzhi Wang, Wujie Wen

Image compression-based approaches for defending against the adversarial-example attacks, which threaten the safety use of deep neural networks (DNN), have been investigated recently.

Classification General Classification +2

C-LSTM: Enabling Efficient LSTM using Structured Compression Techniques on FPGAs

no code implementations14 Mar 2018 Shuo Wang, Zhe Li, Caiwen Ding, Bo Yuan, Yanzhi Wang, Qinru Qiu, Yun Liang

The previous work proposes to use a pruning-based compression technique to reduce the model size and thus speed up inference on FPGAs.

DeepN-JPEG: A Deep Neural Network Favorable JPEG-based Image Compression Framework

no code implementations14 Mar 2018 Zihao Liu, Tao Liu, Wujie Wen, Lei Jiang, Jie Xu, Yanzhi Wang, Gang Quan

To reduce the data storage and transfer overhead in smart resource-limited Internet-of-Thing (IoT) systems, effective data compression is a "must-have" feature before transferring real-time produced dataset for training or classification.

Data Compression General Classification +2

Model-Free Control for Distributed Stream Data Processing using Deep Reinforcement Learning

no code implementations2 Mar 2018 Teng Li, Zhiyuan Xu, Jian Tang, Yanzhi Wang

Specifically, we propose, for the first time, to leverage emerging Deep Reinforcement Learning (DRL) to enable model-free control in DSDPSs, and we present the design, implementation, and evaluation of a novel and highly effective DRL-based control framework that minimizes average end-to-end tuple processing time by jointly learning the system environment from very limited runtime statistics and making decisions under the guidance of powerful Deep Neural Networks.

reinforcement-learning Reinforcement Learning (RL) +1

Systematic Weight Pruning of DNNs using Alternating Direction Method of Multipliers

1 code implementation15 Feb 2018 Tianyun Zhang, Shaokai Ye, Yi-Peng Zhang, Yanzhi Wang, Makan Fardad

We present a systematic weight pruning framework of deep neural networks (DNNs) using the alternating direction method of multipliers (ADMM).

Computational Efficiency

Image Dataset for Visual Objects Classification in 3D Printing

no code implementations15 Feb 2018 Hongjia Li, Xiaolong Ma, Aditya Singh Rathore, Zhe Li, Qiyuan An, Chen Song, Wenyao Xu, Yanzhi Wang

The rapid development in additive manufacturing (AM), also known as 3D printing, has brought about potential risk and security issues along with significant benefits.

Classification General Classification

Security Analysis and Enhancement of Model Compressed Deep Learning Systems under Adversarial Attacks

no code implementations14 Feb 2018 Qi Liu, Tao Liu, Zihao Liu, Yanzhi Wang, Yier Jin, Wujie Wen

In this work, we for the first time investigate the multi-factor adversarial attack problem in practical model-optimized deep learning systems by jointly considering DNN model-reshaping (e.g., HashNet-based deep compression) and input perturbations.

Adversarial Attack

An Area and Energy Efficient Design of Domain-Wall Memory-Based Deep Convolutional Neural Networks using Stochastic Computing

no code implementations3 Feb 2018 Xiaolong Ma, Yi-Peng Zhang, Geng Yuan, Ao Ren, Zhe Li, Jie Han, Jingtong Hu, Yanzhi Wang

However, in these works, the memory design optimization is neglected for weight storage, which will inevitably result in large hardware cost.

Deep Reinforcement Learning for Dynamic Treatment Regimes on Medical Registry Data

no code implementations28 Jan 2018 Ning Liu, Ying Liu, Brent Logan, Zhiyuan Xu, Jian Tang, Yanzhi Wang

This paper presents the first deep reinforcement learning (DRL) framework to estimate the optimal Dynamic Treatment Regimes from observational medical data.

reinforcement-learning Reinforcement Learning (RL)

Experience-driven Networking: A Deep Reinforcement Learning based Approach

no code implementations17 Jan 2018 Zhiyuan Xu, Jian Tang, Jingsong Meng, Weiyi Zhang, Yanzhi Wang, Chi Harold Liu, Dejun Yang

Modern communication networks have become very complicated and highly dynamic, which makes them hard to model, predict and control.

Continuous Control reinforcement-learning +1

FFT-Based Deep Learning Deployment in Embedded Systems

no code implementations13 Dec 2017 Sheng Lin, Ning Liu, Mahdi Nazemi, Hongjia Li, Caiwen Ding, Yanzhi Wang, Massoud Pedram

The large model size of DNNs, while providing excellent accuracy, also burdens the embedded platforms with intensive computation and storage.

speech-recognition Speech Recognition

A Memristor-Based Optimization Framework for AI Applications

no code implementations18 Oct 2017 Sijia Liu, Yanzhi Wang, Makan Fardad, Pramod K. Varshney

In addition to ADMM, implementation of a customized power iteration (PI) method for eigenvalue/eigenvector computation using memristor crossbars is discussed.

BIG-bench Machine Learning

Deep Reinforcement Learning: Framework, Applications, and Embedded Implementations

no code implementations10 Oct 2017 Hongjia Li, Tianshu Wei, Ao Ren, Qi Zhu, Yanzhi Wang

The recent breakthroughs of the deep reinforcement learning (DRL) technique in AlphaGo and playing Atari have set a good example in handling the large state and action spaces of complicated control problems.

Cloud Computing Q-Learning +3

A Hierarchical Framework of Cloud Resource Allocation and Power Management Using Deep Reinforcement Learning

no code implementations13 Mar 2017 Ning Liu, Zhe Li, Zhiyuan Xu, Jielong Xu, Sheng Lin, Qinru Qiu, Jian Tang, Yanzhi Wang

Automatic decision-making approaches, such as reinforcement learning (RL), have been applied to (partially) solve the resource allocation problem adaptively in the cloud computing system.

Cloud Computing Decision Making +3

Hardware-Driven Nonlinear Activation for Stochastic Computing Based Deep Convolutional Neural Networks

no code implementations12 Mar 2017 Ji Li, Zihao Yuan, Zhe Li, Caiwen Ding, Ao Ren, Qinru Qiu, Jeffrey Draper, Yanzhi Wang

Recently, Deep Convolutional Neural Networks (DCNNs) have made unprecedented progress, achieving the accuracy close to, or even better than human-level perception in various tasks.

Theoretical Properties for Neural Networks with Weight Matrices of Low Displacement Rank

no code implementations ICML 2017 Liang Zhao, Siyu Liao, Yanzhi Wang, Zhe Li, Jian Tang, Victor Pan, Bo Yuan

Recently low displacement rank (LDR) matrices, or so-called structured matrices, have been proposed to compress large-scale neural networks.

SC-DCNN: Highly-Scalable Deep Convolutional Neural Network using Stochastic Computing

no code implementations18 Nov 2016 Ao Ren, Ji Li, Zhe Li, Caiwen Ding, Xuehai Qian, Qinru Qiu, Bo Yuan, Yanzhi Wang

Stochastic Computing (SC), which uses a bit-stream to represent a number within [-1, 1] by counting the number of ones in the bit-stream, has a high potential for implementing DCNNs with high scalability and ultra-low hardware footprint.
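
A toy demonstration of this encoding: in the bipolar format a value x in [-1, 1] maps to P(bit = 1) = (x + 1) / 2, so multiplying two independent streams reduces to one XNOR gate per bit pair:

    import numpy as np

    rng = np.random.default_rng(0)

    def to_stream(x, n_bits=4096):
        """Bipolar SC encoding: value x in [-1, 1] -> random bit-stream."""
        return (rng.random(n_bits) < (x + 1) / 2).astype(np.uint8)

    def from_stream(bits):
        """Decode by counting ones and mapping the fraction back to [-1, 1]."""
        return 2 * bits.mean() - 1

    a, b = 0.5, -0.8
    prod = np.logical_not(np.logical_xor(to_stream(a), to_stream(b)))
    print(from_stream(prod.astype(np.uint8)))   # approximately a * b = -0.4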
