Search Results for author: Yanzhi Wang

Found 159 papers, 36 papers with code

Efficient Pruning of Large Language Model with Adaptive Estimation Fusion

no code implementations16 Mar 2024 Jun Liu, Chao Wu, Changdi Yang, Hao Tang, Haoye Dong, Zhenglun Kong, Geng Yuan, Wei Niu, Dong Huang, Yanzhi Wang

Large language models (LLMs) have become crucial for many generative downstream tasks, making their efficient deployment on resource-constrained devices an inevitable trend and a significant challenge.

DiffClass: Diffusion-Based Class Incremental Learning

no code implementations8 Mar 2024 Zichong Meng, Jie Zhang, Changdi Yang, Zheng Zhan, Pu Zhao, Yanzhi Wang

On top of that, Exemplar-free Class Incremental Learning is even more challenging due to forbidden access to previous task data.

Class Incremental Learning Domain Adaptation +2

InstructGIE: Towards Generalizable Image Editing

no code implementations8 Mar 2024 Zichong Meng, Changdi Yang, Jun Liu, Hao Tang, Pu Zhao, Yanzhi Wang

In response to this challenge, our study introduces a novel image editing framework with enhanced generalization robustness by boosting in-context learning capability and unifying language instruction.

Denoising In-Context Learning

Dynamic Gaussian Graph Operator: Learning parametric partial differential equations in arbitrary discrete mechanics problems

no code implementations5 Mar 2024 Chu Wang, Jinhong Wu, Yanzhi Wang, Zhijian Zha, Qi Zhou

Metric vectors are regarded as lying on a latent uniform domain, where spatial and spectral transformations impose highly regular constraints on the solution space.

Operator learning

EdgeQAT: Entropy and Distribution Guided Quantization-Aware Training for the Acceleration of Lightweight LLMs on the Edge

1 code implementation16 Feb 2024 Xuan Shen, Zhenglun Kong, Changdi Yang, Zhaoyang Han, Lei Lu, Peiyan Dong, Cheng Lyu, Chih-hsiang Li, Xuehang Guo, Zhihao Shu, Wei Niu, Miriam Leeser, Pu Zhao, Yanzhi Wang

In this paper, we propose EdgeQAT, the Entropy and Distribution Guided QAT for the optimization of lightweight LLMs to achieve inference acceleration on Edge devices.

Quantization

EdgeOL: Efficient in-situ Online Learning on Edge Devices

no code implementations30 Jan 2024 Sheng Li, Geng Yuan, Yawen Wu, Yue Dai, Chao Wu, Alex K. Jones, Jingtong Hu, Yanzhi Wang, Xulong Tang

Emerging applications, such as robot-assisted eldercare and object recognition, generally employ deep neural networks (DNNs) and naturally require: i) handling streaming-in inference requests and ii) adapting to possible deployment scenario changes.

Object Recognition

SmartFRZ: An Efficient Training Framework using Attention-Based Layer Freezing

no code implementations30 Jan 2024 Sheng Li, Geng Yuan, Yue Dai, Youtao Zhang, Yanzhi Wang, Xulong Tang

Therefore, there is no generic and smart layer freezing method that can automatically perform "in-situation" layer freezing for different networks during training.
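
As a rough illustration of the freezing mechanism itself (not SmartFRZ's attention-based predictor), freezing a layer amounts to disabling its gradients once it stabilizes; a minimal PyTorch sketch, with the relative-change criterion below chosen as a stand-in:

    import torch

    def freeze_stable_layers(model, prev_weights, tol=1e-3):
        """Freeze parameters whose weights changed little since the last check.

        prev_weights: dict mapping parameter name -> tensor snapshot. The
        relative-change test is a placeholder; SmartFRZ instead decides
        what to freeze with an attention-based predictor.
        """
        for name, param in model.named_parameters():
            if not param.requires_grad:
                continue  # already frozen
            if name in prev_weights:
                change = (param.detach() - prev_weights[name]).norm()
                if change / (prev_weights[name].norm() + 1e-12) < tol:
                    param.requires_grad = False  # stop computing its gradients
            prev_weights[name] = param.detach().clone()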

E²GAN: Efficient Training of Efficient GANs for Image-to-Image Translation

no code implementations11 Jan 2024 Yifan Gong, Zheng Zhan, Qing Jin, Yanyu Li, Yerlan Idelbayev, Xian Liu, Andrey Zharkov, Kfir Aberman, Sergey Tulyakov, Yanzhi Wang, Jian Ren

One highly promising direction for enabling flexible real-time on-device image editing is utilizing data distillation by leveraging large-scale text-to-image diffusion models, such as Stable Diffusion, to generate paired datasets used for training generative adversarial networks (GANs).

Image-to-Image Translation

Agile-Quant: Activation-Guided Quantization for Faster Inference of LLMs on the Edge

no code implementations9 Dec 2023 Xuan Shen, Peiyan Dong, Lei Lu, Zhenglun Kong, Zhengang Li, Ming Lin, Chao Wu, Yanzhi Wang

Recent works show that 8-bit or lower weight quantization is feasible with minimal impact on end-to-end task performance, while the activation is still not quantized.
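
The weight-only setting described here can be illustrated with a generic symmetric per-channel int8 weight quantizer; this is a sketch of the common recipe, not Agile-Quant's activation-guided method:

    import numpy as np

    def quantize_weights_int8(w):
        """Symmetric per-output-channel int8 quantization of a weight matrix.

        Activations stay in floating point, matching the weight-only
        setting. Returns int8 weights plus one scale per row.
        """
        scale = np.maximum(np.abs(w).max(axis=1, keepdims=True) / 127.0, 1e-12)
        q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
        return q, scale

    w = np.random.randn(4, 16).astype(np.float32)
    q, s = quantize_weights_int8(w)
    print(np.abs(w - q.astype(np.float32) * s).max())  # small quantization error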

Language Modelling Quantization

SupeRBNN: Randomized Binary Neural Network Using Adiabatic Superconductor Josephson Devices

no code implementations21 Sep 2023 Zhengang Li, Geng Yuan, Tomoharu Yamauchi, Zabihi Masoud, Yanyue Xie, Peiyan Dong, Xulong Tang, Nobuyuki Yoshikawa, Devesh Tiwari, Yanzhi Wang, Olivia Chen

Specifically, we investigate the randomized behavior of the AQFP devices and analyze the impact of crossbar size on current attenuation, subsequently formulating the current amplitude into the values suitable for use in BNN computation.

Can Adversarial Examples Be Parsed to Reveal Victim Model Information?

1 code implementation13 Mar 2023 Yuguang Yao, Jiancheng Liu, Yifan Gong, Xiaoming Liu, Yanzhi Wang, Xue Lin, Sijia Liu

We call this 'model parsing of adversarial attacks' - a task to uncover 'arcana' in terms of the concealed VM information in attacks.

Adversarial Attack

DeepMAD: Mathematical Architecture Design for Deep Convolutional Neural Network

1 code implementation CVPR 2023 Xuan Shen, Yaohua Wang, Ming Lin, Yilun Huang, Hao Tang, Xiuyu Sun, Yanzhi Wang

To this end, a novel framework termed Mathematical Architecture Design for Deep CNN (DeepMAD) is proposed to design high-performance CNN models in a principled way.

Image Classification Neural Architecture Search

Pruning Parameterization With Bi-Level Optimization for Efficient Semantic Segmentation on the Edge

no code implementations CVPR 2023 Changdi Yang, Pu Zhao, Yanyu Li, Wei Niu, Jiexiong Guan, Hao Tang, Minghai Qin, Bin Ren, Xue Lin, Yanzhi Wang

With the ever-increasing popularity of edge devices, it is necessary to implement real-time segmentation on the edge for autonomous driving and many other applications.

Autonomous Driving Segmentation +1

Rethinking Vision Transformers for MobileNet Size and Speed

6 code implementations ICCV 2023 Yanyu Li, Ju Hu, Yang Wen, Georgios Evangelidis, Kamyar Salahi, Yanzhi Wang, Sergey Tulyakov, Jian Ren

With the success of Vision Transformers (ViTs) in computer vision tasks, recent works try to optimize the performance and complexity of ViTs to enable efficient deployment on mobile devices.

All-in-One: A Highly Representative DNN Pruning Framework for Edge Devices with Dynamic Power Management

no code implementations9 Dec 2022 Yifan Gong, Zheng Zhan, Pu Zhao, Yushu Wu, Chao Wu, Caiwen Ding, Weiwen Jiang, Minghai Qin, Yanzhi Wang

By re-configuring the model to the corresponding pruning ratio for a specific execution frequency (and voltage), we are able to achieve stable inference speed, i.e., keeping the difference in speed performance under various execution frequencies as small as possible.

Management

Self-Ensemble Protection: Training Checkpoints Are Good Data Protectors

1 code implementation22 Nov 2022 Sizhe Chen, Geng Yuan, Xinwen Cheng, Yifan Gong, Minghai Qin, Yanzhi Wang, Xiaolin Huang

In this paper, we uncover them by model checkpoints' gradients, forming the proposed self-ensemble protection (SEP), which is very effective because (1) learning on examples ignored during normal training tends to yield DNNs ignoring normal examples; (2) checkpoints' cross-model gradients are close to orthogonal, meaning that they are as diverse as DNNs with different architectures.
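
A minimal sketch of the cross-checkpoint gradient idea, assuming a list of state_dicts saved during normal training (the full SEP recipe adds more machinery on top):

    import torch

    def self_ensemble_grad(checkpoints, model, x, y, loss_fn):
        """Average input gradients across training checkpoints.

        Since checkpoints' cross-model gradients are close to orthogonal
        (per the paper), averaging them behaves like an ensemble of
        architecturally diverse DNNs.
        """
        grads = []
        for state in checkpoints:
            model.load_state_dict(state)
            x_adv = x.clone().detach().requires_grad_(True)
            loss_fn(model(x_adv), y).backward()
            grads.append(x_adv.grad.detach())
        return torch.stack(grads).mean(dim=0)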

You Need Multiple Exiting: Dynamic Early Exiting for Accelerating Unified Vision Language Model

1 code implementation CVPR 2023 Shengkun Tang, Yaqing Wang, Zhenglun Kong, Tianchi Zhang, Yao Li, Caiwen Ding, Yanzhi Wang, Yi Liang, Dongkuan Xu

To handle this challenge, we propose a novel early exiting strategy for unified visual language models, named MuE, which allows dynamically skipping layers in the encoder and decoder simultaneously, based on input layer-wise similarities, with multiple early exits.
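
A sketch of the similarity-driven exit rule on an encoder stack (the threshold and the per-layer exit heads are simplified away here; the actual MuE policy follows the paper):

    import torch
    import torch.nn.functional as F

    def forward_with_early_exit(layers, hidden, threshold=0.99):
        """Run transformer layers, exiting once representations saturate.

        If the cosine similarity between consecutive layers' hidden states
        exceeds `threshold`, later layers would add little, so we stop.
        """
        for depth, layer in enumerate(layers):
            new_hidden = layer(hidden)
            sim = F.cosine_similarity(
                new_hidden.flatten(1), hidden.flatten(1), dim=1).mean()
            hidden = new_hidden
            if sim > threshold:
                return hidden, depth + 1  # early exit after depth+1 layers
        return hidden, len(layers)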

Language Modelling

Peeling the Onion: Hierarchical Reduction of Data Redundancy for Efficient Vision Transformer Training

1 code implementation19 Nov 2022 Zhenglun Kong, Haoyu Ma, Geng Yuan, Mengshu Sun, Yanyue Xie, Peiyan Dong, Xin Meng, Xuan Shen, Hao Tang, Minghai Qin, Tianlong Chen, Xiaolong Ma, Xiaohui Xie, Zhangyang Wang, Yanzhi Wang

Vision transformers (ViTs) have recently obtained success in many applications, but their intensive computation and heavy memory usage at both training and inference time limit their generalization.

HeatViT: Hardware-Efficient Adaptive Token Pruning for Vision Transformers

no code implementations15 Nov 2022 Peiyan Dong, Mengshu Sun, Alec Lu, Yanyue Xie, Kenneth Liu, Zhenglun Kong, Xin Meng, Zhengang Li, Xue Lin, Zhenman Fang, Yanzhi Wang

While vision transformers (ViTs) have continuously achieved new milestones in the field of computer vision, their sophisticated network architectures with high computation and memory costs have impeded their deployment on resource-limited edge devices.

Quantization

Data Level Lottery Ticket Hypothesis for Vision Transformers

1 code implementation2 Nov 2022 Xuan Shen, Zhenglun Kong, Minghai Qin, Peiyan Dong, Geng Yuan, Xin Meng, Hao Tang, Xiaolong Ma, Yanzhi Wang

That is, there exists a subset of input image patches such that a ViT can be trained from scratch by using only this subset of patches and achieve similar accuracy to the ViTs trained by using all image patches.
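
A toy illustration of training on a subset of patches, with patch variance as a stand-in score (the paper identifies the winning subset with a ticket-finding procedure, not this heuristic):

    import torch

    def select_patches(images, patch=16, keep_ratio=0.5):
        """Split images into patches and keep the highest-scoring ones.

        images: (B, C, H, W). Returns all patches plus the indices of the
        kept subset, which would then be fed to the ViT during training.
        """
        B, C, H, W = images.shape
        patches = images.unfold(2, patch, patch).unfold(3, patch, patch)
        patches = patches.contiguous().view(B, C, -1, patch, patch)
        flat = patches.reshape(B, C, -1, patch * patch)
        scores = flat.var(dim=3).mean(dim=1)     # (B, num_patches) placeholder score
        k = int(scores.shape[1] * keep_ratio)
        idx = scores.topk(k, dim=1).indices      # winning subset of patch indices
        return patches, idx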

Analogical Similarity Informativeness

Advancing Model Pruning via Bi-level Optimization

1 code implementation8 Oct 2022 Yihua Zhang, Yuguang Yao, Parikshit Ram, Pu Zhao, Tianlong Chen, Mingyi Hong, Yanzhi Wang, Sijia Liu

To reduce the computation overhead, various efficient 'one-shot' pruning methods have been developed, but these schemes are usually unable to find winning tickets as good as IMP.

Layer Freezing & Data Sieving: Missing Pieces of a Generic Framework for Sparse Training

1 code implementation22 Sep 2022 Geng Yuan, Yanyu Li, Sheng Li, Zhenglun Kong, Sergey Tulyakov, Xulong Tang, Yanzhi Wang, Jian Ren

Therefore, we analyze the feasibility and potential of using the layer freezing technique in sparse training and find that it can save considerable training costs.

SparCL: Sparse Continual Learning on the Edge

1 code implementation20 Sep 2022 Zifeng Wang, Zheng Zhan, Yifan Gong, Geng Yuan, Wei Niu, Tong Jian, Bin Ren, Stratis Ioannidis, Yanzhi Wang, Jennifer Dy

SparCL achieves both training acceleration and accuracy preservation through the synergy of three aspects: weight sparsity, data efficiency, and gradient sparsity.

Continual Learning

PIM-QAT: Neural Network Quantization for Processing-In-Memory (PIM) Systems

no code implementations18 Sep 2022 Qing Jin, Zhiyu Chen, Jian Ren, Yanyu Li, Yanzhi Wang, Kaiyuan Yang

In this paper, we propose a method for training quantized networks to incorporate PIM quantization, which is ubiquitous to all PIM systems.

Quantization

Understanding Time Variations of DNN Inference in Autonomous Driving

no code implementations12 Sep 2022 Liangkai Liu, Yanzhi Wang, Weisong Shi

Understanding the time variations of the DNN inference becomes a fundamental challenge in real-time scheduling for autonomous driving.

Autonomous Driving Scheduling

Survey: Exploiting Data Redundancy for Optimization of Deep Learning

no code implementations29 Aug 2022 Jou-An Chen, Wei Niu, Bin Ren, Yanzhi Wang, Xipeng Shen

It surveys hundreds of recent papers on the topic, introduces a novel taxonomy to put the various techniques into a single categorization framework, offers a comprehensive description of the main methods used for exploiting data redundancy in improving multiple kinds of DNNs on data, and points out a set of research opportunities for future exploration.

Compiler-Aware Neural Architecture Search for On-Mobile Real-time Super-Resolution

1 code implementation25 Jul 2022 Yushu Wu, Yifan Gong, Pu Zhao, Yanyu Li, Zheng Zhan, Wei Niu, Hao Tang, Minghai Qin, Bin Ren, Yanzhi Wang

Instead of measuring the speed on mobile devices at each iteration during the search process, a speed model incorporated with compiler optimizations is leveraged to predict the inference latency of the SR block with various width configurations for faster convergence.
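
The speed-model idea can be sketched as a tiny predictor fitted once from a handful of on-device measurements and then queried during the search; the numbers and the linear form below are illustrative assumptions, not the paper's compiler-aware model:

    import numpy as np

    # Hypothetical measurements: width configurations -> measured latency (ms).
    widths = np.array([[16, 32, 64], [32, 32, 64], [32, 64, 64], [64, 64, 128]])
    latency_ms = np.array([3.1, 4.0, 5.2, 8.9])

    # Fit a linear speed model once; query it at every search iteration
    # instead of measuring on the phone each time.
    X = np.hstack([widths, np.ones((len(widths), 1))])
    coef, *_ = np.linalg.lstsq(X, latency_ms, rcond=None)

    def predict_latency(width_config):
        return np.append(width_config, 1.0) @ coef

    print(predict_latency([32, 48, 64]))  # fast estimate, no device in the loop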

Neural Architecture Search SSIM +1

Continual Few-Shot Learning with Adversarial Class Storage

no code implementations10 Jul 2022 Kun Wu, Chengxiang Yin, Jian Tang, Zhiyuan Xu, Yanzhi Wang, Dejun Yang

In this paper, we define a new problem called continual few-shot learning, in which tasks arrive sequentially and each task is associated with a few training samples.

continual few-shot learning Few-Shot Learning +1

Quantum Neural Network Compression

no code implementations4 Jul 2022 Zhirui Hu, Peiyan Dong, Zhepeng Wang, Youzuo Lin, Yanzhi Wang, Weiwen Jiang

Model compression, such as pruning and quantization, has been widely applied to optimize neural networks on resource-limited classical devices.

Neural Network Compression Quantization

CoCoPIE XGen: A Full-Stack AI-Oriented Optimizing Framework

no code implementations21 Jun 2022 Xiaofeng Li, Bin Ren, Xipeng Shen, Yanzhi Wang

There is a growing demand for shifting the delivery of AI capability from data centers on the cloud to edge or end devices, exemplified by the fast emerging real-time AI-based apps running on smartphones, AR/VR devices, autonomous vehicles, and various IoT devices.

Autonomous Vehicles

Understanding EFL Student Idea Generation Strategies for Creative Writing with NLG Tools

no code implementations4 Jun 2022 David James Woo, Yanzhi Wang, Hengky Susanto, Kai Guo

This study explores strategies adopted by EFL students when searching for ideas using NLG tools, evaluating ideas generated by NLG tools, and selecting NLG tools for idea generation.

Text Generation

Real-Time Portrait Stylization on the Edge

no code implementations2 Jun 2022 Yanyu Li, Xuan Shen, Geng Yuan, Jiexiong Guan, Wei Niu, Hao Tang, Bin Ren, Yanzhi Wang

In this work we demonstrate real-time portrait stylization, specifically, translating self-portrait into cartoon or anime style on mobile devices.

EfficientFormer: Vision Transformers at MobileNet Speed

9 code implementations2 Jun 2022 Yanyu Li, Geng Yuan, Yang Wen, Ju Hu, Georgios Evangelidis, Sergey Tulyakov, Yanzhi Wang, Jian Ren

Our work proves that properly designed transformers can reach extremely low latency on mobile devices while maintaining high performance.

Pruning-as-Search: Efficient Neural Architecture Search via Channel Pruning and Structural Reparameterization

1 code implementation2 Jun 2022 Yanyu Li, Pu Zhao, Geng Yuan, Xue Lin, Yanzhi Wang, Xin Chen

By combining the structural reparameterization and PaS, we successfully searched out a new family of VGG-like and lightweight networks, which enable the flexibility of arbitrary width with respect to each layer instead of each stage.

Instance Segmentation Network Pruning +2

Neural Network-based OFDM Receiver for Resource Constrained IoT Devices

no code implementations12 May 2022 Nasim Soltani, Hai Cheng, Mauro Belgiovine, Yanyu Li, Haoqing Li, Bahar Azari, Salvatore D'Oro, Tales Imbiriba, Tommaso Melodia, Pau Closas, Yanzhi Wang, Deniz Erdogmus, Kaushik Chowdhury

Here, ML blocks replace the individual processing blocks of an OFDM receiver, and we specifically describe this swapping for the legacy channel estimation, symbol demapping, and decoding blocks with Neural Networks (NNs).

Quantization

Deep neural network goes lighter: A case study of deep compression techniques on automatic RF modulation recognition for Beyond 5G networks

no code implementations9 Apr 2022 Anu Jagannath, Jithin Jagannath, Yanzhi Wang, Tommaso Melodia

Automatic RF modulation recognition is a primary signal intelligence (SIGINT) technique that serves as a physical layer authentication enabler and automated signal processing scheme for the beyond 5G and military networks.

F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization

1 code implementation ICLR 2022 Qing Jin, Jian Ren, Richard Zhuang, Sumant Hanumante, Zhengang Li, Zhiyu Chen, Yanzhi Wang, Kaiyuan Yang, Sergey Tulyakov

Our approach achieves comparable and better performance, when compared not only to existing quantization techniques with INT32 multiplication or floating-point arithmetic, but also to the full-precision counterparts, achieving state-of-the-art performance.
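
The fixed-point arithmetic itself can be sketched in a few lines: with a fractional length per tensor, every multiply becomes a plain integer product followed by a shift (a generic fixed-point illustration; F8Net's fractional-length selection follows the paper):

    import numpy as np

    def to_fixed(x, frac_bits):
        """Quantize to 8-bit fixed point with `frac_bits` fractional bits."""
        return np.clip(np.round(x * (1 << frac_bits)), -128, 127).astype(np.int32)

    a, b = 0.75, -1.25
    fa = fb = 5                                # fractional lengths (assumed here)
    prod = to_fixed(a, fa) * to_fixed(b, fb)   # integer-only multiplication
    result = prod / float(1 << (fa + fb))      # rescale: just a right shift
    print(result)                              # equals a * b = -0.9375 here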

Quantization

Coarsening the Granularity: Towards Structurally Sparse Lottery Tickets

1 code implementation9 Feb 2022 Tianlong Chen, Xuxi Chen, Xiaolong Ma, Yanzhi Wang, Zhangyang Wang

The lottery ticket hypothesis (LTH) has shown that dense models contain highly sparse subnetworks (i.e., winning tickets) that can be trained in isolation to match full accuracy.

AirNN: Neural Networks with Over-the-Air Convolution via Reconfigurable Intelligent Surfaces

no code implementations7 Feb 2022 Sara Garcia Sanchez, Guillem Reus Muns, Carlos Bocanegra, Yanyu Li, Ufuk Muncuk, Yousof Naderi, Yanzhi Wang, Stratis Ioannidis, Kaushik R. Chowdhury

In this paper, we design and implement the first-of-its-kind over-the-air convolution and demonstrate it for inference tasks in a convolutional neural network (CNN).

VAQF: Fully Automatic Software-Hardware Co-Design Framework for Low-Bit Vision Transformer

no code implementations17 Jan 2022 Mengshu Sun, Haoyu Ma, Guoliang Kang, Yifan Jiang, Tianlong Chen, Xiaolong Ma, Zhangyang Wang, Yanzhi Wang

To the best of our knowledge, this is the first time quantization has been incorporated into ViT acceleration on FPGAs with the help of a fully automatic framework to guide the quantization strategy on the software side and the accelerator implementations on the hardware side given the target frame rate.

Quantization

SPViT: Enabling Faster Vision Transformers via Soft Token Pruning

1 code implementation27 Dec 2021 Zhenglun Kong, Peiyan Dong, Xiaolong Ma, Xin Meng, Mengshu Sun, Wei Niu, Xuan Shen, Geng Yuan, Bin Ren, Minghai Qin, Hao Tang, Yanzhi Wang

Moreover, our framework can guarantee the identified model to meet resource specifications of mobile devices and FPGA, and even achieve the real-time execution of DeiT-T on mobile platforms.

Image Classification Model Compression

Compact Multi-level Sparse Neural Networks with Input Independent Dynamic Rerouting

no code implementations21 Dec 2021 Minghai Qin, Tianyun Zhang, Fei Sun, Yen-Kuang Chen, Makan Fardad, Yanzhi Wang, Yuan Xie

Deep neural networks (DNNs) have been shown to provide superb performance in many real-life applications, but their large computation cost and storage requirements have prevented them from being deployed to many edge and internet-of-things (IoT) devices.

Graph Attention

Automatic Mapping of the Best-Suited DNN Pruning Schemes for Real-Time Mobile Acceleration

no code implementations22 Nov 2021 Yifan Gong, Geng Yuan, Zheng Zhan, Wei Niu, Zhengang Li, Pu Zhao, Yuxuan Cai, Sijia Liu, Bin Ren, Xue Lin, Xulong Tang, Yanzhi Wang

Weight pruning is an effective model compression technique to tackle the challenges of achieving real-time deep neural network (DNN) inference on mobile devices.

Model Compression

RMSMP: A Novel Deep Neural Network Quantization Framework with Row-wise Mixed Schemes and Multiple Precisions

no code implementations ICCV 2021 Sung-En Chang, Yanyu Li, Mengshu Sun, Weiwen Jiang, Sijia Liu, Yanzhi Wang, Xue Lin

Specifically, this is the first effort to assign mixed quantization schemes and multiple precisions within layers -- among rows of the DNN weight matrix, for simplified operations in hardware inference, while preserving accuracy.
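
A sketch of the row-wise assignment, using each row's dynamic range as a hypothetical sensitivity proxy (the paper's assignment rule is more involved):

    import numpy as np

    def rowwise_mixed_quant(w, high_bits=8, low_bits=4, frac_high=0.25):
        """Quantize each row of a weight matrix at its own precision.

        Rows with the largest dynamic range (top `frac_high` fraction)
        get `high_bits`; the rest get `low_bits`. Returns dequantized
        weights so the error can be inspected.
        """
        n_high = max(1, int(len(w) * frac_high))
        high_rows = set(np.argsort(-np.ptp(w, axis=1))[:n_high])
        out = np.empty_like(w)
        for r, row in enumerate(w):
            qmax = 2 ** ((high_bits if r in high_rows else low_bits) - 1) - 1
            scale = np.abs(row).max() / qmax + 1e-12
            out[r] = np.clip(np.round(row / scale), -qmax, qmax) * scale
        return out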

Image Classification Quantization

ScaleCert: Scalable Certified Defense against Adversarial Patches with Sparse Superficial Layers

no code implementations NeurIPS 2021 Husheng Han, Kaidi Xu, Xing Hu, Xiaobing Chen, Ling Liang, Zidong Du, Qi Guo, Yanzhi Wang, Yunji Chen

Our experimental results show that the certified accuracy is increased from 36.3% (the state-of-the-art certified detection) to 60.4% on the ImageNet dataset, largely pushing the certified defenses for practical use.

MEST: Accurate and Fast Memory-Economic Sparse Training Framework on the Edge

1 code implementation NeurIPS 2021 Geng Yuan, Xiaolong Ma, Wei Niu, Zhengang Li, Zhenglun Kong, Ning Liu, Yifan Gong, Zheng Zhan, Chaoyang He, Qing Jin, Siyue Wang, Minghai Qin, Bin Ren, Yanzhi Wang, Sijia Liu, Xue Lin

Systematic evaluations of accuracy, training speed, and memory footprint are conducted, where the proposed MEST framework consistently outperforms representative SOTA works.

HFSP: A Hardware-friendly Soft Pruning Framework for Vision Transformers

no code implementations29 Sep 2021 Zhenglun Kong, Peiyan Dong, Xiaolong Ma, Xin Meng, Mengshu Sun, Wei Niu, Bin Ren, Minghai Qin, Hao Tang, Yanzhi Wang

Recently, the Vision Transformer (ViT) has continuously established new milestones in the computer vision field, while its high computation and memory cost make its deployment in industrial production difficult.

Image Classification Model Compression

Lottery Tickets can have Structural Sparsity

no code implementations29 Sep 2021 Tianlong Chen, Xuxi Chen, Xiaolong Ma, Yanzhi Wang, Zhangyang Wang

The lottery ticket hypothesis (LTH) has shown that dense models contain highly sparse subnetworks (i.e., winning tickets) that can be trained in isolation to match full accuracy.

DNNFusion: Accelerating Deep Neural Networks Execution with Advanced Operator Fusion

no code implementations30 Aug 2021 Wei Niu, Jiexiong Guan, Yanzhi Wang, Gagan Agrawal, Bin Ren

Deep Neural Networks (DNNs) have emerged as the core enabler of many major applications on mobile devices.

Code Generation

GRIM: A General, Real-Time Deep Learning Inference Framework for Mobile Devices based on Fine-Grained Structured Weight Sparsity

no code implementations25 Aug 2021 Wei Niu, Zhengang Li, Xiaolong Ma, Peiyan Dong, Gang Zhou, Xuehai Qian, Xue Lin, Yanzhi Wang, Bin Ren

It necessitates sparse model inference via weight pruning, i.e., DNN weight sparsity, and it is desirable to design a new DNN weight sparsity scheme that can facilitate real-time inference on mobile devices while preserving a high sparse model accuracy.

Code Generation Compiler Optimization

Achieving on-Mobile Real-Time Super-Resolution with Neural Architecture and Pruning Search

no code implementations ICCV 2021 Zheng Zhan, Yifan Gong, Pu Zhao, Geng Yuan, Wei Niu, Yushu Wu, Tianyun Zhang, Malith Jayaweera, David Kaeli, Bin Ren, Xue Lin, Yanzhi Wang

Though recent years have witnessed remarkable progress in single image super-resolution (SISR) tasks with the prosperous development of deep neural networks (DNNs), the deep learning methods are confronted with the computation and memory consumption issues in practice, especially for resource-limited platforms such as mobile devices.

Image Super-Resolution Neural Architecture Search +1

CAP-RAM: A Charge-Domain In-Memory Computing 6T-SRAM for Accurate and Precision-Programmable CNN Inference

no code implementations6 Jul 2021 Zhiyu Chen, Zhanghao Yu, Qing Jin, Yan He, Jingyu Wang, Sheng Lin, Dai Li, Yanzhi Wang, Kaiyuan Yang

A compact, accurate, and bitwidth-programmable in-memory computing (IMC) static random-access memory (SRAM) macro, named CAP-RAM, is presented for energy-efficient convolutional neural network (CNN) inference.

Sanity Checks for Lottery Tickets: Does Your Winning Ticket Really Win the Jackpot?

2 code implementations NeurIPS 2021 Xiaolong Ma, Geng Yuan, Xuan Shen, Tianlong Chen, Xuxi Chen, Xiaohan Chen, Ning Liu, Minghai Qin, Sijia Liu, Zhangyang Wang, Yanzhi Wang

Based on our analysis, we summarize a guideline for parameter settings in regard to specific architecture characteristics, which we hope will catalyze research progress on the topic of the lottery ticket hypothesis.

Effective Model Sparsification by Scheduled Grow-and-Prune Methods

1 code implementation ICLR 2022 Xiaolong Ma, Minghai Qin, Fei Sun, Zejiang Hou, Kun Yuan, Yi Xu, Yanzhi Wang, Yen-Kuang Chen, Rong Jin, Yuan Xie

It addresses the shortcomings of the previous works by repeatedly growing a subset of layers to dense and then pruning them back to sparse after some training.
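
One grow-and-prune cycle on a single weight tensor might look like the following sketch, with magnitude pruning as the prune step (the layer-selection schedule follows the paper):

    import numpy as np

    def prune_by_magnitude(w, sparsity):
        """Zero out the smallest-magnitude entries to reach `sparsity`."""
        k = int(w.size * sparsity)
        thresh = np.partition(np.abs(w).ravel(), k)[k]
        return np.where(np.abs(w) < thresh, 0.0, w)

    def grow_and_prune_cycle(w, sparsity, train_dense_step):
        """Grow the layer back to dense, train it, then prune back to sparse.

        train_dense_step is any callable that updates the dense weights;
        repeating this cycle over subsets of layers is the scheduled
        grow-and-prune idea described above.
        """
        w = train_dense_step(w)       # all weights free to move again
        return prune_by_magnitude(w, sparsity)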

Image Classification

FORMS: Fine-grained Polarized ReRAM-based In-situ Computation for Mixed-signal DNN Accelerator

no code implementations16 Jun 2021 Geng Yuan, Payman Behnam, Zhengang Li, Ali Shafiee, Sheng Lin, Xiaolong Ma, Hang Liu, Xuehai Qian, Mahdi Nazm Bojnordi, Yanzhi Wang, Caiwen Ding

With weights stored in the ReRAM crossbar cells as conductance, when the input vector is applied to word lines, the matrix-vector multiplication results can be generated as the current in bit lines.
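
In equation form this is Ohm's and Kirchhoff's laws: each bit-line current is I_j = sum_i G_ij V_i. A toy simulation, assuming the usual differential-pair mapping since physical conductances cannot be negative:

    import numpy as np

    def crossbar_matvec(weights, v_in, g_max=1e-4):
        """Simulate an analog matrix-vector multiply on a ReRAM crossbar.

        Signed weights are mapped onto two non-negative conductance arrays
        (positive/negative columns); v_in holds the word-line voltages, and
        bit-line currents accumulate as I = G^T @ V.
        """
        scale = g_max / np.abs(weights).max()
        g_pos = np.maximum(weights, 0) * scale    # cells for positive weights
        g_neg = np.maximum(-weights, 0) * scale   # cells for negative weights
        return (g_pos.T @ v_in - g_neg.T @ v_in) / scale

    W, x = np.random.randn(8, 4), np.random.randn(8)
    print(np.allclose(crossbar_matvec(W, x), W.T @ x))  # True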

Efficient Micro-Structured Weight Unification and Pruning for Neural Network Compression

no code implementations15 Jun 2021 Sheng Lin, Wei Jiang, Wei Wang, Kaidi Xu, Yanzhi Wang, Shan Liu, Songnan Li

Compressing Deep Neural Network (DNN) models to alleviate the storage and computation requirements is essential for practical applications, especially for resource limited devices.

Neural Network Compression

Towards Fast and Accurate Multi-Person Pose Estimation on Mobile Devices

no code implementations6 Jun 2021 Xuan Shen, Geng Yuan, Wei Niu, Xiaolong Ma, Jiexiong Guan, Zhengang Li, Bin Ren, Yanzhi Wang

The rapid development of autonomous driving, abnormal behavior detection, and behavior recognition makes an increasing demand for multi-person pose estimation-based applications, especially on mobile platforms.

Autonomous Driving Multi-Person Pose Estimation

A Compression-Compilation Framework for On-mobile Real-time BERT Applications

no code implementations30 May 2021 Wei Niu, Zhenglun Kong, Geng Yuan, Weiwen Jiang, Jiexiong Guan, Caiwen Ding, Pu Zhao, Sijia Liu, Bin Ren, Yanzhi Wang

In this paper, we propose a compression-compilation co-design framework that can guarantee the identified model to meet both resource and real-time specifications of mobile devices.

Question Answering Text Generation

Teachers Do More Than Teach: Compressing Image-to-Image Models

1 code implementation CVPR 2021 Qing Jin, Jian Ren, Oliver J. Woodford, Jiazhuo Wang, Geng Yuan, Yanzhi Wang, Sergey Tulyakov

In this work, we aim to address these issues by introducing a teacher network that provides a search space in which efficient network architectures can be found, in addition to performing knowledge distillation.

Knowledge Distillation

Lottery Ticket Preserves Weight Correlation: Is It Desirable or Not?

no code implementations19 Feb 2021 Ning Liu, Geng Yuan, Zhengping Che, Xuan Shen, Xiaolong Ma, Qing Jin, Jian Ren, Jian Tang, Sijia Liu, Yanzhi Wang

In deep model compression, the recent finding "Lottery Ticket Hypothesis" (LTH) (Frankle & Carbin, 2018) pointed out that there could exist a winning ticket (i.e., a properly pruned sub-network together with the original weight initialization) that can achieve performance competitive with the original dense network.

Model Compression

Learn-Prune-Share for Lifelong Learning

1 code implementation13 Dec 2020 Zifeng Wang, Tong Jian, Kaushik Chowdhury, Yanzhi Wang, Jennifer Dy, Stratis Ioannidis

In lifelong learning, we wish to maintain and update a model (e.g., a neural network classifier) in the presence of new classification tasks that arrive sequentially.

Mix and Match: A Novel FPGA-Centric Deep Neural Network Quantization Framework

no code implementations8 Dec 2020 Sung-En Chang, Yanyu Li, Mengshu Sun, Runbin Shi, Hayden K. -H. So, Xuehai Qian, Yanzhi Wang, Xue Lin

Unlike existing methods that use the same quantization scheme for all weights, we propose the first solution that applies different quantization schemes for different rows of the weight matrix.

Edge-computing Model Compression +1

ClickTrain: Efficient and Accurate End-to-End Deep Learning Training via Fine-Grained Architecture-Preserving Pruning

no code implementations20 Nov 2020 Chengming Zhang, Geng Yuan, Wei Niu, Jiannan Tian, Sian Jin, Donglin Zhuang, Zhe Jiang, Yanzhi Wang, Bin Ren, Shuaiwen Leon Song, Dingwen Tao

Moreover, compared with the state-of-the-art pruning-during-training approach, ClickTrain provides significant improvements in both accuracy and compression ratio on the tested CNN models and datasets, under similar limited training time.

DAIS: Automatic Channel Pruning via Differentiable Annealing Indicator Search

no code implementations4 Nov 2020 Yushuo Guan, Ning Liu, Pengyu Zhao, Zhengping Che, Kaigui Bian, Yanzhi Wang, Jian Tang

The convolutional neural network has achieved great success in fulfilling computer vision tasks, despite a large computation overhead that hinders efficient deployment.

Neural Architecture Search

Simultaneous Relevance and Diversity: A New Recommendation Inference Approach

no code implementations27 Sep 2020 Yifang Liu, Zhentao Xu, Qiyuan An, Yang Yi, Yanzhi Wang, Trevor Hastie

Heterogeneous inference achieves divergent relevance, where relevance and diversity support each other as two collaborating objectives in one recommendation model, and where recommendation diversity is an inherent outcome of the relevance inference process.

Collaborative Filtering Recommendation Systems

MSP: An FPGA-Specific Mixed-Scheme, Multi-Precision Deep Neural Network Quantization Framework

no code implementations16 Sep 2020 Sung-En Chang, Yanyu Li, Mengshu Sun, Weiwen Jiang, Runbin Shi, Xue Lin, Yanzhi Wang

To tackle the limited computing and storage resources in edge devices, model compression techniques have been widely used to trim deep neural network (DNN) models for on-device inference execution.

Edge-computing Image Denoising +2

Real-Time Execution of Large-scale Language Models on Mobile

no code implementations15 Sep 2020 Wei Niu, Zhenglun Kong, Geng Yuan, Weiwen Jiang, Jiexiong Guan, Caiwen Ding, Pu Zhao, Sijia Liu, Bin Ren, Yanzhi Wang

Our framework can guarantee the identified model to meet both resource and real-time specifications of mobile devices, thus achieving real-time execution of large transformer-based models like BERT variants.

Edge-computing

YOLObile: Real-Time Object Detection on Mobile Devices via Compression-Compilation Co-Design

3 code implementations12 Sep 2020 Yuxuan Cai, Hongjia Li, Geng Yuan, Wei Niu, Yanyu Li, Xulong Tang, Bin Ren, Yanzhi Wang

In this work, we propose the YOLObile framework for real-time object detection on mobile devices via compression-compilation co-design.

Computational Efficiency Object +2

AntiDote: Attention-based Dynamic Optimization for Neural Network Runtime Efficiency

no code implementations14 Aug 2020 Fuxun Yu, ChenChen Liu, Di Wang, Yanzhi Wang, Xiang Chen

Based on the neural network attention mechanism, we propose a comprehensive dynamic optimization framework including (1) testing-phase channel and column feature map pruning, as well as (2) training-phase optimization by targeted dropout.

One for Many: Transfer Learning for Building HVAC Control

no code implementations9 Aug 2020 Shichao Xu, Yi-Xuan Wang, Yanzhi Wang, Zheng O'Neill, Qi Zhu

Traditional HVAC control methods are typically based on creating explicit physical models of building thermal dynamics, which often require significant effort to develop and make it difficult to achieve sufficient accuracy and efficiency for runtime building control, as well as scalability for field implementations.

Transfer Learning

RT3D: Achieving Real-Time Execution of 3D Convolutional Neural Networks on Mobile Devices

no code implementations20 Jul 2020 Wei Niu, Mengshu Sun, Zhengang Li, Jou-An Chen, Jiexiong Guan, Xipeng Shen, Yanzhi Wang, Sijia Liu, Xue Lin, Bin Ren

The vanilla sparsity removes whole kernel groups, while KGS sparsity is a more fine-grained structured sparsity that enjoys higher flexibility while exploiting full on-device parallelism.

Code Generation Model Compression

Towards Real-Time DNN Inference on Mobile Platforms with Model Pruning and Compiler Optimization

no code implementations22 Apr 2020 Wei Niu, Pu Zhao, Zheng Zhan, Xue Lin, Yanzhi Wang, Bin Ren

High-end mobile platforms rapidly serve as primary computing devices for a wide range of Deep Neural Network (DNN) applications.

Compiler Optimization Style Transfer +1

A Unified DNN Weight Compression Framework Using Reweighted Optimization Methods

no code implementations12 Apr 2020 Tianyun Zhang, Xiaolong Ma, Zheng Zhan, Shanglin Zhou, Minghai Qin, Fei Sun, Yen-Kuang Chen, Caiwen Ding, Makan Fardad, Yanzhi Wang

To address the large model size and intensive computation requirement of deep neural networks (DNNs), weight pruning techniques have been proposed and generally fall into two categories, i.e., static regularization-based pruning and dynamic regularization-based pruning.

CoCoPIE: Making Mobile AI Sweet As PIE --Compression-Compilation Co-Design Goes a Long Way

no code implementations14 Mar 2020 Shaoshan Liu, Bin Ren, Xipeng Shen, Yanzhi Wang

Assuming hardware is the major constraint for enabling real-time mobile intelligence, the industry has mainly dedicated their efforts to developing specialized hardware accelerators for machine learning and inference.

A Privacy-Preserving-Oriented DNN Pruning and Mobile Acceleration Framework

no code implementations13 Mar 2020 Yifan Gong, Zheng Zhan, Zhengang Li, Wei Niu, Xiaolong Ma, Wenhao Wang, Bin Ren, Caiwen Ding, Xue Lin, Xiao-Lin Xu, Yanzhi Wang

Weight pruning of deep neural networks (DNNs) has been proposed to satisfy the limited storage and computing capability of mobile edge devices.

Model Compression Privacy Preserving

Efficient Training of Deep Convolutional Neural Networks by Augmentation in Embedding Space

no code implementations12 Feb 2020 Mohammad Saeed Abrishami, Amir Erfan Eshratifar, David Eigen, Yanzhi Wang, Shahin Nazarian, Massoud Pedram

However, fine-tuning a transfer model with data augmentation in the raw input space has a high computational cost to run the full network for every augmented input.

Data Augmentation Transfer Learning

SS-Auto: A Single-Shot, Automatic Structured Weight Pruning Framework of DNNs with Ultra-High Efficiency

no code implementations23 Jan 2020 Zhengang Li, Yifan Gong, Xiaolong Ma, Sijia Liu, Mengshu Sun, Zheng Zhan, Zhenglun Kong, Geng Yuan, Yanzhi Wang

Structured weight pruning is a representative model compression technique of DNNs for hardware efficiency and inference accelerations.

Model Compression

An Image Enhancing Pattern-based Sparsity for Real-time Inference on Mobile Devices

no code implementations ECCV 2020 Xiaolong Ma, Wei Niu, Tianyun Zhang, Sijia Liu, Sheng Lin, Hongjia Li, Xiang Chen, Jian Tang, Kaisheng Ma, Bin Ren, Yanzhi Wang

Weight pruning has been widely acknowledged as a straightforward and effective method to eliminate redundancy in Deep Neural Networks (DNN), thereby achieving acceleration on various platforms.

Code Generation Compiler Optimization

Embedding Compression with Isotropic Iterative Quantization

no code implementations11 Jan 2020 Siyu Liao, Jie Chen, Yanzhi Wang, Qinru Qiu, Bo Yuan

Continuous representation of words is a standard component in deep learning-based NLP models.

Image Retrieval Quantization +1

PatDNN: Achieving Real-Time DNN Execution on Mobile Devices with Pattern-based Weight Pruning

no code implementations1 Jan 2020 Wei Niu, Xiaolong Ma, Sheng Lin, Shihao Wang, Xuehai Qian, Xue Lin, Yanzhi Wang, Bin Ren

Weight pruning of DNNs is proposed, but existing schemes represent two extremes in the design space: non-structured pruning is fine-grained, accurate, but not hardware friendly; structured pruning is coarse-grained, hardware-efficient, but with higher accuracy loss.

Code Generation Model Compression

DARB: A Density-Aware Regular-Block Pruning for Deep Neural Networks

no code implementations19 Nov 2019 Ao Ren, Tao Zhang, Yuhao Wang, Sheng Lin, Peiyan Dong, Yen-Kuang Chen, Yuan Xie, Yanzhi Wang

As a further optimization, we propose a density-adaptive regular-block (DARB) pruning that outperforms prior structured pruning work with high pruning ratio and decoding efficiency.

Model Compression Network Pruning

Deep Compressed Pneumonia Detection for Low-Power Embedded Devices

no code implementations4 Nov 2019 Hongjia Li, Sheng Lin, Ning Liu, Caiwen Ding, Yanzhi Wang

Deep neural networks (DNNs) have been expanded into medical fields and have triggered a revolution in some medical applications by extracting complex features and achieving high accuracy and performance.

Pneumonia Detection

Adversarial T-shirt! Evading Person Detectors in A Physical World

1 code implementation ECCV 2020 Kaidi Xu, Gaoyuan Zhang, Sijia Liu, Quanfu Fan, Mengshu Sun, Hongge Chen, Pin-Yu Chen, Yanzhi Wang, Xue Lin

To the best of our knowledge, this is the first work that models the effect of deformation for designing physical adversarial examples with respect to non-rigid objects such as T-shirts.

SPEC2: SPECtral SParsE CNN Accelerator on FPGAs

no code implementations16 Oct 2019 Yue Niu, Hanqing Zeng, Ajitesh Srivastava, Kartik Lakhotia, Rajgopal Kannan, Yanzhi Wang, Viktor Prasanna

On the other hand, weight pruning techniques address the redundancy in model parameters by converting dense convolutional kernels into sparse ones.

REQ-YOLO: A Resource-Aware, Efficient Quantization Framework for Object Detection on FPGAs

no code implementations29 Sep 2019 Caiwen Ding, Shuo Wang, Ning Liu, Kaidi Xu, Yanzhi Wang, Yun Liang

To achieve real-time, highly-efficient implementations on FPGA, we present the detailed hardware implementation of block circulant matrices on CONV layers and develop an efficient processing element (PE) structure supporting the heterogeneous weight quantization, CONV dataflow and pipelining techniques, design optimization, and a template-based automatic synthesis framework to optimally exploit hardware resource.
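
The computational saving behind block circulant weights is that a circulant block's matrix-vector product is a circular convolution, computable with FFTs in O(n log n) instead of O(n^2); a minimal NumPy sketch:

    import numpy as np

    def circulant_matvec(c, x):
        """Multiply the circulant matrix with first column c by x via FFT."""
        return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

    # Check against the explicitly built circulant matrix.
    n = 8
    c, x = np.random.randn(n), np.random.randn(n)
    C = np.stack([np.roll(c, j) for j in range(n)], axis=1)  # column j = roll(c, j)
    print(np.allclose(C @ x, circulant_matvec(c, x)))        # True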

Model Compression object-detection +2

PCONV: The Missing but Desirable Sparsity in DNN Weight Pruning for Real-time Execution on Mobile Devices

no code implementations6 Sep 2019 Xiaolong Ma, Fu-Ming Guo, Wei Niu, Xue Lin, Jian Tang, Kaisheng Ma, Bin Ren, Yanzhi Wang

Model compression techniques on Deep Neural Network (DNN) have been widely acknowledged as an effective way to achieve acceleration on a variety of platforms, and DNN weight pruning is a straightforward and effective method.

Model Compression

An Ultra-Efficient Memristor-Based DNN Framework with Structured Weight Pruning and Quantization Using ADMM

no code implementations29 Aug 2019 Geng Yuan, Xiaolong Ma, Caiwen Ding, Sheng Lin, Tianyun Zhang, Zeinab S. Jalali, Yilong Zhao, Li Jiang, Sucheta Soundarajan, Yanzhi Wang

Memristor-based weight pruning and weight quantization have been separately investigated and proven effective in reducing area and power consumption compared to the original DNN model.

Quantization

Tiny but Accurate: A Pruned, Quantized and Optimized Memristor Crossbar Framework for Ultra Efficient DNN Implementation

no code implementations27 Aug 2019 Xiaolong Ma, Geng Yuan, Sheng Lin, Caiwen Ding, Fuxun Yu, Tao Liu, Wujie Wen, Xiang Chen, Yanzhi Wang

To mitigate the challenges, the memristor crossbar array has emerged as an intrinsically suitable matrix computation and low-power acceleration framework for DNN applications.

Model Compression Quantization

Protecting Neural Networks with Hierarchical Random Switching: Towards Better Robustness-Accuracy Trade-off for Stochastic Defenses

1 code implementation20 Aug 2019 Xiao Wang, Siyue Wang, Pin-Yu Chen, Yanzhi Wang, Brian Kulis, Xue Lin, Peter Chin

However, one critical drawback of current defenses is that the robustness enhancement is at the cost of noticeable performance degradation on legitimate data, e.g., large drop in test accuracy.

Adversarial Robustness

A Stochastic-Computing based Deep Learning Framework using Adiabatic Quantum-Flux-Parametron SuperconductingTechnology

no code implementations22 Jul 2019 Ruizhe Cai, Ao Ren, Olivia Chen, Ning Liu, Caiwen Ding, Xuehai Qian, Jie Han, Wenhui Luo, Nobuyuki Yoshikawa, Yanzhi Wang

Further, the application of SC to DNNs has been investigated in prior work, and its suitability has been illustrated, as SC is more compatible with approximate computation.

AutoCompress: An Automatic DNN Structured Pruning Framework for Ultra-High Compression Rates

no code implementations6 Jul 2019 Ning Liu, Xiaolong Ma, Zhiyuan Xu, Yanzhi Wang, Jian Tang, Jieping Ye

This work proposes AutoCompress, an automatic structured pruning framework with the following key performance improvements: (i) effectively incorporating the combination of structured pruning schemes in the automatic process; (ii) adopting state-of-the-art ADMM-based structured weight pruning as the core algorithm, with an innovative additional purification step for further weight reduction without accuracy loss; and (iii) developing an effective heuristic search method enhanced by experience-based guided search, replacing the prior deep reinforcement learning technique, which has an underlying incompatibility with the target pruning problem.

Model Compression

Non-Structured DNN Weight Pruning -- Is It Beneficial in Any Platform?

no code implementations3 Jul 2019 Xiaolong Ma, Sheng Lin, Shaokai Ye, Zhezhi He, Linfeng Zhang, Geng Yuan, Sia Huat Tan, Zhengang Li, Deliang Fan, Xuehai Qian, Xue Lin, Kaisheng Ma, Yanzhi Wang

Based on the proposed comparison framework, with the same accuracy and quantization, the results show that non-structured pruning is not competitive in terms of both storage and computation efficiency.

Model Compression Quantization

Robust Sparse Regularization: Simultaneously Optimizing Neural Network Robustness and Compactness

no code implementations30 May 2019 Adnan Siraj Rakin, Zhezhi He, Li Yang, Yanzhi Wang, Liqiang Wang, Deliang Fan

In this work, we show that shrinking the model size through proper weight pruning can even be helpful to improve the DNN robustness under adversarial attack.

Adversarial Attack

Fault Sneaking Attack: a Stealthy Framework for Misleading Deep Neural Networks

no code implementations28 May 2019 Pu Zhao, Siyue Wang, Cheng Gongye, Yanzhi Wang, Yunsi Fei, Xue Lin

Despite the great achievements of deep neural networks (DNNs), the vulnerability of state-of-the-art DNNs raises security concerns of DNNs in many application domains requiring high reliability. We propose the fault sneaking attack on DNNs, where the adversary aims to misclassify certain input images into any target labels by modifying the DNN parameters.

Brain-inspired reverse adversarial examples

no code implementations28 May 2019 Shaokai Ye, Sia Huat Tan, Kaidi Xu, Yanzhi Wang, Chenglong Bao, Kaisheng Ma

In contrast, current state-of-the-art deep learning approaches heavily depend on the variety of training samples and the capacity of the network.

Quantization

Interpreting and Evaluating Neural Network Robustness

no code implementations10 May 2019 Fuxun Yu, Zhuwei Qin, Chenchen Liu, Liang Zhao, Yanzhi Wang, Xiang Chen

Recently, adversarial deception has become one of the most serious threats to deep neural networks.

Adversarial Attack

26ms Inference Time for ResNet-50: Towards Real-Time Execution of all DNNs on Smartphone

no code implementations2 May 2019 Wei Niu, Xiaolong Ma, Yanzhi Wang, Bin Ren

With the rapid emergence of a spectrum of high-end mobile devices, many applications that formerly required desktop-level computation capability can now run on these devices without any problem.

Model Compression

Toward Extremely Low Bit and Lossless Accuracy in DNNs with Progressive ADMM

no code implementations2 May 2019 Sheng Lin, Xiaolong Ma, Shaokai Ye, Geng Yuan, Kaisheng Ma, Yanzhi Wang

Weight quantization is one of the most important techniques for Deep Neural Network (DNN) model compression.

Model Compression Quantization

ResNet Can Be Pruned 60x: Introducing Network Purification and Unused Path Removal (P-RM) after Weight Pruning

no code implementations30 Apr 2019 Xiaolong Ma, Geng Yuan, Sheng Lin, Zhengang Li, Hao Sun, Yanzhi Wang

State-of-the-art DNN structures involve high computation and a great demand for memory storage, which pose an intensive challenge on DNN framework resources.

Multi-Channel Attention Selection GAN with Cascaded Semantic Guidance for Cross-View Image Translation

3 code implementations CVPR 2019 Hao Tang, Dan Xu, Nicu Sebe, Yanzhi Wang, Jason J. Corso, Yan Yan

In this paper, we propose a novel approach named Multi-Channel Attention SelectionGAN (SelectionGAN) that makes it possible to generate images of natural scenes in arbitrary viewpoints, based on an image of the scene and a novel semantic map.

Bird View Synthesis Cross-View Image-to-Image Translation +1

Machine Vision Guided 3D Medical Image Compression for Efficient Transmission and Accurate Segmentation in the Clouds

no code implementations CVPR 2019 Zihao Liu, Xiaowei Xu, Tao Liu, Qi Liu, Yanzhi Wang, Yiyu Shi, Wujie Wen, Meiping Huang, Haiyun Yuan, Jian Zhuang

In this paper we will use deep learning based medical image segmentation as a vehicle and demonstrate that interestingly, machine and human view the compression quality differently.

Image Compression Image Segmentation +3

Adversarial Robustness vs Model Compression, or Both?

1 code implementation29 Mar 2019 Shaokai Ye, Kaidi Xu, Sijia Liu, Jan-Henrik Lambrechts, huan zhang, Aojun Zhou, Kaisheng Ma, Yanzhi Wang, Xue Lin

Furthermore, this work studies two hypotheses about weight pruning in the conventional setting and finds that weight pruning is essential for reducing the network model size in the adversarial setting; training a small model from scratch, even with initialization inherited from the large model, cannot achieve both adversarial robustness and high standard accuracy.

Adversarial Robustness Model Compression +1

Progressive DNN Compression: A Key to Achieve Ultra-High Weight Pruning and Quantization Rates using ADMM

2 code implementations23 Mar 2019 Shaokai Ye, Xiaoyu Feng, Tianyun Zhang, Xiaolong Ma, Sheng Lin, Zhengang Li, Kaidi Xu, Wujie Wen, Sijia Liu, Jian Tang, Makan Fardad, Xue Lin, Yongpan Liu, Yanzhi Wang

A recent work developed a systematic framework for DNN weight pruning using the advanced optimization technique ADMM (Alternating Direction Method of Multipliers), achieving some of the state-of-the-art weight pruning results.

Model Compression Quantization

CircConv: A Structured Convolution with Low Complexity

no code implementations28 Feb 2019 Siyu Liao, Zhe Li, Liang Zhao, Qinru Qiu, Yanzhi Wang, Bo Yuan

Deep neural networks (DNNs), especially deep convolutional neural networks (CNNs), have emerged as the powerful technique in various machine learning applications.

E-RNN: Design Optimization for Efficient Recurrent Neural Networks in FPGAs

no code implementations12 Dec 2018 Zhe Li, Caiwen Ding, Siyue Wang, Wujie Wen, Youwei Zhuo, Chang Liu, Qinru Qiu, Wenyao Xu, Xue Lin, Xuehai Qian, Yanzhi Wang

It is a challenging task to have real-time, efficient, and accurate hardware RNN implementations because of the high sensitivity to imprecision accumulation and the requirement of special activation function implementations.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Interpreting Adversarial Robustness: A View from Decision Surface in Input Space

no code implementations ICLR 2019 Fuxun Yu, ChenChen Liu, Yanzhi Wang, Liang Zhao, Xiang Chen

One popular hypothesis of neural network generalization is that the flat local minima of loss surface in parameter space leads to good generalization.

Adversarial Robustness

Structured Adversarial Attack: Towards General Implementation and Better Interpretability

1 code implementation ICLR 2019 Kaidi Xu, Sijia Liu, Pu Zhao, Pin-Yu Chen, huan zhang, Quanfu Fan, Deniz Erdogmus, Yanzhi Wang, Xue Lin

When generating adversarial examples to attack deep neural networks (DNNs), Lp norm of the added perturbation is usually used to measure the similarity between original image and adversarial example.

Adversarial Attack

StructADMM: A Systematic, High-Efficiency Framework of Structured Weight Pruning for DNNs

1 code implementation29 Jul 2018 Tianyun Zhang, Shaokai Ye, Kaiqi Zhang, Xiaolong Ma, Ning Liu, Linfeng Zhang, Jian Tang, Kaisheng Ma, Xue Lin, Makan Fardad, Yanzhi Wang

Without loss of accuracy on the AlexNet model, we achieve 2.58X and 3.65X average measured speedup on two GPUs, clearly outperforming the prior work.

Model Compression

Adversarial Meta-Learning

no code implementations8 Jun 2018 Chengxiang Yin, Jian Tang, Zhiyuan Xu, Yanzhi Wang

Meta-learning enables a model to learn from very limited data to undertake a new task.

Meta-Learning

Towards Robust Training of Neural Networks by Regularizing Adversarial Gradients

no code implementations23 May 2018 Fuxun Yu, Zirui Xu, Yanzhi Wang, ChenChen Liu, Xiang Chen

In recent years, neural networks have demonstrated outstanding effectiveness in a large number of applications. However, recent works have shown that neural networks are susceptible to adversarial examples, indicating possible flaws intrinsic to the network structures.

Towards Budget-Driven Hardware Optimization for Deep Convolutional Neural Networks using Stochastic Computing

no code implementations10 May 2018 Zhe Li, Ji Li, Ao Ren, Caiwen Ding, Jeffrey Draper, Qinru Qiu, Bo Yuan, Yanzhi Wang

Recently, Deep Convolutional Neural Network (DCNN) has achieved tremendous success in many machine learning applications.

Learning Topics using Semantic Locality

no code implementations11 Apr 2018 Ziyi Zhao, Krittaphat Pugdeethosapol, Sheng Lin, Zhe Li, Caiwen Ding, Yanzhi Wang, Qinru Qiu

Topic modeling discovers the latent topic probabilities of given text documents.

Topic Models

A Systematic DNN Weight Pruning Framework using Alternating Direction Method of Multipliers

3 code implementations ECCV 2018 Tianyun Zhang, Shaokai Ye, Kaiqi Zhang, Jian Tang, Wujie Wen, Makan Fardad, Yanzhi Wang

We first formulate the weight pruning problem of DNNs as a nonconvex optimization problem with combinatorial constraints specifying the sparsity requirements, and then adopt the ADMM framework for systematic weight pruning.
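
A minimal sketch of the resulting ADMM loop on one weight vector, with a toy quadratic loss (in the full framework the W-step is ordinary SGD training with the added penalty):

    import numpy as np

    def project_sparse(z, k):
        """Euclidean projection onto {at most k nonzeros}: keep top-k magnitudes."""
        out = np.zeros_like(z)
        idx = np.argsort(-np.abs(z))[:k]
        out[idx] = z[idx]
        return out

    def admm_prune(w, grad_fn, k, rho=1e-2, lr=1e-2, steps=500):
        """ADMM weight pruning: minimize f(W) s.t. W has at most k nonzeros."""
        z = project_sparse(w, k)            # auxiliary variable in the sparse set
        u = np.zeros_like(w)                # scaled dual variable
        for _ in range(steps):
            # W-step: descend on f(W) + (rho/2)||W - Z + U||^2
            w = w - lr * (grad_fn(w) + rho * (w - z + u))
            z = project_sparse(w + u, k)    # Z-step: the combinatorial projection
            u = u + w - z                   # dual update
        return project_sparse(w, k)         # final hard projection

    # Toy use: f(W) = ||W - target||^2.
    target = np.random.randn(50)
    w_sparse = admm_prune(np.random.randn(50), lambda w: 2 * (w - target), k=10)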

Image Classification Network Pruning

An ADMM-Based Universal Framework for Adversarial Attacks on Deep Neural Networks

no code implementations9 Apr 2018 Pu Zhao, Sijia Liu, Yanzhi Wang, Xue Lin

In the literature, the added distortions are usually measured by L0, L1, L2, and L infinity norms, namely, L0, L1, L2, and L infinity attacks, respectively.
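
For concreteness, the four distortion measures on a perturbation delta reduce to one line each (plain NumPy illustration):

    import numpy as np

    delta = np.array([0.0, 0.3, -0.1, 0.0, 0.2])   # example added perturbation

    l0   = np.count_nonzero(delta)                 # how many pixels changed
    l1   = np.abs(delta).sum()                     # total absolute change
    l2   = np.sqrt((delta ** 2).sum())             # Euclidean length
    linf = np.abs(delta).max()                     # largest single change
    print(l0, l1, l2, linf)                        # 3, 0.6, ~0.374, 0.3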

Adversarial Attack

Structured Weight Matrices-Based Hardware Accelerators in Deep Neural Networks: FPGAs and ASICs

no code implementations28 Mar 2018 Caiwen Ding, Ao Ren, Geng Yuan, Xiaolong Ma, Jiayu Li, Ning Liu, Bo Yuan, Yanzhi Wang

For FPGA implementations on deep convolutional neural networks (DCNNs), we achieve at least 152X and 72X improvement in performance and energy efficiency, respectively using the SWM-based framework, compared with the baseline of IBM TrueNorth processor under same accuracy constraints using the data set of MNIST, SVHN, and CIFAR-10.

Efficient Recurrent Neural Networks using Structured Matrices in FPGAs

no code implementations20 Mar 2018 Zhe Li, Shuo Wang, Caiwen Ding, Qinru Qiu, Yanzhi Wang, Yun Liang

Recurrent Neural Networks (RNNs) are becoming increasingly important for time series-related applications which require efficient and real-time implementations.

Model Compression Time Series +1

C3PO: Database and Benchmark for Early-stage Malicious Activity Detection in 3D Printing

no code implementations20 Mar 2018 Zhe Li, Xiaolong Ma, Hongjia Li, Qiyuan An, Aditya Singh Rathore, Qinru Qiu, Wenyao Xu, Yanzhi Wang

It is of vital importance to enable 3D printers to identify the objects to be printed, so that the manufacturing procedure of an illegal weapon can be terminated at the early stage.

Action Detection Activity Detection +1

Feature Distillation: DNN-Oriented JPEG Compression Against Adversarial Examples

2 code implementations CVPR 2019 Zihao Liu, Qi Liu, Tao Liu, Nuo Xu, Xue Lin, Yanzhi Wang, Wujie Wen

Image compression-based approaches for defending against the adversarial-example attacks, which threaten the safety use of deep neural networks (DNN), have been investigated recently.

Classification General Classification +2

C-LSTM: Enabling Efficient LSTM using Structured Compression Techniques on FPGAs

no code implementations14 Mar 2018 Shuo Wang, Zhe Li, Caiwen Ding, Bo Yuan, Yanzhi Wang, Qinru Qiu, Yun Liang

The previous work proposes to use a pruning-based compression technique to reduce the model size and thus speed up inference on FPGAs.

DeepN-JPEG: A Deep Neural Network Favorable JPEG-based Image Compression Framework

no code implementations14 Mar 2018 Zihao Liu, Tao Liu, Wujie Wen, Lei Jiang, Jie Xu, Yanzhi Wang, Gang Quan

To reduce the data storage and transfer overhead in smart resource-limited Internet-of-Thing (IoT) systems, effective data compression is a "must-have" feature before transferring real-time produced dataset for training or classification.

Data Compression General Classification +2

Model-Free Control for Distributed Stream Data Processing using Deep Reinforcement Learning

no code implementations2 Mar 2018 Teng Li, Zhiyuan Xu, Jian Tang, Yanzhi Wang

Specifically, we propose, for the first time, to leverage emerging Deep Reinforcement Learning (DRL) to enable model-free control in DSDPSs, and we present the design, implementation, and evaluation of a novel and highly effective DRL-based control framework that minimizes average end-to-end tuple processing time by jointly learning the system environment from very limited runtime statistics and making decisions under the guidance of powerful Deep Neural Networks.

reinforcement-learning Reinforcement Learning (RL) +1

Systematic Weight Pruning of DNNs using Alternating Direction Method of Multipliers

1 code implementation15 Feb 2018 Tianyun Zhang, Shaokai Ye, Yi-Peng Zhang, Yanzhi Wang, Makan Fardad

We present a systematic weight pruning framework of deep neural networks (DNNs) using the alternating direction method of multipliers (ADMM).

Computational Efficiency

Image Dataset for Visual Objects Classification in 3D Printing

no code implementations15 Feb 2018 Hongjia Li, Xiaolong Ma, Aditya Singh Rathore, Zhe Li, Qiyuan An, Chen Song, Wenyao Xu, Yanzhi Wang

The rapid development in additive manufacturing (AM), also known as 3D printing, has brought about potential risk and security issues along with significant benefits.

Classification General Classification

Security Analysis and Enhancement of Model Compressed Deep Learning Systems under Adversarial Attacks

no code implementations14 Feb 2018 Qi Liu, Tao Liu, Zihao Liu, Yanzhi Wang, Yier Jin, Wujie Wen

In this work, we for the first time investigate the multi-factor adversarial attack problem in practical model-optimized deep learning systems by jointly considering DNN model-reshaping (e.g., HashNet-based deep compression) and input perturbations.

Adversarial Attack

An Area and Energy Efficient Design of Domain-Wall Memory-Based Deep Convolutional Neural Networks using Stochastic Computing

no code implementations3 Feb 2018 Xiaolong Ma, Yi-Peng Zhang, Geng Yuan, Ao Ren, Zhe Li, Jie Han, Jingtong Hu, Yanzhi Wang

However, in these works, the memory design optimization is neglected for weight storage, which will inevitably result in large hardware cost.

Deep Reinforcement Learning for Dynamic Treatment Regimes on Medical Registry Data

no code implementations28 Jan 2018 Ning Liu, Ying Liu, Brent Logan, Zhiyuan Xu, Jian Tang, Yanzhi Wang

This paper presents the first deep reinforcement learning (DRL) framework to estimate the optimal Dynamic Treatment Regimes from observational medical data.

reinforcement-learning Reinforcement Learning (RL)

Experience-driven Networking: A Deep Reinforcement Learning based Approach

no code implementations17 Jan 2018 Zhiyuan Xu, Jian Tang, Jingsong Meng, Weiyi Zhang, Yanzhi Wang, Chi Harold Liu, Dejun Yang

Modern communication networks have become very complicated and highly dynamic, which makes them hard to model, predict and control.

Continuous Control reinforcement-learning +1

FFT-Based Deep Learning Deployment in Embedded Systems

no code implementations13 Dec 2017 Sheng Lin, Ning Liu, Mahdi Nazemi, Hongjia Li, Caiwen Ding, Yanzhi Wang, Massoud Pedram

The large model size of DNNs, while providing excellent accuracy, also burdens the embedded platforms with intensive computation and storage.

speech-recognition Speech Recognition

A Memristor-Based Optimization Framework for AI Applications

no code implementations18 Oct 2017 Sijia Liu, Yanzhi Wang, Makan Fardad, Pramod K. Varshney

In addition to ADMM, implementation of a customized power iteration (PI) method for eigenvalue/eigenvector computation using memristor crossbars is discussed.

BIG-bench Machine Learning

Deep Reinforcement Learning: Framework, Applications, and Embedded Implementations

no code implementations10 Oct 2017 Hongjia Li, Tianshu Wei, Ao Ren, Qi Zhu, Yanzhi Wang

The recent breakthroughs of the deep reinforcement learning (DRL) technique in AlphaGo and playing Atari have set a good example in handling the large state and action spaces of complicated control problems.

Cloud Computing Q-Learning +3

A Hierarchical Framework of Cloud Resource Allocation and Power Management Using Deep Reinforcement Learning

no code implementations13 Mar 2017 Ning Liu, Zhe Li, Zhiyuan Xu, Jielong Xu, Sheng Lin, Qinru Qiu, Jian Tang, Yanzhi Wang

Automatic decision-making approaches, such as reinforcement learning (RL), have been applied to (partially) solve the resource allocation problem adaptively in the cloud computing system.

Cloud Computing Decision Making +3

Hardware-Driven Nonlinear Activation for Stochastic Computing Based Deep Convolutional Neural Networks

no code implementations12 Mar 2017 Ji Li, Zihao Yuan, Zhe Li, Caiwen Ding, Ao Ren, Qinru Qiu, Jeffrey Draper, Yanzhi Wang

Recently, Deep Convolutional Neural Networks (DCNNs) have made unprecedented progress, achieving the accuracy close to, or even better than human-level perception in various tasks.

Theoretical Properties for Neural Networks with Weight Matrices of Low Displacement Rank

no code implementations ICML 2017 Liang Zhao, Siyu Liao, Yanzhi Wang, Zhe Li, Jian Tang, Victor Pan, Bo Yuan

Recently low displacement rank (LDR) matrices, or so-called structured matrices, have been proposed to compress large-scale neural networks.

SC-DCNN: Highly-Scalable Deep Convolutional Neural Network using Stochastic Computing

no code implementations18 Nov 2016 Ao Ren, Ji Li, Zhe Li, Caiwen Ding, Xuehai Qian, Qinru Qiu, Bo Yuan, Yanzhi Wang

Stochastic Computing (SC), which uses a bit-stream to represent a number within [-1, 1] by counting the number of ones in the bit-stream, has a high potential for implementing DCNNs with high scalability and ultra-low hardware footprint.
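
A toy demonstration of this encoding: in the bipolar format a value x in [-1, 1] maps to P(bit = 1) = (x + 1) / 2, so multiplying two independent streams reduces to one XNOR gate per bit pair:

    import numpy as np

    rng = np.random.default_rng(0)

    def to_stream(x, n_bits=4096):
        """Bipolar SC encoding: value x in [-1, 1] -> random bit-stream."""
        return (rng.random(n_bits) < (x + 1) / 2).astype(np.uint8)

    def from_stream(bits):
        """Decode by counting ones and mapping the fraction back to [-1, 1]."""
        return 2 * bits.mean() - 1

    a, b = 0.5, -0.8
    prod = np.logical_not(np.logical_xor(to_stream(a), to_stream(b)))
    print(from_stream(prod.astype(np.uint8)))   # approximately a * b = -0.4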
