Search Results for author: Sheng Lin

Found 24 papers, 3 papers with code

Progressive DNN Compression: A Key to Achieve Ultra-High Weight Pruning and Quantization Rates using ADMM

2 code implementations23 Mar 2019 Shaokai Ye, Xiaoyu Feng, Tianyun Zhang, Xiaolong Ma, Sheng Lin, Zhengang Li, Kaidi Xu, Wujie Wen, Sijia Liu, Jian Tang, Makan Fardad, Xue Lin, Yongpan Liu, Yanzhi Wang

A recent work developed a systematic framework of DNN weight pruning using the advanced optimization technique ADMM (Alternating Direction Method of Multipliers), achieving state-of-the-art weight pruning results.

Model Compression Quantization
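
ADMM-based pruning alternates between a gradient step on the task loss and a Euclidean projection onto the sparsity constraint. A minimal NumPy sketch of one such iteration, assuming simple magnitude-based projection; the paper's progressive, multi-step framework is not reproduced here.

```python
import numpy as np

def project_sparse(w, keep_ratio):
    """Euclidean projection onto {w : at most k nonzeros}: keep the
    largest-magnitude entries, zero the rest."""
    k = max(1, int(keep_ratio * w.size))
    out = np.zeros_like(w)
    idx = np.argpartition(np.abs(w).ravel(), -k)[-k:]
    out.ravel()[idx] = w.ravel()[idx]
    return out

def admm_prune_step(w, z, u, grad_fn, lr=0.01, rho=1e-3, keep_ratio=0.1):
    """One ADMM iteration for: minimize loss(W) s.t. W sparse.
    grad_fn(w) returns the gradient of the task loss at w."""
    # W-update: gradient step on loss + (rho/2)||W - Z + U||^2
    w = w - lr * (grad_fn(w) + rho * (w - z + u))
    # Z-update: projection onto the sparsity constraint set
    z = project_sparse(w + u, keep_ratio)
    # Dual update
    u = u + w - z
    return w, z, u

# Toy usage: sparsify a weight vector fit to a quadratic loss.
rng = np.random.default_rng(0)
w, target = rng.normal(size=100), rng.normal(size=100)
grad_fn = lambda w: w - target          # gradient of 0.5*||w - target||^2
z, u = project_sparse(w, 0.1), np.zeros_like(w)
for _ in range(200):
    w, z, u = admm_prune_step(w, z, u, grad_fn)
w = project_sparse(w, 0.1)              # final hard prune
print("nonzeros:", np.count_nonzero(w))
```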

Learning Dynamic Context Augmentation for Global Entity Linking

2 code implementations IJCNLP 2019 Xiyuan Yang, Xiaotao Gu, Sheng Lin, Siliang Tang, Yueting Zhuang, Fei Wu, Zhigang Chen, Guoping Hu, Xiang Ren

Despite the recent success of collective entity linking (EL) methods, these "global" inference methods may yield sub-optimal results when the "all-mention coherence" assumption breaks, and often suffer from high computational cost at the inference stage due to the complex search space.

Entity Disambiguation Entity Linking +1

Learning Topics using Semantic Locality

no code implementations11 Apr 2018 Ziyi Zhao, Krittaphat Pugdeethosapol, Sheng Lin, Zhe Li, Caiwen Ding, Yanzhi Wang, Qinru Qiu

Topic modeling discovers the latent topic probabilities of given text documents.

Topic Models
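
For context, the "latent topic probability" a topic model produces can be seen with plain LDA; a minimal scikit-learn example (not the paper's semantic-locality method):

```python
# Standard LDA recovering per-document topic probabilities; the paper's
# semantic-locality augmentation is not shown here.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "neural network training with gradient descent",
    "weight pruning compresses deep neural networks",
    "stock market prices and financial trading",
    "interest rates move financial markets",
]
counts = CountVectorizer().fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topic = lda.fit_transform(counts)   # each row sums to 1: topic probabilities
print(doc_topic.round(2))
```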

FFT-Based Deep Learning Deployment in Embedded Systems

no code implementations13 Dec 2017 Sheng Lin, Ning Liu, Mahdi Nazemi, Hongjia Li, Caiwen Ding, Yanzhi Wang, Massoud Pedram

The large model size of DNNs, while providing excellent accuracy, also burdens embedded platforms with intensive computation and storage demands.

Speech Recognition
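
FFT-based deployment rests on structured weight matrices: a circulant matrix-vector product reduces to element-wise multiplication in the frequency domain, cutting storage from n^2 to n values and compute to O(n log n). A minimal NumPy/SciPy sketch of that primitive, assuming a single circulant block (the paper's full deployment scheme is omitted):

```python
import numpy as np
from scipy.linalg import circulant

def circulant_matvec_fft(c, x):
    """y = C @ x where C is the circulant matrix with first column c,
    computed in O(n log n) via the convolution theorem."""
    return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

rng = np.random.default_rng(0)
c, x = rng.normal(size=8), rng.normal(size=8)
# Sanity check against the dense O(n^2) product.
assert np.allclose(circulant(c) @ x, circulant_matvec_fft(c, x))
```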

A Hierarchical Framework of Cloud Resource Allocation and Power Management Using Deep Reinforcement Learning

no code implementations13 Mar 2017 Ning Liu, Zhe Li, Zhiyuan Xu, Jielong Xu, Sheng Lin, Qinru Qiu, Jian Tang, Yanzhi Wang

Automatic decision-making approaches, such as reinforcement learning (RL), have been applied to (partially) solve the resource allocation problem adaptively in the cloud computing system.

Cloud Computing Decision Making +3
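
As a toy illustration of RL applied to resource allocation, a minimal tabular Q-learning loop with made-up dynamics and rewards; the paper's deep hierarchical framework is far richer.

```python
import numpy as np

# Toy resource-allocation MDP: state = number of busy servers (0..N),
# action = allocate the job to a server (1) or queue it (0).
# The dynamics and rewards below are purely illustrative assumptions.
N, alpha, gamma, eps = 5, 0.1, 0.9, 0.1
Q = np.zeros((N + 1, 2))
rng = np.random.default_rng(0)

def step(state, action):
    if action == 1 and state < N:
        return state + 1, -1.0            # power cost of an active server
    return max(state - 1, 0), -2.0        # latency penalty while a job drains

state = 0
for _ in range(5000):
    # Epsilon-greedy action selection.
    action = rng.integers(2) if rng.random() < eps else int(np.argmax(Q[state]))
    nxt, reward = step(state, action)
    # Standard Q-learning update.
    Q[state, action] += alpha * (reward + gamma * Q[nxt].max() - Q[state, action])
    state = nxt
print(Q.round(2))
```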

ResNet Can Be Pruned 60x: Introducing Network Purification and Unused Path Removal (P-RM) after Weight Pruning

no code implementations30 Apr 2019 Xiaolong Ma, Geng Yuan, Sheng Lin, Zhengang Li, Hao Sun, Yanzhi Wang

State-of-the-art DNN structures involve high computation and a great demand for memory storage, which poses an intense challenge to DNN framework resources.
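
One ingredient the title suggests, removing paths left dead by pruning, can be sketched as detecting convolution filters whose weights are entirely zero and dropping the corresponding output channels. A hedged NumPy sketch; the paper's P-RM procedure is assumed to be more involved.

```python
import numpy as np

def find_dead_filters(conv_w, tol=0.0):
    """conv_w: (out_ch, in_ch, kh, kw) weight tensor after pruning.
    Returns indices of output channels whose filters are entirely
    (near-)zero and can be removed along with their downstream inputs."""
    per_filter_mag = np.abs(conv_w).reshape(conv_w.shape[0], -1).max(axis=1)
    return np.where(per_filter_mag <= tol)[0]

w = np.random.default_rng(0).normal(size=(8, 4, 3, 3))
w[[2, 5]] = 0.0                         # pretend pruning zeroed two filters
print(find_dead_filters(w))             # -> [2 5]
```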

Toward Extremely Low Bit and Lossless Accuracy in DNNs with Progressive ADMM

no code implementations2 May 2019 Sheng Lin, Xiaolong Ma, Shaokai Ye, Geng Yuan, Kaisheng Ma, Yanzhi Wang

Weight quantization is one of the most important techniques for Deep Neural Network (DNN) model compression.

Model Compression Quantization
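
The projection at the heart of ADMM-style weight quantization maps each weight to the nearest value on a low-bit grid. A minimal sketch of that step, assuming a symmetric uniform grid and bits >= 2; the paper's progressive schedule is not reproduced.

```python
import numpy as np

def quantize_uniform(w, bits):
    """Round each weight to the nearest level of a symmetric uniform grid;
    this nearest-level rounding is the Euclidean projection used in the
    Z-update of ADMM-based quantization. Assumes bits >= 2 and w != 0."""
    n = 2 ** (bits - 1) - 1              # e.g. bits=2 -> levels {-1, 0, 1}
    scale = np.abs(w).max() / n
    return np.clip(np.round(w / scale), -n, n) * scale

w = np.random.default_rng(0).normal(size=6)
print(quantize_uniform(w, bits=2))       # only a handful of distinct values
```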

Non-Structured DNN Weight Pruning -- Is It Beneficial in Any Platform?

no code implementations3 Jul 2019 Xiaolong Ma, Sheng Lin, Shaokai Ye, Zhezhi He, Linfeng Zhang, Geng Yuan, Sia Huat Tan, Zhengang Li, Deliang Fan, Xuehai Qian, Xue Lin, Kaisheng Ma, Yanzhi Wang

Based on the proposed comparison framework, with the same accuracy and quantization, the results show that non-structured pruning is not competitive in either storage or computation efficiency.

Model Compression Quantization
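
The storage side of this comparison can be made concrete: non-structured sparsity stores an index per surviving weight, while structured (e.g., whole-row) sparsity needs almost no indexing. A back-of-the-envelope sketch with assumed 16-bit weights and indices (illustrative numbers only):

```python
# Storage comparison at the same ~90% sparsity, assuming 16-bit weights
# and 16-bit indices. Numbers are illustrative, not from the paper.
rows, cols, keep = 512, 512, 0.10
nnz = int(rows * cols * keep)

nonstructured = nnz * (16 + 16)                    # each weight carries an index
structured = int(rows * keep) * cols * 16 + rows   # whole rows + 1-bit row mask
dense = rows * cols * 16

print(f"dense:          {dense} bits")
print(f"non-structured: {nonstructured} bits")
print(f"structured:     {structured} bits")
```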

Tiny but Accurate: A Pruned, Quantized and Optimized Memristor Crossbar Framework for Ultra Efficient DNN Implementation

no code implementations27 Aug 2019 Xiaolong Ma, Geng Yuan, Sheng Lin, Caiwen Ding, Fuxun Yu, Tao Liu, Wujie Wen, Xiang Chen, Yanzhi Wang

To mitigate the challenges, the memristor crossbar array has emerged as an intrinsically suitable matrix computation and low-power acceleration framework for DNN applications.

Model Compression Quantization
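
The crossbar's analog compute reduces to one line: with weights programmed as conductances G and inputs applied as word-line voltages v, Ohm's law and Kirchhoff's current law make the bit-line currents a matrix-vector product. An idealized NumPy sketch with no device non-idealities:

```python
import numpy as np

# Idealized memristor crossbar: weights stored as conductances G (siemens),
# inputs applied as word-line voltages v (volts). Each bit line sums the
# currents of its cells, so the current vector is a matvec.
rng = np.random.default_rng(0)
G = rng.uniform(1e-6, 1e-4, size=(4, 3))   # 4 word lines x 3 bit lines
v = rng.uniform(0.0, 0.2, size=4)
i_bitlines = v @ G                          # i_j = sum_k v_k * G[k, j]
print(i_bitlines)
```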

An Ultra-Efficient Memristor-Based DNN Framework with Structured Weight Pruning and Quantization Using ADMM

no code implementations29 Aug 2019 Geng Yuan, Xiaolong Ma, Caiwen Ding, Sheng Lin, Tianyun Zhang, Zeinab S. Jalali, Yilong Zhao, Li Jiang, Sucheta Soundarajan, Yanzhi Wang

Memristor-based weight pruning and weight quantization have been separately investigated and proven effective in reducing area and power consumption compared to the original DNN model.

Quantization

Deep Compressed Pneumonia Detection for Low-Power Embedded Devices

no code implementations4 Nov 2019 Hongjia Li, Sheng Lin, Ning Liu, Caiwen Ding, Yanzhi Wang

Deep neural networks (DNNs) have expanded into medical fields and triggered a revolution in several medical applications by extracting complex features and achieving high accuracy and performance.

Pneumonia Detection

DARB: A Density-Aware Regular-Block Pruning for Deep Neural Networks

no code implementations19 Nov 2019 Ao Ren, Tao Zhang, Yuhao Wang, Sheng Lin, Peiyan Dong, Yen-Kuang Chen, Yuan Xie, Yanzhi Wang

As a further optimization, we propose density-adaptive regular-block (DARB) pruning, which outperforms prior structured pruning work in both pruning ratio and decoding efficiency.

Model Compression Network Pruning
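
Regular-block pruning in its simplest form splits each weight row into fixed-size blocks and zeroes the lowest-magnitude ones. A minimal sketch of that baseline; DARB's density-aware, per-row adaptation is not reproduced.

```python
import numpy as np

def block_prune_row(row, block, keep_ratio):
    """Regular-block pruning of one weight row: zero out whole fixed-size
    blocks, keeping the blocks with the largest L1 magnitude."""
    assert row.size % block == 0
    blocks = row.reshape(-1, block).copy()
    k = max(1, int(keep_ratio * blocks.shape[0]))
    scores = np.abs(blocks).sum(axis=1)
    dead = np.argsort(scores)[:-k]          # all but the top-k blocks
    blocks[dead] = 0.0
    return blocks.ravel()

row = np.random.default_rng(0).normal(size=16)
print(block_prune_row(row, block=4, keep_ratio=0.5))
```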

A SOT-MRAM-based Processing-In-Memory Engine for Highly Compressed DNN Implementation

no code implementations24 Nov 2019 Geng Yuan, Xiaolong Ma, Sheng Lin, Zhengang Li, Caiwen Ding

Thus, the footprint and power consumption of SOT-MRAM PIM can be reduced while increasing overall system throughput, making our proposed ADMM-based SOT-MRAM PIM more energy-efficient and suitable for embedded systems or IoT devices.

Model Compression Quantization

PatDNN: Achieving Real-Time DNN Execution on Mobile Devices with Pattern-based Weight Pruning

no code implementations1 Jan 2020 Wei Niu, Xiaolong Ma, Sheng Lin, Shihao Wang, Xuehai Qian, Xue Lin, Yanzhi Wang, Bin Ren

Weight pruning of DNNs has been proposed, but existing schemes represent two extremes in the design space: non-structured pruning is fine-grained and accurate but not hardware-friendly; structured pruning is coarse-grained and hardware-efficient but incurs higher accuracy loss.

Code Generation Model Compression
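
Pattern-based pruning occupies the middle ground between these extremes: every kernel keeps the same number of entries, but only in shapes drawn from a small pattern library, so the result stays fine-grained yet compiler-friendly. A hedged sketch with a made-up two-pattern library; the paper's actual pattern set and selection procedure are not shown.

```python
import numpy as np

# Hypothetical library of 3x3 patterns (1 = keep); the paper derives its
# own pattern set, these two are for illustration only.
PATTERNS = np.array([
    [[0, 1, 0],
     [1, 1, 1],
     [0, 1, 0]],
    [[1, 0, 1],
     [0, 1, 0],
     [1, 0, 1]],
], dtype=float)

def pattern_prune(kernel):
    """Apply whichever pattern preserves the most kernel magnitude."""
    scores = [np.abs(kernel * p).sum() for p in PATTERNS]
    return kernel * PATTERNS[int(np.argmax(scores))]

k = np.random.default_rng(0).normal(size=(3, 3))
print(pattern_prune(k))
```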

An Image Enhancing Pattern-based Sparsity for Real-time Inference on Mobile Devices

no code implementations ECCV 2020 Xiaolong Ma, Wei Niu, Tianyun Zhang, Sijia Liu, Sheng Lin, Hongjia Li, Xiang Chen, Jian Tang, Kaisheng Ma, Bin Ren, Yanzhi Wang

Weight pruning has been widely acknowledged as a straightforward and effective method to eliminate redundancy in Deep Neural Networks (DNNs), thereby achieving acceleration on various platforms.

Code Generation Compiler Optimization

Efficient Micro-Structured Weight Unification and Pruning for Neural Network Compression

no code implementations15 Jun 2021 Sheng Lin, Wei Jiang, Wei Wang, Kaidi Xu, Yanzhi Wang, Shan Liu, Songnan Li

Compressing Deep Neural Network (DNN) models to alleviate storage and computation requirements is essential for practical applications, especially on resource-limited devices.

Neural Network Compression
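
One plausible reading of "weight unification" is forcing the weights inside each micro-block to share a single magnitude while keeping their signs, so a block stores one scalar plus sign bits. A hedged NumPy sketch of that interpretation only, not the paper's exact rule:

```python
import numpy as np

def unify_block(block):
    """Replace every weight in the block with the block's mean magnitude,
    keeping each weight's sign; the block then stores one magnitude plus
    a sign bit per entry. An assumed, illustrative unification rule."""
    return np.sign(block) * np.abs(block).mean()

w = np.random.default_rng(0).normal(size=(2, 4))
print(unify_block(w))
```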

FORMS: Fine-grained Polarized ReRAM-based In-situ Computation for Mixed-signal DNN Accelerator

no code implementations16 Jun 2021 Geng Yuan, Payman Behnam, Zhengang Li, Ali Shafiee, Sheng Lin, Xiaolong Ma, Hang Liu, Xuehai Qian, Mahdi Nazm Bojnordi, Yanzhi Wang, Caiwen Ding

With weights stored in the ReRAM crossbar cells as conductance, when the input vector is applied to word lines, the matrix-vector multiplication results can be generated as the current in bit lines.
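
Since conductances are non-negative, signed weights are commonly realized as a polarized pair, W = G+ - G-, with the output taken as a current difference; the paper's fine-grained polarization presumably departs from this baseline. An idealized sketch of the common two-array scheme:

```python
import numpy as np

# Signed weights on non-negative conductances: split W into positive and
# negative parts, program each on its own crossbar (or column group), and
# subtract the resulting bit-line currents. Idealized, no device effects.
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))
G_pos, G_neg = np.maximum(W, 0), np.maximum(-W, 0)
v = rng.uniform(0.0, 0.2, size=4)
i_out = v @ G_pos - v @ G_neg            # equals v @ W
assert np.allclose(i_out, v @ W)
```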

CAP-RAM: A Charge-Domain In-Memory Computing 6T-SRAM for Accurate and Precision-Programmable CNN Inference

no code implementations6 Jul 2021 Zhiyu Chen, Zhanghao Yu, Qing Jin, Yan He, Jingyu Wang, Sheng Lin, Dai Li, Yanzhi Wang, Kaiyuan Yang

A compact, accurate, and bitwidth-programmable in-memory computing (IMC) static random-access memory (SRAM) macro, named CAP-RAM, is presented for energy-efficient convolutional neural network (CNN) inference.

FAIVConf: Face enhancement for AI-based Video Conference with Low Bit-rate

no code implementations8 Jul 2022 Zhengang Li, Sheng Lin, Shan Liu, Songnan Li, Xue Lin, Wei Wang, Wei Jiang

High-quality video conferencing with fewer transmission bits has recently become a popular and challenging problem.

Face Generation Face Swapping +1

Towards Zero Memory Footprint Spiking Neural Network Training

no code implementations16 Aug 2023 Bin Lei, Sheng Lin, Pei-Hung Lin, Chunhua Liao, Caiwen Ding

Our design is able to achieve a $\mathbf{58.65\times}$ reduction in memory usage compared to the current SNN node.
