Search Results for author: Mohamed M. Sabry Aly

Found 5 papers, 0 papers with code

OPQ: Compressing Deep Neural Networks with One-shot Pruning-Quantization

no code implementations • 23 May 2022 • Peng Hu, Xi Peng, Hongyuan Zhu, Mohamed M. Sabry Aly, Jie Lin

Numerous network compression methods such as pruning and quantization are proposed to reduce the model size significantly, of which the key is to find suitable compression allocation (e.g., pruning sparsity and quantization codebook) of each layer.

Quantization

Delving into Channels: Exploring Hyperparameter Space of Channel Bit Widths with Linear Complexity

no code implementations • 29 Sep 2021 • Zhe Wang, Jie Lin, Xue Geng, Mohamed M. Sabry Aly, Vijay Chandrasekhar

We formulate the quantization of deep neural networks as a rate-distortion optimization problem, and present an ultra-fast algorithm to search the bit allocation of channels.

Quantization

PSRR-MaxpoolNMS: Pyramid Shifted MaxpoolNMS with Relationship Recovery

no code implementations • CVPR 2021 • Tianyi Zhang, Jie Lin, Peng Hu, Bin Zhao, Mohamed M. Sabry Aly

Unlike convolutions which are inherently parallel, the de-facto standard for NMS, namely GreedyNMS, cannot be easily parallelized and thus could be the performance bottleneck in convolutional object detection pipelines.

Object Detection

Towards Effective 2-bit Quantization: Pareto-optimal Bit Allocation for Deep CNNs Compression

no code implementations • 25 Sep 2019 • Zhe Wang, Jie Lin, Mohamed M. Sabry Aly, Sean I Young, Vijay Chandrasekhar, Bernd Girod

In this paper, we address an important problem of how to optimize the bit allocation of weights and activations for deep CNNs compression.

Quantization

Dataflow-based Joint Quantization of Weights and Activations for Deep Neural Networks

no code implementations • 4 Jan 2019 • Xue Geng, Jie Fu, Bin Zhao, Jie Lin, Mohamed M. Sabry Aly, Christopher Pal, Vijay Chandrasekhar

This paper addresses a challenging problem - how to reduce energy consumption without incurring performance drop when deploying deep neural networks (DNNs) at the inference stage.

Quantization
