no code implementations • 29 Mar 2022 • Mayukh Das, Brijraj Singh, Harsh Kanti Chheda, Pawan Sharma, Pradeep NS
Designing suitable deep model architectures for AI-driven on-device apps and features, while keeping pace with rapidly evolving mobile hardware and increasingly complex target scenarios, is a difficult task.
no code implementations • 1 Jan 2021 • TEJPRATAP GVSL, Raja Kumar, Pradeep NS
The Hybrid-Quantization scheme determines the sensitivity of each layer to per-tensor and per-channel quantization, and thereby generates hybrid quantized models that are $10$-$20\%$ more efficient in inference time while achieving the same or better accuracy compared to per-channel quantization.
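The idea of choosing a quantization granularity per layer can be illustrated with a minimal sketch. The abstract does not say how sensitivity is measured, so everything here is an assumption made for illustration: sensitivity is approximated by the mean relative quantization error of a layer's weights, a layer falls back to per-channel quantization only when per-tensor error exceeds a hypothetical `tolerance` multiple of the per-channel error, and the names `fake_quantize`, `mean_relative_error`, and `choose_scheme` are invented. It should not be read as the authors' method.

```python
def fake_quantize(values, num_bits=8):
    """Symmetric uniform quantize-dequantize of a list of floats with one shared scale."""
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return list(values)
    scale = max_abs / qmax
    return [max(-qmax, min(qmax, round(v / scale))) * scale for v in values]


def mean_relative_error(weights, dequantized):
    """Average over channels of (quantization noise energy / channel energy)."""
    errors = []
    for orig, deq in zip(weights, dequantized):
        energy = sum(v * v for v in orig)
        noise = sum((v - d) ** 2 for v, d in zip(orig, deq))
        errors.append(noise / energy if energy > 0.0 else 0.0)
    return sum(errors) / len(errors)


def per_tensor(weights, num_bits=8):
    """Quantize all channels with a single scale shared by the whole tensor."""
    width = len(weights[0])
    flat = fake_quantize([v for ch in weights for v in ch], num_bits)
    return [flat[i:i + width] for i in range(0, len(flat), width)]


def per_channel(weights, num_bits=8):
    """Quantize each output channel with its own scale."""
    return [fake_quantize(ch, num_bits) for ch in weights]


def choose_scheme(weights, tolerance=1.5, num_bits=8):
    """Assumed rule: keep the cheaper per-tensor scheme unless it is much less accurate."""
    pt = mean_relative_error(weights, per_tensor(weights, num_bits))
    pc = mean_relative_error(weights, per_channel(weights, num_bits))
    return "per-tensor" if pt <= tolerance * pc else "per-channel"


# Two toy layers, each a list of channels (rows) of weights.
base = [-1.0 + 2.0 * i / 63 for i in range(64)]
uniform_layer = [list(base) for _ in range(4)]                         # all channels share one range
skewed_layer = [[0.001 * v for v in base], [10.0 * v for v in base]]   # channel ranges differ by 10^4

print(choose_scheme(uniform_layer))  # per-tensor
print(choose_scheme(skewed_layer))   # per-channel
```

In this toy setup, the layer whose channels share a similar range loses nothing under a single scale, while the layer with widely differing channel ranges has its small-range channel rounded to zero under per-tensor quantization and is therefore assigned per-channel scales.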