Search Results for author: Vivek Menon

Found 1 paper, 0 papers with code

Efficient 8-Bit Quantization of Transformer Neural Machine Language Translation Model

no code implementations • 3 Jun 2019 • Aishwarya Bhandare, Vamsi Sripathi, Deepthi Karkada, Vivek Menon, Sun Choi, Kushal Datta, Vikram Saletore

In this work, we quantize a trained Transformer machine language translation model leveraging INT8/VNNI instructions in the latest Intel® Xeon® Cascade Lake processors to improve inference performance while maintaining less than 0.5% drop in accuracy.

Quantization, Translation
