Search Results for author: Vivek Menon

Found 1 paper, 0 papers with code

Efficient 8-Bit Quantization of Transformer Neural Machine Language Translation Model

no code implementations • 3 Jun 2019 • Aishwarya Bhandare, Vamsi Sripathi, Deepthi Karkada, Vivek Menon, Sun Choi, Kushal Datta, Vikram Saletore

In this work, we quantize a trained Transformer machine language translation model leveraging INT8/VNNI instructions in the latest Intel® Xeon® Cascade Lake processors to improve inference performance while maintaining less than 0.5% drop in accuracy.
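
For readers unfamiliar with the term, the following is a minimal sketch of what 8-bit quantization of floating-point values can look like. It assumes simple symmetric per-tensor scaling, which is not necessarily the scheme used in the paper. The function names and sample weights are hypothetical, and the paper's calibration procedure and VNNI-accelerated kernels are not represented here.

```python
def quantize_int8(values):
    """Map floats to signed 8-bit integers with one shared scale (assumed scheme)."""
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the symmetric INT8 range.
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize_int8(q, scale):
    """Recover approximate floats from the quantized integers."""
    return [x * scale for x in q]


# Hypothetical weights, for illustration only.
weights = [0.5, -1.27, 0.0, 0.9]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# Rounding error per element is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

The trade-off this illustrates is the one the abstract names: integer arithmetic is cheaper at inference time, at the cost of a small, bounded rounding error per value.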

Quantization Translation
