4 code implementations • 17 Nov 2022 • Yousef El-Kurdi, Jerry Quinn, Avirup Sil
We introduce a novel run-time method for significantly reducing the accuracy loss associated with quantizing BERT-like models to 8-bit integers.
no code implementations • 15 Nov 2019 • Michael P. Perrone, Haidar Khan, Changhoan Kim, Anastasios Kyrillidis, Jerry Quinn, Valentina Salapura
This paper presents a methodology for selecting the mini-batch size that minimizes Stochastic Gradient Descent (SGD) learning time for single- and multiple-learner problems.
no code implementations • NAACL 2018 • Jerry Quinn, Miguel Ballesteros
Neural machine translation has achieved levels of fluency and adequacy that would have been surprising a short time ago.