Search Results for author: Raghuraman Krishnamoorthi

Found 4 papers, 2 papers with code

Low-Precision Hardware Architectures Meet Recommendation Model Inference at Scale

no code implementations • 26 May 2021 • Zhaoxia Deng, Jongsoo Park, Ping Tak Peter Tang, Haixin Liu, Jie Yang, Hector Yuen, Jianyu Huang, Daya Khudia, Xiaohan Wei, Ellie Wen, Dhruv Choudhary, Raghuraman Krishnamoorthi, Carole-Jean Wu, Satish Nadathur, Changkyu Kim, Maxim Naumov, Sam Naghshineh, Mikhail Smelyanskiy

We share in this paper our search strategies to adapt reference recommendation models to low-precision hardware, our optimization of low-precision compute kernels, and the design and development of tool chain so as to maintain our models' accuracy throughout their lifespan during which topic trends and users' interests inevitably evolve.

Recommendation Systems

Check-N-Run: A Checkpointing System for Training Deep Learning Recommendation Models

no code implementations • 17 Oct 2020 • Assaf Eisenman, Kiran Kumar Matam, Steven Ingram, Dheevatsa Mudigere, Raghuraman Krishnamoorthi, Krishnakumar Nair, Misha Smelyanskiy, Murali Annavaram

While Check-N-Run is applicable to long running ML jobs, we focus on checkpointing recommendation models which are currently the largest ML models with Terabytes of model size.

Quantization • Recommendation Systems

Quantizing deep convolutional networks for efficient inference: A whitepaper

3 code implementations • 21 Jun 2018 • Raghuraman Krishnamoorthi

Per-channel quantization of weights and per-layer quantization of activations to 8-bits of precision post-training produces classification accuracies within 2% of floating point networks for a wide variety of CNN architectures.

Quantization
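
The whitepaper abstract above contrasts per-channel quantization of weights with per-layer quantization. As a rough illustration of why the granularity matters, the sketch below quantizes a tiny weight matrix both ways. It is not taken from the paper or its code implementations: symmetric scaling, the helper names, and the toy weights are all assumptions made here for brevity, and only the Python standard library is used.

```python
def quantize_symmetric(values, num_bits=8):
    """Quantize floats to signed integers using one shared scale (assumed symmetric scheme)."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Map quantized integers back to approximate float values."""
    return [x * scale for x in q]


def quantize_per_channel(weights):
    """One scale per output channel; each row of `weights` is treated as a channel."""
    return [quantize_symmetric(row) for row in weights]


def quantize_per_layer(weights):
    """One scale shared by every value in the tensor."""
    flat = [v for row in weights for v in row]
    q, scale = quantize_symmetric(flat)
    width = len(weights[0])
    rows = [q[i:i + width] for i in range(0, len(q), width)]
    return rows, scale


def max_abs_error(original, restored):
    """Largest absolute difference between original and dequantized values."""
    return max(abs(a - b) for a, b in zip(original, restored))


# Hypothetical weights: channel 0 has a much smaller range than channel 1.
weights = [[0.01, -0.02, 0.015], [1.0, -2.0, 1.5]]

per_channel = quantize_per_channel(weights)
layer_rows, layer_scale = quantize_per_layer(weights)

small = weights[0]
err_channel = max_abs_error(small, dequantize(*per_channel[0]))
err_layer = max_abs_error(small, dequantize(layer_rows[0], layer_scale))
print("per-channel error on channel 0:", err_channel)
print("per-layer error on channel 0:  ", err_layer)
```

With a shared scale, the small-range channel is squeezed into only a few integer levels, so its reconstruction error is larger than with its own scale. Real networks would also need activation quantization and calibration, which this sketch leaves out.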
