no code implementations • 17 Jul 2024 • Rui Xie, Asad Ul Haq, Linsen Ma, Krystal Sun, Sanchari Sen, Swagath Venkataramani, Liu Liu, Tong Zhang
Recent studies have revealed that, during inference on generative AI models such as Transformers, the importance of different weights exhibits substantial context-dependent variation.
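A minimal sketch of what such context-dependent importance can look like: each weight is scored by the magnitude of its contribution to the layer output for a particular input. The scoring heuristic and names below are illustrative assumptions, not the paper's actual metric.

```python
import numpy as np

def context_weight_importance(W, x):
    """Score each weight by |W_ij * x_j|, i.e., the magnitude of its
    contribution to the layer output for this particular input context.
    Illustrative heuristic only; the paper's actual metric may differ."""
    return np.abs(W * x[np.newaxis, :])  # shape (out, in)

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))
x_a, x_b = rng.normal(size=8), rng.normal(size=8)

# The same weight can rank very differently under different input contexts.
rank_a = np.argsort(-context_weight_importance(W, x_a), axis=None)
rank_b = np.argsort(-context_weight_importance(W, x_b), axis=None)
print("top-5 weights for context A:", rank_a[:5])
print("top-5 weights for context B:", rank_b[:5])
```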
no code implementations • 6 Feb 2024 • Zhenyu Liu, Garrett Gagnon, Swagath Venkataramani, Liu Liu
Deep Neural Networks (DNNs) have revolutionized a wide range of industries, from healthcare and finance to automotive, by offering unparalleled capabilities in data analysis and decision-making.
no code implementations • 2 Oct 2022 • Jörg Henkel, Hai Li, Anand Raghunathan, Mehdi B. Tahoori, Swagath Venkataramani, Xiaoxuan Yang, Georgios Zervakis
In this work, we highlight the synergistic nature of AxC and ML and elucidate the impact of AxC in designing efficient ML systems.
no code implementations • 16 Jun 2022 • Andrea Fasoli, Chia-Yu Chen, Mauricio Serrano, Swagath Venkataramani, George Saon, Xiaodong Cui, Brian Kingsbury, Kailash Gopalakrishnan
We report on aggressive quantization strategies that greatly accelerate inference of Recurrent Neural Network Transducers (RNN-T).
no code implementations • 29 Sep 2021 • Sarada Krithivasan, Swagath Venkataramani, Sanchari Sen, Anand Raghunathan
This is because the efficacy of learning on interpolated inputs is reduced by the interference between the forward/backward propagation of their constituent inputs.
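For context, input interpolation in the style of mixup blends two training examples into one, so their forward/backward signals share a single pass. A minimal sketch, assuming the standard mixup recipe (the listing does not specify the exact interpolation scheme):

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2, rng=np.random.default_rng()):
    """Standard mixup-style interpolation of two examples. Both constituent
    inputs now share one forward/backward pass, which is where the
    interference described above can arise."""
    lam = rng.beta(alpha, alpha)
    x = lam * x1 + (1.0 - lam) * x2          # blended input
    y = lam * y1 + (1.0 - lam) * y2          # blended (one-hot) label
    return x, y

x_mix, y_mix = mixup(np.ones(4), np.array([1.0, 0.0]),
                     np.zeros(4), np.array([0.0, 1.0]))
print(x_mix, y_mix)
```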
no code implementations • 27 Aug 2021 • Andrea Fasoli, Chia-Yu Chen, Mauricio Serrano, Xiao Sun, Naigang Wang, Swagath Venkataramani, George Saon, Xiaodong Cui, Brian Kingsbury, Wei zhang, Zoltán Tüske, Kailash Gopalakrishnan
We investigate the impact of aggressive low-precision representations of weights and activations in two families of large LSTM-based architectures for Automatic Speech Recognition (ASR): hybrid Deep Bidirectional LSTM - Hidden Markov Models (DBLSTM-HMMs) and Recurrent Neural Network - Transducers (RNN-Ts).
Automatic Speech Recognition (ASR)
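A minimal sketch of the kind of aggressive uniform quantization such studies apply to weights and activations. The symmetric per-tensor scheme below is a common baseline assumption, not necessarily the exact format used in the paper.

```python
import numpy as np

def fake_quantize(x, num_bits=4):
    """Uniform 'fake' quantization: values are snapped to a low-precision
    grid but kept in float, as is typical when simulating quantized
    inference. Per-tensor symmetric scaling is assumed here."""
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = np.max(np.abs(x))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale

w = np.random.default_rng(1).normal(size=1000)
w4 = fake_quantize(w, num_bits=4)
print("mean abs quantization error:", np.mean(np.abs(w - w4)))
```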
no code implementations • NeurIPS 2020 • Chia-Yu Chen, Jiamin Ni, Songtao Lu, Xiaodong Cui, Pin-Yu Chen, Xiao Sun, Naigang Wang, Swagath Venkataramani, Vijayalakshmi Srinivasan, Wei zhang, Kailash Gopalakrishnan
Large-scale distributed training of Deep Neural Networks (DNNs) on state-of-the-art platforms is expected to be severely communication constrained.
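One widely used way to relieve that communication bottleneck is to compress gradients before exchanging them. The top-k sparsification below is a generic illustration and is not claimed to be the compression scheme proposed in this paper.

```python
import numpy as np

def topk_sparsify(grad, k):
    """Keep only the k largest-magnitude gradient entries and send
    (indices, values); the rest are dropped (or accumulated locally
    as error feedback in practice)."""
    idx = np.argpartition(np.abs(grad), -k)[-k:]
    return idx, grad[idx]

def desparsify(idx, values, size):
    """Rebuild a dense gradient from the transmitted (indices, values)."""
    out = np.zeros(size)
    out[idx] = values
    return out

g = np.random.default_rng(2).normal(size=10000)
idx, vals = topk_sparsify(g, k=100)        # keep ~1% of the entries
g_hat = desparsify(idx, vals, g.size)
print("compression ratio: %.0fx" % (g.size / idx.size))
```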
no code implementations • 1 Jan 2021 • Sarada Krithivasan, Sanchari Sen, Swagath Venkataramani, Anand Raghunathan
The trend in the weight updates made to the transition layer across epochs is used to determine how the boundary between SGD and localized updates is shifted in future epochs.
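A rough sketch of that idea: track how much the designated transition layer's weights change per epoch, and move the boundary between fully trained and locally updated layers when the trend flattens. The threshold rule and values below are assumptions for illustration, not the paper's exact policy.

```python
import numpy as np

def shift_boundary(update_norms, boundary, threshold=1e-3):
    """If the transition layer's recent weight updates have become small,
    move the boundary so that one fewer layer receives full SGD updates in
    the next epoch. Heuristic illustration only, not the paper's exact rule."""
    if len(update_norms) >= 3 and np.mean(update_norms[-3:]) < threshold and boundary > 0:
        boundary -= 1
    return boundary

# Toy per-epoch L2 norms of the updates made to the transition layer.
norms = [0.9, 0.5, 0.2, 0.05, 0.002, 0.0005, 0.0003]
boundary = 5
for epoch in range(1, len(norms) + 1):
    boundary = shift_boundary(norms[:epoch], boundary)
print("SGD/localized-update boundary after 7 epochs:", boundary)
```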
no code implementations • NeurIPS 2020 • Xiao Sun, Naigang Wang, Chia-Yu Chen, Jiamin Ni, Ankur Agrawal, Xiaodong Cui, Swagath Venkataramani, Kaoutar El Maghraoui, Vijayalakshmi (Viji) Srinivasan, Kailash Gopalakrishnan
In this paper, we propose a number of novel techniques and numerical representation formats that enable, for the very first time, the precision of training systems to be aggressively scaled from 8-bits to 4-bits.
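Scaling training precision down to 4 bits typically relies on careful rounding; as one generic ingredient, stochastic rounding onto a 4-bit grid is sketched below. The paper's actual numerical representation formats are more involved than this illustration.

```python
import numpy as np

def stochastic_round_4bit(x, scale, rng=np.random.default_rng()):
    """Round x/scale to a 4-bit integer probabilistically so the rounding
    is unbiased in expectation; a standard trick for very low precision
    training, shown here only as an illustration."""
    qmax = 7                      # signed 4-bit range: [-8, 7]
    y = x / scale
    floor = np.floor(y)
    prob_up = y - floor           # round up with probability equal to the fraction
    q = floor + (rng.random(y.shape) < prob_up)
    return np.clip(q, -8, qmax) * scale

g = np.random.default_rng(3).normal(scale=0.01, size=100000)
scale = np.max(np.abs(g)) / 7
g4 = stochastic_round_4bit(g, scale)
print("rounding bias:", np.mean(g4 - g))  # close to zero in expectation
```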
no code implementations • NeurIPS 2019 • Xiao Sun, Jungwook Choi, Chia-Yu Chen, Naigang Wang, Swagath Venkataramani, Vijayalakshmi (Viji) Srinivasan, Xiaodong Cui, Wei zhang, Kailash Gopalakrishnan
Reducing the numerical precision of data and computation is extremely effective in accelerating deep learning training workloads.
no code implementations • 17 Jul 2018 • Jungwook Choi, Pierce I-Jen Chuang, Zhuo Wang, Swagath Venkataramani, Vijayalakshmi Srinivasan, Kailash Gopalakrishnan
Deep learning algorithms achieve high classification accuracy at the expense of significant computation cost.
3 code implementations • ICLR 2018 • Jungwook Choi, Zhuo Wang, Swagath Venkataramani, Pierce I-Jen Chuang, Vijayalakshmi Srinivasan, Kailash Gopalakrishnan
We show, for the first time, that both weights and activations can be quantized to 4-bits of precision while still achieving accuracy comparable to full precision networks across a range of popular models and datasets.
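A minimal sketch of quantizing activations against a chosen clipping level and weights symmetrically, both to 4 bits, in the spirit of such schemes. The specific clipping rule and fixed clip value here are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def quantize_activations(a, clip_val, num_bits=4):
    """Clip activations to [0, clip_val], then quantize uniformly.
    Choosing the clip level well is what keeps 4-bit accuracy close to
    full precision; here clip_val is simply given."""
    levels = 2 ** num_bits - 1
    a = np.clip(a, 0.0, clip_val)
    return np.round(a / clip_val * levels) / levels * clip_val

def quantize_weights(w, num_bits=4):
    """Symmetric uniform quantization of weights to 4 bits."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = np.max(np.abs(w)) / qmax
    return np.clip(np.round(w / scale), -qmax, qmax) * scale

rng = np.random.default_rng(4)
acts = np.maximum(rng.normal(size=1000), 0)          # ReLU-like activations
print(np.unique(quantize_activations(acts, clip_val=2.0)).size, "activation levels")
print(np.unique(quantize_weights(rng.normal(size=1000))).size, "weight levels")
```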
no code implementations • 7 Nov 2017 • Sanchari Sen, Shubham Jain, Swagath Venkataramani, Anand Raghunathan
SparCE consists of two key micro-architectural enhancements: a Sparsity Register File (SpRF) that tracks zero registers, and a Sparsity-aware Skip Address (SASA) table that indicates instructions to be skipped.
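A small software model of the idea: a sparsity register file marks registers known to hold zero, and a skip table maps a trigger register to the instructions that become no-ops when it is zero. The structures and program below are simplified assumptions for illustration, not the actual hardware design.

```python
# Toy functional model of sparsity-aware instruction skipping (illustrative only).
sprf = {}           # Sparsity Register File: register name -> is_zero flag
sasa = {            # Sparsity-aware Skip Address table: trigger reg -> skippable PCs
    "r1": {2, 3},   # if r1 is zero, instructions at PC 2 and 3 can be skipped
}

program = [
    ("load", "r1"),   # 0: loads an operand that may be zero
    ("load", "r2"),   # 1
    ("mul",  "r3"),   # 2: r3 = r1 * r2 -> no-op if r1 == 0
    ("add",  "r4"),   # 3: r4 += r3     -> no-op if r1 == 0
]

sprf["r1"] = True     # suppose the loaded value was zero
executed = [pc for pc, _ in enumerate(program)
            if not (sprf.get("r1") and pc in sasa["r1"])]
print("executed PCs:", executed)   # PCs 2 and 3 are skipped
```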
no code implementations • 4 Apr 2017 • Sanjay Ganapathy, Swagath Venkataramani, Balaraman Ravindran, Anand Raghunathan
Complementary to these techniques, DyVEDeep is a dynamic approach that exploits the heterogeneity in the inputs to DNNs to improve their compute efficiency while maintaining comparable classification accuracy.
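One generic form of such input-dependent effort modulation is to stop accumulating a neuron's dot product once the downstream non-linearity is effectively saturated. The early-termination heuristic below is an assumed illustration of the general idea, not DyVEDeep's exact mechanism.

```python
import numpy as np

def saturating_dot(w, x, saturate_at=8.0, chunk=16):
    """Accumulate the dot product in chunks and stop early once the partial
    sum is large enough that the downstream sigmoid is effectively saturated.
    This is an approximate, input-dependent shortcut shown for illustration."""
    acc, consumed = 0.0, 0
    for i in range(0, w.size, chunk):
        acc += float(np.dot(w[i:i + chunk], x[i:i + chunk]))
        consumed = min(i + chunk, w.size)
        if abs(acc) > saturate_at:      # sigmoid(+-8) is within ~3e-4 of 0/1
            break
    return 1.0 / (1.0 + np.exp(-acc)), consumed

rng = np.random.default_rng(5)
w, x = rng.normal(size=256), rng.normal(size=256)
y, macs = saturating_dot(w, x)
print("output %.3f after consuming %d of %d inputs" % (y, macs, w.size))
```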
no code implementations • 27 Feb 2016 • Syed Shakib Sarwar, Swagath Venkataramani, Anand Raghunathan, Kaushik Roy
Multipliers consume most of the processing energy in digital neurons, and hence in hardware implementations of artificial neural networks.
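A common way to cut that multiplier cost is to replace exact multiplications with shift-and-add approximations, for example by snapping one operand to the nearest power of two. The sketch below is a generic approximation of that kind, not the specific multiplier design studied in the paper.

```python
import numpy as np

def pow2_approx_multiply(w, x):
    """Approximate w*x by snapping w to the nearest power of two, so the
    multiplication becomes a cheap shift in hardware. Generic illustration
    of multiplier-less arithmetic, not the paper's specific design."""
    sign, mag = np.sign(w), np.abs(w)
    safe_mag = np.where(mag > 0, mag, 1.0)           # avoid log2(0)
    w_p2 = sign * np.where(mag > 0, 2.0 ** np.round(np.log2(safe_mag)), 0.0)
    return w_p2 * x

rng = np.random.default_rng(6)
w, x = rng.normal(size=1000), rng.normal(size=1000)
exact, approx = w * x, pow2_approx_multiply(w, x)
print("relative error of approximate dot product: %.3f"
      % (abs(np.sum(approx) - np.sum(exact)) / abs(np.sum(exact))))
```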
no code implementations • 29 Sep 2015 • Priyadarshini Panda, Swagath Venkataramani, Abhronil Sengupta, Anand Raghunathan, Kaushik Roy
We propose a 2-stage hierarchical classification framework, with increasing levels of complexity, wherein the first stage is trained to recognize the broad representative semantic features relevant to the object of interest.
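A minimal sketch of such a cascade: a cheap first stage predicts a broad semantic category and only uncertain inputs are passed to a more expensive second stage. The confidence-threshold routing and toy classifiers below are assumed illustrations of the general idea, not the paper's exact framework.

```python
import numpy as np

def two_stage_classify(x, stage1, stage2, confidence=0.9):
    """Run the cheap first-stage classifier; invoke the expensive second
    stage only when the first stage is not confident enough.
    Illustrative cascade, not the paper's exact framework."""
    probs = stage1(x)
    if np.max(probs) >= confidence:
        return int(np.argmax(probs)), "stage1"
    return int(np.argmax(stage2(x))), "stage2"

# Toy stand-in classifiers (hypothetical): linear scoring + softmax.
rng = np.random.default_rng(7)
W1, W2 = rng.normal(size=(3, 16)), rng.normal(size=(10, 16))
softmax = lambda z: np.exp(z - z.max()) / np.exp(z - z.max()).sum()
stage1 = lambda x: softmax(W1 @ x)   # coarse, broad semantic categories
stage2 = lambda x: softmax(W2 @ x)   # fine-grained, more expensive classifier

label, used = two_stage_classify(rng.normal(size=16), stage1, stage2)
print("predicted class", label, "using", used)
```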