no code implementations • NeurIPS 2020 • Xiao Sun, Naigang Wang, Chia-Yu Chen, Jiamin Ni, Ankur Agrawal, Xiaodong Cui, Swagath Venkataramani, Kaoutar El Maghraoui, Vijayalakshmi (Viji) Srinivasan, Kailash Gopalakrishnan
In this paper, we propose a number of novel techniques and numerical representation formats that enable, for the first time, the precision of training systems to be aggressively scaled from 8 bits to 4 bits.
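As a generic illustration of what scaling training precision down to 4 bits involves, the sketch below simulates symmetric uniform 4-bit quantize-dequantize of a tensor. It is a minimal stand-in, not the specialized 4-bit formats or rounding schemes proposed in the paper; the function name and parameters are hypothetical.

```python
import numpy as np

def fake_quant_4bit(x, num_bits=4):
    """Simulate symmetric uniform quantization to num_bits (illustrative only;
    not the paper's proposed 4-bit training formats)."""
    qmax = 2 ** (num_bits - 1) - 1            # e.g., 7 for 4 bits
    scale = np.max(np.abs(x)) / qmax + 1e-12  # per-tensor scale
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale                          # dequantize back to float

x = np.random.randn(1024).astype(np.float32)
x_q = fake_quant_4bit(x)
print("quantization MSE:", np.mean((x - x_q) ** 2))
```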
no code implementations • ICLR 2019 • Charbel Sakr, Naigang Wang, Chia-Yu Chen, Jungwook Choi, Ankur Agrawal, Naresh Shanbhag, Kailash Gopalakrishnan
Observing that a bad choice of accumulation precision results in a loss of information that manifests as a reduction in the variance of an ensemble of partial sums, we derive a set of equations that relate this variance to the length of accumulation and the minimum number of bits needed for accumulation.
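The variance-reduction symptom described above can be reproduced numerically. The sketch below (not the paper's analysis) rounds a running sum to a reduced mantissa width after every addition; when the accumulator is too narrow, small addends are swamped and the ensemble variance of the partial sums collapses well below its full-precision value. The helper `round_to_mantissa` is an illustrative stand-in for a low-precision accumulator.

```python
import numpy as np

def round_to_mantissa(x, m_bits):
    """Round x to m_bits of mantissa -- a crude stand-in for a
    reduced-precision floating-point accumulator (ignores exponent limits)."""
    if x == 0.0:
        return 0.0
    e = np.floor(np.log2(abs(x)))
    ulp = 2.0 ** (e - m_bits)
    return np.round(x / ulp) * ulp

def accumulate(values, m_bits):
    """Chained accumulation, rounding the running sum after every add."""
    acc = 0.0
    for v in values:
        acc = round_to_mantissa(acc + v, m_bits)
    return acc

n_terms, n_trials = 4096, 300
rng = np.random.default_rng(0)
for m_bits in (23, 12, 5):
    sums = [accumulate(rng.uniform(0.0, 2.0, n_terms), m_bits)
            for _ in range(n_trials)]
    # Ideal ensemble variance is n_terms * Var(U(0, 2)) = 4096 / 3 ~= 1365.
    # With too few mantissa bits, small addends are swamped by the large
    # running sum and the variance of the partial sums drops sharply.
    print(f"mantissa bits = {m_bits:2d}   ensemble variance = {np.var(sums):9.2f}")
```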
no code implementations • 7 Dec 2017 • Chia-Yu Chen, Jungwook Choi, Daniel Brand, Ankur Agrawal, Wei Zhang, Kailash Gopalakrishnan
Highly distributed training of Deep Neural Networks (DNNs) on future compute platforms (offering 100s of TeraOps/s of computational capacity) is expected to be severely communication-constrained.
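One common way to relieve this communication bottleneck is to compress gradients before exchanging them across workers. The sketch below shows generic top-k gradient sparsification with a locally carried residual; it is only an illustration of the setting and not the compression scheme proposed in the paper, and its function name and parameters are hypothetical.

```python
import numpy as np

def sparsify_with_residual(grad, residual, k):
    """Generic top-k gradient sparsification with error feedback
    (illustrative; not the paper's compression scheme)."""
    acc = grad + residual                        # fold in what was not sent yet
    idx = np.argpartition(np.abs(acc), -k)[-k:]  # indices of the k largest magnitudes
    sent = np.zeros_like(acc)
    sent[idx] = acc[idx]                         # sparse message to communicate
    new_residual = acc - sent                    # keep the dropped part locally
    return sent, new_residual

grad = np.random.randn(1_000_000).astype(np.float32)
residual = np.zeros_like(grad)
sent, residual = sparsify_with_residual(grad, residual, k=1000)
print("fraction of entries communicated:", np.count_nonzero(sent) / grad.size)
```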
2 code implementations • 9 Feb 2015 • Suyog Gupta, Ankur Agrawal, Kailash Gopalakrishnan, Pritish Narayanan
Training of large-scale deep neural networks is often constrained by the available computational resources.
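This entry belongs to the same line of limited-numerical-precision training work. As a generic illustration of that setting, the sketch below shows stochastic rounding to a signed fixed-point grid, a standard building block for low-precision training; it is an assumption-laden example, not necessarily the exact format or rounding configuration used in this paper.

```python
import numpy as np

def stochastic_round_fixed_point(x, frac_bits=8, int_bits=7, rng=None):
    """Stochastically round x to a signed fixed-point grid with frac_bits
    fractional bits (illustrative; parameters are hypothetical)."""
    rng = np.random.default_rng() if rng is None else rng
    scale = 2.0 ** frac_bits
    limit = 2.0 ** int_bits - 1.0 / scale        # saturation bound
    scaled = x * scale
    floor = np.floor(scaled)
    # Round up with probability equal to the fractional remainder, so the
    # rounding is unbiased in expectation: E[round(x)] = x.
    rounded = floor + (rng.random(x.shape) < (scaled - floor))
    return np.clip(rounded / scale, -limit, limit)

x = np.random.randn(5).astype(np.float32)
print(x)
print(stochastic_round_fixed_point(x, frac_bits=4))
```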