PyTorch: An Imperative Style, High-Performance Deep Learning Library
2 code implementations • NeurIPS 2019 • Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Köpf, Edward Yang, Zach DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, Soumith Chintala
Deep learning frameworks have often focused on either usability or speed, but not both.
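To make "both" concrete, here is a minimal sketch of the imperative, define-by-run style the paper argues for; the model, sizes, and data are hypothetical, and only standard torch APIs are used:

```python
import torch
import torch.nn as nn

# A tiny model written as ordinary Python: eager execution, no separate
# graph-compilation step.
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 8)
        self.fc2 = nn.Linear(8, 2)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

model = TinyNet()
opt = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(16, 4)            # hypothetical batch
y = torch.randint(0, 2, (16,))
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()                   # autograd records operations as they run
opt.step()
```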
Online normalizer calculation for softmax
1 code implementation • 8 May 2018 • Maxim Milakov, Natalia Gimelshein
The Softmax function is ubiquitous in machine learning, and multiple previous works have suggested faster alternatives for it.
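The paper's approach is a single-pass "online" computation of the softmax normalizer: the running maximum and the running sum are fused into one loop, whereas a standard numerically safe softmax needs one pass for the max and a second for the sum. A minimal Python sketch of that recurrence (illustrative only; the paper evaluates CUDA kernels):

```python
import math

def online_softmax(xs):
    """Single-pass, numerically safe softmax via an online normalizer."""
    m = float("-inf")  # running maximum of the inputs seen so far
    d = 0.0            # running normalizer: sum of exp(x_j - m)
    for x in xs:
        m_new = max(m, x)
        # Rescale the old partial sum to the new maximum, then add the term.
        d = d * math.exp(m - m_new) + math.exp(x - m_new)
        m = m_new
    return [math.exp(x - m) / d for x in xs]
```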
vDNN: Virtualized Deep Neural Networks for Scalable, Memory-Efficient Neural Network Design
4 code implementations • 25 Feb 2016 • Minsoo Rhu, Natalia Gimelshein, Jason Clemons, Arslan Zulfiqar, Stephen W. Keckler
The most widely used machine learning frameworks require users to carefully tune their memory usage so that the deep neural network (DNN) fits into the DRAM capacity of a GPU.
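vDNN addresses this by virtualizing DNN memory: feature maps are offloaded to host DRAM during the forward pass and prefetched back to the GPU before the backward pass needs them. The sketch below shows an analogous offload/prefetch pattern using PyTorch's saved-tensor hooks, not the paper's runtime; it assumes a CUDA device and uses synthetic sizes:

```python
import torch
from torch.autograd.graph import saved_tensors_hooks

# Offload tensors saved for backward to host memory during the forward pass...
def pack_to_cpu(t):
    return t.to("cpu", non_blocking=True)

# ...and fetch each one back to the GPU only when backward actually needs it.
def unpack_to_gpu(t):
    return t.to("cuda", non_blocking=True)

model = torch.nn.Sequential(
    torch.nn.Linear(1024, 1024), torch.nn.ReLU(), torch.nn.Linear(1024, 10)
).cuda()
x = torch.randn(64, 1024, device="cuda", requires_grad=True)

with saved_tensors_hooks(pack_to_cpu, unpack_to_gpu):
    loss = model(x).sum()   # saved activations are staged in host DRAM
loss.backward()             # activations return to the GPU on demand
```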