no code implementations • 25 Jun 2024 • Johnny Li, Saksham Consul, Eda Zhou, James Wong, Naila Farooqui, Yuxin Ye, Nithyashree Manohar, Zhuxiaona Wei, Tian Wu, Ben Echols, Sharon Zhou, Gregory Diamos
We use our findings to design a first-generation model for removing hallucinations -- Lamini-1 -- that stores facts in a massive mixture of millions of memory experts that are retrieved dynamically.
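The retrieval mechanism is easiest to see in miniature. Below is a minimal sketch of a mixture-of-memory-experts lookup, assuming a bank of key/value expert embeddings and top-k selection; all names, shapes, and counts are illustrative assumptions, not Lamini-1's implementation.

```python
# Minimal sketch (not Lamini's code): a bank of "memory experts" keyed by
# fact embeddings, with top-k retrieval at inference time. Names, shapes,
# and counts are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
num_experts, dim, k = 1_000, 64, 4          # real systems use millions of experts

expert_keys = rng.standard_normal((num_experts, dim))    # one key per expert
expert_values = rng.standard_normal((num_experts, dim))  # stored "fact" payloads

def retrieve(query, top_k=k):
    """Select the top_k experts whose keys best match the query."""
    scores = expert_keys @ query
    idx = np.argpartition(scores, -top_k)[-top_k:]       # top_k highest scores
    weights = np.exp(scores[idx] - scores[idx].max())    # softmax over the winners
    weights /= weights.sum()
    return weights @ expert_values[idx]                  # weighted mix of retrieved facts

query = rng.standard_normal(dim)
print(retrieve(query).shape)                             # (64,)
```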
1 code implementation • 29 Sep 2021 • Alexandros Karargyris, Renato Umeton, Micah J. Sheller, Alejandro Aristizabal, Johnu George, Srini Bala, Daniel J. Beutel, Victor Bittorf, Akshay Chaudhari, Alexander Chowdhury, Cody Coleman, Bala Desinghu, Gregory Diamos, Debo Dutta, Diane Feddema, Grigori Fursin, Junyi Guo, Xinyuan Huang, David Kanter, Satyananda Kashyap, Nicholas Lane, Indranil Mallick, Pietro Mascagni, Virendra Mehta, Vivek Natarajan, Nikola Nikolov, Nicolas Padoy, Gennady Pekhimenko, Vijay Janapa Reddi, G Anthony Reina, Pablo Ribalta, Jacob Rosenthal, Abhishek Singh, Jayaraman J. Thiagarajan, Anna Wuest, Maria Xenochristou, Daguang Xu, Poonam Yadav, Michael Rosenthal, Massimo Loda, Jason M. Johnson, Peter Mattson
Medical AI has tremendous potential to advance healthcare by supporting the evidence-based practice of medicine, personalizing patient treatment, reducing costs, and improving provider and patient experience.
no code implementations • ICLR 2019 • Newsha Ardalani, Joel Hestness, Gregory Diamos
Long-held conventional wisdom states that larger models train more slowly under gradient descent.
no code implementations • 26 Feb 2019 • Jiayi Huang, Mostofa Patwary, Gregory Diamos
We show that recent innovations in deep reinforcement learning can effectively color very large graphs -- a well-known NP-hard problem with clear commercial applications.
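Framed as a sequential decision problem, coloring assigns one vertex per step subject to neighbor constraints. The sketch below uses a greedy first-fit policy as a stand-in for the learned RL policy described in the paper; the graph and vertex ordering are illustrative.

```python
# Sketch of graph coloring as a sequential decision problem (not the paper's
# model): a policy picks a color for one vertex per step; here a greedy
# first-fit policy stands in for the learned RL policy.
adjacency = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [1]}  # tiny example graph

def greedy_color(adjacency):
    colors = {}
    for v in adjacency:                      # an RL agent would also learn the order
        used = {colors[u] for u in adjacency[v] if u in colors}
        colors[v] = next(c for c in range(len(adjacency)) if c not in used)
    return colors

print(greedy_color(adjacency))               # {0: 0, 1: 1, 2: 2, 3: 0}
```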
no code implementations • 23 Oct 2018 • Mostofa Patwary, Milind Chabbi, Heewoo Jun, Jiaji Huang, Gregory Diamos, Kenneth Church
We show how Zipf's Law can be used to scale up language modeling (LM) to take advantage of more training data and more GPUs.
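Zipf's Law says token frequency falls off roughly as the inverse of frequency rank, so a small head of the vocabulary accounts for most of the training tokens -- the skew that makes frequency-aware partitioning pay off. A quick back-of-the-envelope check (a sketch of the skew, not the paper's sharding scheme):

```python
# Under an idealized Zipf distribution, the top 1% of word types carries
# well over half of the token mass.
vocab_size = 100_000
freqs = [1.0 / rank for rank in range(1, vocab_size + 1)]
total = sum(freqs)
head_mass = sum(freqs[:1000]) / total
print(f"top 1% of types carry {head_mass:.0%} of token mass")  # ~62%
```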
no code implementations • 27 Sep 2018 • Joel Hestness, Sharan Narang, Newsha Ardalani, Heewoo Jun, Hassan Kianinejad, Md. Mostofa Ali Patwary, Yang Yang, Yanqi Zhou, Gregory Diamos, Kenneth Church
As the pace of deep learning innovation accelerates, it becomes increasingly important to organize the space of problems by relative difficulty.
no code implementations • 20 Aug 2018 • Sercan O. Arik, Heewoo Jun, Gregory Diamos
We propose the multi-head convolutional neural network (MCNN) architecture for waveform synthesis from spectrograms.
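The core idea is several convolutional heads that each upsample the input spectrogram to waveform rate, with their outputs combined. Below is a hedged PyTorch sketch of a model with that shape; the head count, kernel sizes, and summation combiner are assumptions for illustration, not the published MCNN configuration.

```python
# A toy multi-head transposed-convolution synthesizer (illustrative only):
# each head upsamples mel frames to waveform resolution; outputs are summed.
import torch
import torch.nn as nn

class TinyMCNN(nn.Module):
    def __init__(self, n_mels=80, heads=4, upsample=256):
        super().__init__()
        self.heads = nn.ModuleList(
            nn.ConvTranspose1d(n_mels, 1, kernel_size=upsample * 2,
                               stride=upsample, padding=upsample // 2)
            for _ in range(heads)
        )

    def forward(self, spec):                 # spec: (batch, n_mels, frames)
        return sum(head(spec) for head in self.heads)  # (batch, 1, samples)

wave = TinyMCNN()(torch.randn(1, 80, 10))
print(wave.shape)                            # torch.Size([1, 1, 2560])
```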
no code implementations • 1 Dec 2017 • Joel Hestness, Sharan Narang, Newsha Ardalani, Gregory Diamos, Heewoo Jun, Hassan Kianinejad, Md. Mostofa Ali Patwary, Yang Yang, Yanqi Zhou
As DL application domains grow, we would like a deeper understanding of the relationships among training set size, computational scale, and model accuracy improvements in order to advance the state of the art.
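The paper's central empirical finding is that generalization error tends to follow a power law in training set size, eps(m) ≈ alpha * m^beta. A minimal fit shows how the exponent is estimated from a handful of measurements (the data points below are made up for illustration, not the paper's results):

```python
# Fit a power law eps(m) = alpha * m**beta by linear regression in log space.
# The errors here are hypothetical, chosen only to illustrate the fit.
import numpy as np

m = np.array([1e4, 1e5, 1e6, 1e7])          # training set sizes
eps = np.array([0.40, 0.25, 0.16, 0.10])    # hypothetical validation errors

beta, log_alpha = np.polyfit(np.log(m), np.log(eps), 1)
print(f"eps(m) ≈ {np.exp(log_alpha):.2f} * m^{beta:.2f}")  # exponent ~ -0.20
```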
no code implementations • ICLR 2018 • Sharan Narang, Eric Undersander, Gregory Diamos
Even though sparse operations need less compute and memory than their dense counterparts, the speed-up observed from using sparse operations is smaller than expected across hardware platforms.
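The gap is easy to observe directly: a 90%-sparse matrix product rarely delivers the naive 10x speed-up over its dense counterpart, because sparse kernels trade FLOPs for irregular memory access. A quick, platform-dependent check (illustrative sizes; results vary widely by hardware and library):

```python
# Time a dense matmul against a ~90%-sparse CSR matmul of the same shape.
# The measured ratio depends heavily on the platform -- that is the point.
import time
import numpy as np
import scipy.sparse as sp

n = 2048
dense = np.random.rand(n, n).astype(np.float32)
mask = np.random.rand(n, n) < 0.1            # keep ~10% of the weights
sparse = sp.csr_matrix(dense * mask)
x = np.random.rand(n, 64).astype(np.float32)

for name, op in [("dense", lambda: dense @ x), ("sparse", lambda: sparse @ x)]:
    t0 = time.perf_counter(); op(); print(name, time.perf_counter() - t0)
```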
9 code implementations • ICLR 2018 • Paulius Micikevicius, Sharan Narang, Jonah Alben, Gregory Diamos, Erich Elsen, David Garcia, Boris Ginsburg, Michael Houston, Oleksii Kuchaiev, Ganesh Venkatesh, Hao Wu
Using this approach, we can reduce the memory consumption of deep learning models by nearly 2x.
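The recipe behind that number is FP16 storage and arithmetic for activations and gradients, an FP32 master copy of the weights, and loss scaling to keep small gradients from underflowing in FP16. A toy NumPy sketch of one training step under those assumptions (not the paper's implementation):

```python
# Toy mixed-precision step: FP16 forward/backward, FP32 master weights,
# and a loss scale applied before the backward pass and removed after.
import numpy as np

rng = np.random.default_rng(0)
master_w = rng.standard_normal((512, 512)).astype(np.float32)  # FP32 master copy
x = rng.standard_normal((64, 512)).astype(np.float16)
target = rng.standard_normal((64, 512)).astype(np.float16)
loss_scale = np.float16(1024.0)

for _ in range(3):
    w16 = master_w.astype(np.float16)        # FP16 working copy of the weights
    err = x @ w16 - target                   # FP16 forward pass
    # Gradient of the scaled mean-squared-error loss: scaling the loss keeps
    # small gradient values from flushing to zero in FP16.
    grad16 = (2.0 * loss_scale / err.size) * (x.T @ err)
    master_w -= 1e-3 * (grad16.astype(np.float32) / float(loss_scale))
```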
1 code implementation • NeurIPS 2017 • Sercan Arik, Gregory Diamos, Andrew Gibiansky, John Miller, Kainan Peng, Wei Ping, Jonathan Raiman, Yanqi Zhou
We introduce Deep Voice 2, which is based on a pipeline similar to that of Deep Voice 1 but is constructed from higher-performance building blocks, and demonstrates a significant audio quality improvement over Deep Voice 1.
1 code implementation • 17 Apr 2017 • Sharan Narang, Erich Elsen, Gregory Diamos, Shubho Sengupta
Benchmarks show that our technique reduces model size by 90% and yields speed-ups of roughly 2x to 7x.
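The underlying operation is magnitude pruning: zero the smallest-magnitude weights until a target sparsity is reached, and keep a mask so pruned weights stay zero during further training. A minimal sketch of the general technique (the threshold selection and 90% target below are illustrative, not the paper's schedule):

```python
# Magnitude pruning sketch: drop the smallest-magnitude entries of a weight
# matrix and return the mask that training would apply after each update.
import numpy as np

def prune(weights, sparsity):
    """Zero the smallest-magnitude entries; return pruned weights and mask."""
    k = int(weights.size * sparsity)
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask, mask

w = np.random.default_rng(0).standard_normal((256, 256))
w_pruned, mask = prune(w, 0.90)               # remove ~90% of the weights
print(f"sparsity: {1 - mask.mean():.2%}")     # ~90.00%
```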
3 code implementations • ICML 2017 • Sercan O. Arik, Mike Chrzanowski, Adam Coates, Gregory Diamos, Andrew Gibiansky, Yongguo Kang, Xi-An Li, John Miller, Andrew Ng, Jonathan Raiman, Shubho Sengupta, Mohammad Shoeybi
We present Deep Voice, a production-quality text-to-speech system constructed entirely from deep neural networks.