1 code implementation • 27 Mar 2024 • Elliot Bolton, Abhinav Venigalla, Michihiro Yasunaga, David Hall, Betty Xiong, Tony Lee, Roxana Daneshjou, Jonathan Frankle, Percy Liang, Michael Carbin, Christopher D. Manning
Models such as GPT-4 and Med-PaLM 2 have demonstrated impressive performance on a wide variety of biomedical NLP tasks.
1 code implementation • NeurIPS 2023 • Jacob Portes, Alex Trott, Sam Havens, Daniel King, Abhinav Venigalla, Moin Nadeem, Nikhil Sardana, Daya Khudia, Jonathan Frankle
Here, we introduce MosaicBERT, a BERT-style encoder architecture and training recipe that is empirically optimized for fast pretraining.
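MosaicBERT's speed comes from combining architectural choices reported for the model, one of which is ALiBi (attention with linear biases) in place of learned position embeddings. As a hedged illustration only, not the authors' implementation, a symmetric ALiBi bias for a bidirectional encoder could be sketched as:

```python
def alibi_slopes(n_heads):
    # Geometric per-head slopes as in the ALiBi paper,
    # assuming n_heads is a power of two.
    start = 2 ** (-8 / n_heads)
    return [start ** (i + 1) for i in range(n_heads)]

def alibi_bias(n_heads, seq_len):
    # Bias added to attention logits before softmax:
    # slope * -(distance between query and key positions).
    # Symmetric |i - j| is used here since a BERT-style
    # encoder attends bidirectionally.
    slopes = alibi_slopes(n_heads)
    return [[[-s * abs(i - j) for j in range(seq_len)]
             for i in range(seq_len)]
            for s in slopes]
```

Because the bias depends only on relative distance, it can be precomputed once per sequence length and broadcast across the batch.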
no code implementations • 29 Mar 2021 • Valentina Popescu, Abhinav Venigalla, Di Wu, Robert Schreiber
While neural networks have traditionally been trained using IEEE-754 binary32 arithmetic, the rapid growth of computational demands in deep learning has spurred interest in faster, lower-precision training.
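One common low-precision format is bfloat16, which keeps binary32's exponent range but truncates the mantissa to 7 bits. As a rough illustration (not the paper's method), bfloat16 rounding can be simulated in software by keeping only the top 16 bits of a value's binary32 encoding:

```python
import struct

def to_bfloat16(x):
    # Simulate bfloat16 by keeping the top 16 bits of the
    # binary32 encoding, with round-to-nearest-even on the
    # dropped mantissa bits.
    bits = struct.unpack('<I', struct.pack('<f', x))[0]
    rounding = 0x7FFF + ((bits >> 16) & 1)  # ties to even
    bits = (bits + rounding) & 0xFFFF0000
    return struct.unpack('<f', struct.pack('<I', bits))[0]
```

For example, `to_bfloat16(0.1)` returns a value slightly above 0.1, reflecting the roughly 3 decimal digits of precision the format retains.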
no code implementations • 2 Jul 2020 • Abhinav Venigalla, Atli Kosson, Vitaliy Chiley, Urs Köster
Neural network training is commonly accelerated by using multiple synchronized workers to compute gradient updates in parallel.
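The synchronous data-parallel scheme described above can be sketched in a few lines: each worker computes a gradient on its shard of the batch, the per-worker gradients are averaged (an all-reduce in practice), and every worker applies the same update. This is a minimal single-process illustration with a toy 1-D linear model, not the paper's setup; with equal shard sizes the averaged gradient matches the full-batch gradient exactly.

```python
def grad_mse(w, xs, ys):
    # Gradient of mean squared error for the model y = w * x.
    n = len(xs)
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n

def sync_step(w, shards, lr=0.1):
    # One synchronized step: per-shard gradients (computed in
    # parallel by workers in practice), averaged, then applied.
    grads = [grad_mse(w, xs, ys) for xs, ys in shards]
    g = sum(grads) / len(grads)  # stands in for an all-reduce
    return w - lr * g

# Toy data split across two equal-size "workers".
xs, ys = [1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]
shards = [(xs[:2], ys[:2]), (xs[2:], ys[2:])]
w = sync_step(0.0, shards)
```

Because every worker ends the step with the identical weight, the replicas never drift, which is what distinguishes this synchronous scheme from asynchronous alternatives.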
no code implementations • 25 Mar 2020 • Atli Kosson, Vitaliy Chiley, Abhinav Venigalla, Joel Hestness, Urs Köster
New hardware can substantially increase the speed and efficiency of deep neural network training.