Search Results for author: Darshan Gandhi

Found 6 papers, 0 papers with code

Training Large Language Models Efficiently with Sparsity and Dataflow

no code implementations • 11 Apr 2023 • Venkat Srinivasan, Darshan Gandhi, Urmish Thakker, Raghu Prabhakar

We show that we can successfully train GPT-13B to the same quality as the dense GPT-13B model, while achieving an end-to-end speedup of 4.5x over the dense A100 baseline.

Tasks: Language Modelling, Large Language Model, +2
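The paper's specific sparsity and dataflow techniques are not detailed in this listing. As a generic illustration only, the sketch below shows one common way weight sparsity reduces compute during training: magnitude-based pruning, where only the largest-magnitude fraction of weights is kept active. All names here (`magnitude_mask`, `apply_mask`) are hypothetical and not taken from the paper.

```python
def magnitude_mask(weights, sparsity):
    """Return a 0/1 mask keeping the largest-magnitude (1 - sparsity)
    fraction of weights. Generic sketch; not the paper's method."""
    k = int(len(weights) * (1 - sparsity))
    if k <= 0:
        return [0] * len(weights)
    # Threshold is the k-th largest absolute value.
    threshold = sorted((abs(w) for w in weights), reverse=True)[k - 1]
    return [1 if abs(w) >= threshold else 0 for w in weights]

def apply_mask(weights, mask):
    """Zero out pruned weights; zeroed entries need no compute
    on hardware that can exploit sparsity."""
    return [w * m for w, m in zip(weights, mask)]

weights = [0.9, -0.05, 0.4, 0.01, -0.7, 0.2]
mask = magnitude_mask(weights, sparsity=0.5)
sparse_weights = apply_mask(weights, mask)
# With 50% sparsity, half the weights are zeroed: [0.9, 0, 0.4, 0, -0.7, 0]
```

The end-to-end speedup reported in the paper comes from combining such sparsity with a dataflow execution model, so the zeroed computation is actually skipped rather than merely masked.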
