1 code implementation • 29 Apr 2024 • Davis Wertheimer, Joshua Rosenkranz, Thomas Parnell, Sahil Suneja, Pavithra Ranganathan, Raghu Ganti, Mudhakar Srivatsa
This technical report describes the design and training of novel speculative decoding draft models, for accelerating the inference speeds of large language models in a production environment.
no code implementations • 3 Feb 2024 • Tianshi Wang, Jinyang Li, Ruijie Wang, Denizhan Kara, Shengzhong Liu, Davis Wertheimer, Antoni Viros-i-Martin, Raghu Ganti, Mudhakar Srivatsa, Tarek Abdelzaher
To incorporate sufficient diversity into the IoT training data, one therefore needs to consider a combinatorial explosion of training cases that are multiplicative in the number of objects considered and the possible environmental conditions in which such objects may be encountered.
no code implementations • 15 Jan 2024 • Adnan Hoque, Mudhakar Srivatsa, Chih-Chieh Yang, Raghu Ganti
In this paper, we present a novel method that reduces model inference latency during distributed deployment of Large Language Models (LLMs).
no code implementations • 5 Jan 2024 • Adnan Hoque, Less Wright, Chih-Chieh Yang, Mudhakar Srivatsa, Raghu Ganti
Our implementation shows improvement for the type of skinny matrix-matrix multiplications found in foundation model inference workloads.
1 code implementation • 28 Oct 2023 • Johannes Jakubik, Sujit Roy, C. E. Phillips, Paolo Fraccaro, Denys Godwin, Bianca Zadrozny, Daniela Szwarcman, Carlos Gomes, Gabby Nyirjesy, Blair Edwards, Daiki Kimura, Naomi Simumba, Linsong Chu, S. Karthik Mukkavilli, Devyani Lambhate, Kamal Das, Ranjini Bangalore, Dario Oliveira, Michal Muszynski, Kumar Ankur, Muthukumaran Ramasubramanian, Iksha Gurung, Sam Khallaghi, Hanxi, Li, Michael Cecil, Maryam Ahmadi, Fatemeh Kordi, Hamed Alemohammad, Manil Maskey, Raghu Ganti, Kommy Weldemariam, Rahul Ramachandran
This paper introduces a first-of-a-kind framework for the efficient pre-training and fine-tuning of foundational models on extensive geospatial data.
no code implementations • 19 Sep 2023 • S. Karthik Mukkavilli, Daniel Salles Civitarese, Johannes Schmude, Johannes Jakubik, Anne Jones, Nam Nguyen, Christopher Phillips, Sujit Roy, Shraddha Singh, Campbell Watson, Raghu Ganti, Hendrik Hamann, Udaysankar Nair, Rahul Ramachandran, Kommy Weldemariam
In particular, we are witnessing the rise of AI foundation models that can perform competitively on multiple domain-specific downstream tasks.