Search Results for author: Vijay Korthikanti

Found 3 papers, 2 papers with code

Reducing Activation Recomputation in Large Transformer Models

3 code implementations • 10 May 2022 • Vijay Korthikanti, Jared Casper, Sangkug Lym, Lawrence McAfee, Michael Andersch, Mohammad Shoeybi, Bryan Catanzaro

In this paper, we show how to significantly accelerate training of large transformer models by reducing activation recomputation.
