no code implementations • 13 Apr 2022 • Joanna Yoo, Kuba Perlin, Siddhartha Rao Kamalakara, João G. M. Araújo
Modern large language models require distributed training strategies due to their size.
1 code implementation • 8 Oct 2020 • Aidan N. Gomez, Oscar Key, Kuba Perlin, Stephen Gou, Nick Frosst, Jeff Dean, Yarin Gal
Motivated by poor resource utilisation in the global setting and poor task performance in the local setting, we introduce a class of intermediary strategies between local and global learning referred to as interlocking backpropagation.