Search Results for author: Kuba Perlin

Found 2 papers, 1 paper with code

Scalable Training of Language Models using JAX pjit and TPUv4

no code implementations • 13 Apr 2022 • Joanna Yoo, Kuba Perlin, Siddhartha Rao Kamalakara, João G. M. Araújo

Modern large language models require distributed training strategies due to their size.

Interlocking Backpropagation: Improving depthwise model-parallelism

1 code implementation • 8 Oct 2020 • Aidan N. Gomez, Oscar Key, Kuba Perlin, Stephen Gou, Nick Frosst, Jeff Dean, Yarin Gal

Motivated by poor resource utilisation in the global setting and poor task performance in the local setting, we introduce a class of intermediary strategies between local and global learning referred to as interlocking backpropagation.

Image Classification
