no code implementations • 20 Jan 2024 • Sami Alabed, Daniel Belov, Bart Chrzaszcz, Juliana Franco, Dominik Grewe, Dougal Maclaurin, James Molloy, Tom Natan, Tamara Norman, Xiaoyue Pan, Adam Paszke, Norman A. Rink, Michael Schaarschmidt, Timur Sitdikov, Agnieszka Swietlik, Dimitrios Vytiniotis, Joel Wee
Training modern large neural networks (NNs) requires a combination of parallelization strategies encompassing data, model, or optimizer sharding.
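For context (this is not the paper's own system): in JAX, such strategies can be expressed by sharding arrays over a named device mesh. A minimal sketch, assuming 8 available devices arranged as a 4x2 mesh:

```python
# Minimal sketch of combined data and model sharding in JAX;
# illustrative only, not the partitioning system described in the paper.
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# Assumption: 8 devices, arranged as a 4x2 mesh with named axes.
devices = np.array(jax.devices()).reshape(4, 2)
mesh = Mesh(devices, axis_names=('data', 'model'))

# Shard the batch along 'data' and the weight matrix along 'model'.
batch = jax.device_put(jnp.ones((32, 512)),
                       NamedSharding(mesh, P('data', None)))
weights = jax.device_put(jnp.ones((512, 2048)),
                         NamedSharding(mesh, P(None, 'model')))

# XLA's SPMD partitioner propagates the shardings through the matmul.
out = jax.jit(lambda x, w: x @ w)(batch, weights)
```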
no code implementations • 20 Dec 2022 • Amir Shaikhha, Mathieu Huot, Shabnam Ghasemirad, Andrew Fitzgibbon, Simon Peyton Jones, Dimitrios Vytiniotis
Automatic differentiation (AD) is a technique for computing the derivative of a function represented by a program.
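As a tiny illustration of AD in general (not the specific technique the paper develops), JAX's `jax.grad` computes the derivative of a function written as a program:

```python
# Reverse-mode AD via jax.grad; illustrative only.
import jax
import jax.numpy as jnp

def f(x):
    return jnp.sin(x) * x ** 2   # f(x) = x^2 * sin(x)

df = jax.grad(f)                 # df(x) = 2x*sin(x) + x^2*cos(x)
print(df(1.0))                   # ~2.2232
```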
no code implementations • 7 Oct 2022 • Sami Alabed, Dominik Grewe, Juliana Franco, Bart Chrzaszcz, Tom Natan, Tamara Norman, Norman A. Rink, Dimitrios Vytiniotis, Michael Schaarschmidt
Large neural network models are commonly trained through a combination of advanced parallelism strategies in the single-program, multiple-data (SPMD) paradigm.
no code implementations • 6 Dec 2021 • Michael Schaarschmidt, Dominik Grewe, Dimitrios Vytiniotis, Adam Paszke, Georg Stefan Schmid, Tamara Norman, James Molloy, Jonathan Godwin, Norman Alexander Rink, Vinod Nair, Dan Belov
The rapid rise in demand for training large neural network architectures has brought into focus the need for partitioning strategies, for example by using data, model, or pipeline parallelism.
no code implementations • 2 Dec 2021 • Norman A. Rink, Adam Paszke, Dimitrios Vytiniotis, Georg Stefan Schmid
In this paper we address the problem of redistributing multi-dimensional array data in SPMD computations, the most prevalent form of parallelism in deep learning.
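For illustration (not the paper's algorithm): in JAX, such a redistribution corresponds to moving an array from one sharding to another, which lowers to a collective exchange among devices. A sketch, assuming a one-dimensional device mesh:

```python
# Sketch of redistributing a sharded array in JAX; illustrative only,
# the paper treats such redistributions in general SPMD programs.
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

mesh = Mesh(np.array(jax.devices()), axis_names=('x',))

# Start row-sharded along the mesh axis 'x' ...
a = jax.device_put(jnp.arange(64.0).reshape(8, 8),
                   NamedSharding(mesh, P('x', None)))

# ... then redistribute to column-sharded; on real hardware this
# lowers to an all-to-all-style exchange among the devices.
b = jax.device_put(a, NamedSharding(mesh, P(None, 'x')))
```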
no code implementations • 20 Oct 2021 • Ningning Xie, Tamara Norman, Dominik Grewe, Dimitrios Vytiniotis
We present a novel, hierarchy-aware characterization of the mapping of multiple parallelism forms (e.g., data and model parallelism) onto hierarchical accelerator systems, one that greatly reduces the space of software-to-hardware mappings.
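A hedged sketch of the underlying idea in JAX terms: name the mesh axes after levels of the hardware hierarchy and assign a parallelism form to each level (the device counts and grouping below are hypothetical):

```python
# Hypothetical hierarchy-aware mapping; not the paper's formalism.
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# Suppose 8 devices grouped as 2 hosts x 4 devices per host.
devices = np.array(jax.devices()).reshape(2, 4)
mesh = Mesh(devices, axis_names=('host', 'device'))

# Data parallelism across hosts, model parallelism within a host.
batch = jax.device_put(jnp.zeros((16, 128)),
                       NamedSharding(mesh, P('host', None)))
w = jax.device_put(jnp.zeros((128, 256)),
                   NamedSharding(mesh, P(None, 'device')))
```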
no code implementations • CVPR 2021 • Mateusz Malinowski, Dimitrios Vytiniotis, Grzegorz Swirszcz, Viorica Patraucean, Joao Carreira
How can neural networks be trained on large-volume temporal data efficiently?
1 code implementation • 6 Jun 2018 • Amir Shaikhha, Andrew Fitzgibbon, Dimitrios Vytiniotis, Simon Peyton Jones, Christoph Koch
We present a system for the automatic differentiation of a higher-order functional array-processing language.
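The paper's source language is functional with higher-order array combinators; JAX offers an analogous surface, so a rough analogue of differentiating a map-then-reduce program looks like this (illustrative only):

```python
# Differentiating through higher-order array combinators in JAX.
import jax
import jax.numpy as jnp

def sum_of_squares(xs):
    # 'map' a lambda over the array, then 'reduce' with addition.
    return jnp.sum(jax.vmap(lambda x: x * x)(xs))

grad_fn = jax.grad(sum_of_squares)
print(grad_fn(jnp.array([1.0, 2.0, 3.0])))  # [2. 4. 6.]
```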
1 code implementation • ICLR 2018 • Alexander L. Gaunt, Matthew A. Johnson, Maik Riechert, Daniel Tarlow, Ryota Tomioka, Dimitrios Vytiniotis, Sam Webster
Through an implementation on multi-core CPUs, we show that AMP training converges to the same accuracy as conventional synchronous training algorithms in a similar number of epochs, but utilizes the available hardware more efficiently even for small minibatch sizes, resulting in significantly shorter overall training times.
1 code implementation • NeurIPS 2016 • Osbert Bastani, Yani Ioannou, Leonidas Lampropoulos, Dimitrios Vytiniotis, Aditya Nori, Antonio Criminisi
Despite having high accuracy, neural nets have been shown to be susceptible to adversarial examples, where a small perturbation to an input can cause it to become mislabeled.
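One standard way to construct such a perturbation (not necessarily the paper's method) is the fast gradient sign method; a minimal sketch with a stand-in linear model:

```python
# FGSM-style adversarial perturbation; the model and numbers are
# stand-ins for illustration, not the paper's setup.
import jax
import jax.numpy as jnp

def loss(params, x, label):
    logits = x @ params                       # stand-in linear model
    return -jax.nn.log_softmax(logits)[label]

params = jnp.arange(12.0).reshape(4, 3) / 10.0
x = jnp.array([1.0, -0.5, 0.25, 2.0])
eps = 0.1

# Perturb the input in the direction that increases the loss.
g = jax.grad(loss, argnums=1)(params, x, 0)
x_adv = x + eps * jnp.sign(g)
```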